diff --git a/README.md b/README.md
index 69be00ef6..6b8075068 100644
--- a/README.md
+++ b/README.md
@@ -74,7 +74,7 @@ The general syntax is:
 
 `python3 -m garak --list_probes`
 
-To specify a generator, use the `--model_name` and, optionally, the `--model_type` options. Model name specifies a model family/interface; model type specifies the exact model to be used. The "Intro to generators" section below describes some of the generators supported. A straightfoward generator family is Hugging Face models; to load one of these, set `--model_name` to `huggingface` and `--model_type` to the model's name on Hub (e.g. `"RWKV/rwkv-4-169m-pile"`). Some generators might need an API key to be set as an environment variable, and they'll let you know if they need that.
+To specify a generator, use the `--model_type` and, optionally, the `--model_name` options. Model type specifies a model family/interface; model name specifies the exact model to be used. The "Intro to generators" section below describes some of the generators supported. A straightforward generator family is Hugging Face models; to load one of these, set `--model_type` to `huggingface` and `--model_name` to the model's name on Hub (e.g. `"RWKV/rwkv-4-169m-pile"`). Some generators might need an API key to be set as an environment variable, and they'll let you know if they need that.
 
 `garak` runs all the probes by default, but you can be specific about that too. `--probes promptinject` will use only the [PromptInject](https://github.com/agencyenterprise/promptinject) framework's methods, for example. You can also specify one specific plugin instead of a plugin family by adding the plugin name after a `.`; for example, `--probes lmrc.SlurUsage` will use an implementation of checking for models generating slurs based on the [Language Model Risk Cards](https://arxiv.org/abs/2303.18190) framework.
 
@@ -116,46 +116,46 @@ Send PRs & open issues. Happy hunting!
 
 ### huggingface
 
-* `--model_name huggingface` (for transformers models to run locally)
-* `--model_type` - use the model name from Hub. Only generative models will work. If it fails and shouldn't, please open an issue and paste in the command you tried + the exception!
+* `--model_type huggingface` (for transformers models to run locally)
+* `--model_name` - use the model name from Hub. Only generative models will work. If it fails and shouldn't, please open an issue and paste in the command you tried + the exception!
 
-* `--model_name huggingface.InferenceAPI` (for API-based model access)
-* `--model_type` - the model name from Hub, e.g. `"mosaicml/mpt-7b-instruct"`
+* `--model_type huggingface.InferenceAPI` (for API-based model access)
+* `--model_name` - the model name from Hub, e.g. `"mosaicml/mpt-7b-instruct"`
 * (optional) set the `HF_INFERENCE_TOKEN` environment variable to a Hugging Face API token with the "read" role; see https://huggingface.co/settings/tokens when logged in
 
 ### openai
 
-* `--model_name openai`
-* `--model_type` - the OpenAI model you'd like to use. `text-babbage-001` is fast and fine for testing; `gpt-4` seems weaker to many of the more subtle attacks.
+* `--model_type openai`
+* `--model_name` - the OpenAI model you'd like to use. `text-babbage-001` is fast and fine for testing; `gpt-4` seems weaker to many of the more subtle attacks.
 * set the `OPENAI_API_KEY` environment variable to your OpenAI API key (e.g. "sk-19763ASDF87q6657"); see https://platform.openai.com/account/api-keys when logged in
 
 Recognised model types are whitelisted, because the plugin needs to know which sub-API to use. Completion or ChatCompletion models are OK. If you'd like to use a model not supported, you should get an informative error message, and please send a PR / open an issue.
 
 ### replicate
 
-* `--model_name replicate`
-* `--model_type` - the Replicate model name and hash, e.g. `"stability-ai/stablelm-tuned-alpha-7b:c49dae36"`
+* `--model_type replicate`
+* `--model_name` - the Replicate model name and hash, e.g. `"stability-ai/stablelm-tuned-alpha-7b:c49dae36"`
 * set the `REPLICATE_API_TOKEN` environment variable to your Replicate API token, e.g. "r8-123XXXXXXXXXXXX"; see https://replicate.com/account/api-tokens when logged in
 
 ### cohere
 
-* `--model_name cohere`
-* `--model_type` (optional, `command` by default) - The specific Cohere model you'd like to test
+* `--model_type cohere`
+* `--model_name` (optional, `command` by default) - The specific Cohere model you'd like to test
 * set the `COHERE_API_KEY` environment variable to your Cohere API key, e.g. "aBcDeFgHiJ123456789"; see https://dashboard.cohere.ai/api-keys when logged in
 
 ### ggml
 
-* `--model_name ggml`
-* `--model_type` - The path to the ggml model you'd like to load, e.g. `/home/leon/llama.cpp/models/7B/ggml-model-q4_0.bin`
+* `--model_type ggml`
+* `--model_name` - The path to the ggml model you'd like to load, e.g. `/home/leon/llama.cpp/models/7B/ggml-model-q4_0.bin`
 * set the `GGML_MAIN_PATH` environment variable to the path to your ggml `main` executable
 
 ### test
 
-* `--model_name test`
-* (alternatively) `--model_name test.Blank`
+* `--model_type test`
+* (alternatively) `--model_type test.Blank`
 For testing. This always generates the empty string, using the `test.Blank` generator. Will be marked as failing for any tests that *require* an output, e.g. those that make contentious claims and expect the model to refute them in order to pass.
 
-* `--model_name test.Repeat`
+* `--model_type test.Repeat`
 For testing. This generator repeats back the prompt it received.
 
 ## Intro to probes