--repeat-last-n option not mentioned in the usage help
ivanbaldo opened this issue · comments
The mandatory --repeat-last-n option isn't documented in the CLI usage message, for example when running without parameters or with the --help option.
Can you paste your CLI usage message? This is likely because the ModelSelected subcommand is parsed separately from the outer args, so cargo run -- --port 2000 llama7b --help should show it.
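To illustrate why the top-level help can omit a model's options, here is a minimal standard-library sketch of arguments being scoped by subcommand. It is not how candle-vllm or its argument parser is implemented: `split_at_subcommand` and `SplitArgs` are hypothetical names, and the matching is deliberately naive.

```rust
/// Arguments split into the outer (global) part and the subcommand part.
#[derive(Debug, PartialEq)]
struct SplitArgs<'a> {
    /// Everything before the subcommand, e.g. `--port 2000`.
    outer: &'a [&'a str],
    /// The subcommand name and everything after it, if a subcommand was found.
    subcommand: Option<(&'a str, &'a [&'a str])>,
}

/// Hypothetical helper: splits `args` at the first token that names a known subcommand.
/// Naive on purpose: an option value equal to a subcommand name would be misread.
fn split_at_subcommand<'a>(args: &'a [&'a str], known: &[&str]) -> SplitArgs<'a> {
    match args.iter().position(|a| known.iter().any(|k| *k == *a)) {
        Some(i) => SplitArgs {
            outer: &args[..i],
            subcommand: Some((args[i], &args[i + 1..])),
        },
        None => SplitArgs {
            outer: args,
            subcommand: None,
        },
    }
}
```

With the arguments `--port 2000 llama7b --help`, the `--help` flag lands in the `llama7b` part, so it asks for that subcommand's help rather than the top-level one.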
root@4be76ca848a1:/candle-vllm# /root/.cargo/bin/candle-vllm
Usage: candle-vllm [OPTIONS] --port <PORT> <COMMAND>

Commands:
  llama7b   Select the llama7b model
  llama13b  Select the llama13b model
  llama70b  Select the llama70b model
  help      Print this message or the help of the given subcommand(s)

Options:
      --hf-token <HF_TOKEN>            Huggingface token environment variable (optional). If not specified, load using hf_token_path
      --hf-token-path <HF_TOKEN_PATH>  Huggingface token file (optional). If neither `hf_token` or `hf_token_path` are specified this is used with the value of `~/.cache/huggingface/token`
      --port <PORT>                    Port to serve on (localhost:port)
      --verbose                        Set verbose mode (print all requests)
      --max-num-seqs <MAX_NUM_SEQS>    Maximum number of sequences to allow [default: 256]
      --block-size <BLOCK_SIZE>        Size of a block [default: 16]
  -h, --help                           Print help
  -V, --version                        Print version
root@4be76ca848a1:/candle-vllm#
root@4be76ca848a1:/candle-vllm# /root/.cargo/bin/candle-vllm llama7b
error: the following required arguments were not provided:
  --repeat-last-n <REPEAT_LAST_N>

Usage: candle-vllm --port <PORT> llama7b --repeat-last-n <REPEAT_LAST_N>

For more information, try '--help'.
root@4be76ca848a1:/candle-vllm#
Now I understand: the options change depending on the model selected.
But then maybe the main usage help could tell the user to also consult the model-specific options with candle-vllm <modelName> --help.
And the llama7b usage help could maybe explain what the --repeat-last-n option means.
(btw I am reporting this just in case; it may very well be low priority for the project at this time)
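For readers wondering what the option controls: this thread does not define it, but in llama.cpp-style samplers an option with this name conventionally sets how many of the most recent tokens are considered when applying a repetition penalty. The following is a minimal sketch under that assumption. The function name, the penalty rule, and the plain `f32` slice of logits are illustrative and not taken from candle-vllm.

```rust
use std::collections::HashSet;

/// Penalizes tokens that occur within the last `repeat_last_n` entries of `history`.
/// Assumption: llama.cpp-style rule, where non-negative logits are divided by
/// `penalty` and negative logits are multiplied by it.
fn apply_repeat_penalty(logits: &mut [f32], history: &[u32], repeat_last_n: usize, penalty: f32) {
    // Only the tail of the history is inspected; older tokens are ignored.
    let start = history.len().saturating_sub(repeat_last_n);
    let mut seen = HashSet::new();
    for &tok in &history[start..] {
        // Penalize each distinct token once.
        if !seen.insert(tok) {
            continue;
        }
        // Token ids outside the vocabulary are skipped.
        if let Some(logit) = logits.get_mut(tok as usize) {
            if *logit >= 0.0 {
                *logit /= penalty;
            } else {
                *logit *= penalty;
            }
        }
    }
}
```

Under this reading, a larger value discourages repetition over a longer span of recent output, while 0 disables the penalty because the window is empty.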
I just pushed a commit that adds:
- Information to the README to point the user towards model-specific information
- Explanation of --repeat-last-n
Is this sufficient to close the issue?
Yeah of course!!! Thanks!!!
P.S.: isn't there a generally suitable default value for that option, for example 64?
Yes, there should be and I will likely add it later.
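One way such a default could behave, sketched with the standard library only. The value 64 is just the suggestion from the comment above, not a confirmed project default, and `parse_repeat_last_n` is a hypothetical helper; a CLI built on an argument-parsing crate would more likely declare the default on the argument definition itself.

```rust
/// Fallback suggested in the thread; not a confirmed project default.
const DEFAULT_REPEAT_LAST_N: usize = 64;

/// Hypothetical helper: reads `--repeat-last-n <N>` from `args`,
/// falling back to the default when the flag is absent.
fn parse_repeat_last_n(args: &[&str]) -> Result<usize, String> {
    match args.iter().position(|a| *a == "--repeat-last-n") {
        // Flag not given: the option is no longer mandatory.
        None => Ok(DEFAULT_REPEAT_LAST_N),
        Some(i) => {
            // Flag given: a value must follow and must parse as an unsigned integer.
            let value = args
                .get(i + 1)
                .ok_or("--repeat-last-n requires a value")?;
            value
                .parse::<usize>()
                .map_err(|e| format!("invalid --repeat-last-n value '{}': {}", value, e))
        }
    }
}
```

With this shape, running the model subcommand without the flag would no longer produce the "required arguments were not provided" error shown earlier in the thread.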