EricLBuehler / candle-vllm

Efficient platform for inference and serving of local LLMs, including an OpenAI compatible API server.


--repeat-last-n option not mentioned in the usage help

ivanbaldo opened this issue · comments

The mandatory --repeat-last-n option isn't documented in the CLI usage message, for example when running without parameters or with the --help option.

Can you paste your CLI usage message? This is likely because the ModelSelected subcommand is parsed separately from the outer args, so cargo run -- --port 2000 llama7b --help should help.
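To make that layout concrete, here is a minimal clap sketch of the idea; the struct and variant names below are illustrative rather than the project's actual code, but they show why a bare --help only lists the outer options:

```rust
use clap::{Parser, Subcommand};

/// Top-level arguments: these are the only ones shown by a bare `--help`.
#[derive(Parser)]
struct Args {
    /// Port to serve on (localhost:port)
    #[arg(long)]
    port: u16,

    /// Model-specific arguments live on the subcommand, not here.
    #[command(subcommand)]
    command: ModelSelected,
}

#[derive(Subcommand)]
enum ModelSelected {
    /// Select the llama7b model
    Llama7b {
        /// Window of recent tokens considered by the repetition penalty
        #[arg(long)]
        repeat_last_n: usize,
    },
}

fn main() {
    // `candle-vllm --help` prints only `Args`; `candle-vllm llama7b --help`
    // prints the variant's own options, including `--repeat-last-n`.
    let Args { port, command } = Args::parse();
    match command {
        ModelSelected::Llama7b { repeat_last_n } => {
            println!("port={port}, repeat_last_n={repeat_last_n}");
        }
    }
}
```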

root@4be76ca848a1:/candle-vllm# /root/.cargo/bin/candle-vllm
Usage: candle-vllm [OPTIONS] --port <PORT> <COMMAND>

Commands:
  llama7b   Select the llama7b model
  llama13b  Select the llama13b model
  llama70b  Select the llama70b model
  help      Print this message or the help of the given subcommand(s)

Options:
      --hf-token <HF_TOKEN>            Huggingface token environment variable (optional). If not specified, load using hf_token_path
      --hf-token-path <HF_TOKEN_PATH>  Huggingface token file (optional). If neither `hf_token` or `hf_token_path` are specified this is used with the value of `~/.cache/huggingface/token`
      --port <PORT>                    Port to serve on (localhost:port)
      --verbose                        Set verbose mode (print all requests)
      --max-num-seqs <MAX_NUM_SEQS>    Maximum number of sequences to allow [default: 256]
      --block-size <BLOCK_SIZE>        Size of a block [default: 16]
  -h, --help                           Print help
  -V, --version                        Print version
root@4be76ca848a1:/candle-vllm#
root@4be76ca848a1:/candle-vllm# /root/.cargo/bin/candle-vllm llama7b
error: the following required arguments were not provided:
  --repeat-last-n <REPEAT_LAST_N>

Usage: candle-vllm --port <PORT> llama7b --repeat-last-n <REPEAT_LAST_N>

For more information, try '--help'.
root@4be76ca848a1:/candle-vllm#

Now I understand: the options change depending on the model selected.

But then maybe the main usage help could guide the user to also consult the model-specific options with candle-vllm <modelName> --help.
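One way something like that could be done, assuming the project uses clap's derive API (which the help output suggests), is an after_help note on the top-level parser; a sketch, not necessarily what the project does:

```rust
use clap::Parser;

/// Illustrative only: append a hint to the top-level help text so users
/// know where the model-specific options live.
#[derive(Parser)]
#[command(after_help = "Run `candle-vllm <MODEL> --help` to see model-specific options such as --repeat-last-n.")]
struct Args {
    /// Port to serve on (localhost:port)
    #[arg(long)]
    port: u16,
}

fn main() {
    let _args = Args::parse();
}
```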

And the llama7b usage help could maybe explain what the --repeat-last-n option means.

(btw I am just reporting this in case it's useful; it may very well be low priority for the project at this time)
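For context on what the option controls: in candle-style generation loops, repeat_last_n is typically the number of most recent tokens over which the repetition penalty is applied. A rough, self-contained sketch of that idea (illustrative code, not the project's implementation):

```rust
/// Penalise tokens that appeared within the last `repeat_last_n` generated
/// tokens by shrinking positive logits and growing negative ones.
fn apply_repeat_penalty(logits: &mut [f32], penalty: f32, tokens: &[u32], repeat_last_n: usize) {
    let start = tokens.len().saturating_sub(repeat_last_n);
    for &tok in &tokens[start..] {
        let logit = &mut logits[tok as usize];
        *logit = if *logit >= 0.0 { *logit / penalty } else { *logit * penalty };
    }
}

fn main() {
    let mut logits = vec![2.0, -1.0, 0.5, 3.0];
    let generated = vec![0u32, 3, 3, 1];
    // With repeat_last_n = 2, only the two most recent tokens (3 and 1) are penalised.
    apply_repeat_penalty(&mut logits, 1.1, &generated, 2);
    println!("{logits:?}");
}
```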

I just pushed a commit that adds:

  • Information to the README to point the user towards model-specific information
  • Explanation of --repeat-last-n

Is this sufficient to close the issue?

Yeah of course!!! Thanks!!!
P.S.: isn't there a mostly suitable default value for that option, like 64 for example?

Yes, there should be and I will likely add it later.
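For what it's worth, clap's derive API supports declaring such a default directly; a minimal sketch with illustrative names, using the 64 floated above purely as an example value:

```rust
use clap::Parser;

/// Hypothetical per-model arguments; names are illustrative, not the project's.
#[derive(Parser, Debug)]
struct Llama7bArgs {
    /// Window of recent tokens considered by the repetition penalty
    #[arg(long, default_value_t = 64)]
    repeat_last_n: usize,
}

fn main() {
    // `--repeat-last-n` becomes optional and falls back to 64 when omitted.
    let args = Llama7bArgs::parse();
    println!("{args:?}");
}
```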