Cannot run llama example : access to source requires login credentials
dbrowne opened this issue
cargo run --example llama --release
warning: some crates are on edition 2021 which defaults to resolver = "2", but virtual workspaces default to resolver = "1"
note: to keep the current resolver, specify workspace.resolver = "1" in the workspace root's manifest
note: to use the edition 2021 resolver, specify workspace.resolver = "2" in the workspace root's manifest
Finished release [optimized] target(s) in 0.17s
Running target/release/examples/llama
Running on CPU, to run on GPU, build this example with --features cuda
loading the model weights from meta-llama/Llama-2-7b-hf
Error: request error: https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer.json: status code 401
Caused by:
https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer.json: status code 401
This is likely caused by the model being "gated": you have to accept some conditions before being able to access it. Register on the Hugging Face Hub, accept the terms at https://huggingface.co/meta-llama/Llama-2-7b-hf, and then set up an authentication token so that the permission check passes.
If you already work in the Python ecosystem:

pip install huggingface_hub
huggingface-cli login

Otherwise, create a file at $HOME/.cache/huggingface/token containing your HF token.
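The token-file route can be sketched as follows; note that hf_xxxxxxxx is a placeholder, not a real token — substitute the value from your Hugging Face account settings:

```shell
# Create the directory the hub client expects, then write the token into it.
# Replace hf_xxxxxxxx with your actual token from https://huggingface.co/settings/tokens
mkdir -p "$HOME/.cache/huggingface"
printf '%s' 'hf_xxxxxxxx' > "$HOME/.cache/huggingface/token"
```

Using printf rather than echo avoids a trailing newline being appended to the token.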
Closing this, as the instructions above should resolve it.
It might be better to document this in the README files.