OSError: Not found: "checkpoints/lit-llama/tokenizer.model": No such file or directory Error #2
anirudhitagi opened this issue
Where to get the tokenizer.model file?
I have been following the instructions given here: https://github.com/Lightning-AI/lit-llama/blob/main/howto/train_redpajama.md
When I run
python scripts/prepare_redpajama.py --source_path data/RedPajama-Data-1T-Sample --tokenizer_path checkpoints/lit-llama/tokenizer.model --destination_path data/lit-redpajama-sample --sample True
I get the error -
OSError: Not found: "checkpoints/lit-llama/tokenizer.model": No such file or directory Error #2
Good question. Usually it comes with the model you downloaded via the python download.py ... script.
Could you please point me towards the python download.py ... script and a reference command?
Sure, for example, you can run
python scripts/download.py --repo_id openlm-research/open_llama_7b --local_dir checkpoints/open-llama/7B
as described here, which will download the weights and create the checkpoint files, including tokenizer.model.
The download.py script is in the ./scripts/ subdirectory. Please let me know if you bump into issues or have questions.
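If it helps, here is a minimal sketch for checking where tokenizer.model actually ended up before rerunning the prepare script. It is not part of lit-llama: the helper name find_tokenizer and the assumption that all checkpoints live under a checkpoints/ directory are mine, and it uses only the Python standard library.

```python
from __future__ import annotations

from pathlib import Path


def find_tokenizer(expected: str, search_root: str = "checkpoints") -> list[Path]:
    """Hypothetical helper: return [expected] if that file exists,
    otherwise every tokenizer.model found anywhere under search_root."""
    expected_path = Path(expected)
    if expected_path.is_file():
        return [expected_path]
    root = Path(search_root)
    if not root.is_dir():
        return []
    # Recursive search, sorted so the output order is stable.
    return sorted(root.rglob("tokenizer.model"))


if __name__ == "__main__":
    # Path taken from the --tokenizer_path argument in the failing command.
    expected = "checkpoints/lit-llama/tokenizer.model"
    matches = find_tokenizer(expected)
    if matches == [Path(expected)]:
        print(f"Found tokenizer at {expected}")
    elif matches:
        print(f"{expected} is missing, but these candidates exist:")
        for match in matches:
            print(f"  {match}")
    else:
        print("No tokenizer.model found; run the download script first.")
```

If the script lists a candidate in a different directory, passing that path as --tokenizer_path should avoid the OSError.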
That worked! Thank you so much