imoneoi / openchat

OpenChat: Advancing Open-source Language Models with Imperfect Data

Home Page: https://openchat.team

Cannot create data file

phuvinhnguyen opened this issue

I am trying to train the vinallama model, but I am running into a problem.

python hf_add_tokens.py --model-path vilm/vinallama-2.7b --output-dir ./vinallama --added-special-tokens "<|end_of_turn|>" "<|pad_0|>"
python -m ochat.data.generate_dataset --model-type openchat_v3.2 --model-path ./vinallama --in-files ./data.txt --out-prefix ./data/llama2_tokenize

Cause: TypeError: expected str, bytes or os.PathLike object, not NoneType

However, if I use

python -m ochat.data.generate_dataset --model-type openchat_v3.2 --model-path imone/LLaMA2_7B_with_EOT_token --in-files ./data.txt --out-prefix ./data/llama2_tokenize

then it works just fine.

Did I do something wrong?

Can you send the data.txt file?
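
The TypeError above says that some call received None where a file path was expected. One possible cause, which is not confirmed anywhere in this thread, is that a tokenizer file the loader expects is missing from ./vinallama. As a rough diagnostic sketch, the snippet below only reports which commonly used Hugging Face tokenizer files are present in a local directory. The helper name report_tokenizer_files and the list of file names are assumptions made for illustration; they are not part of ochat or hf_add_tokens.py.

```python
from pathlib import Path

# File names commonly found in Hugging Face tokenizer directories.
# Which of these a given loader actually requires is an assumption to verify.
COMMON_TOKENIZER_FILES = (
    "tokenizer.model",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
)


def report_tokenizer_files(model_dir):
    """Map each common tokenizer file name to whether it exists in model_dir."""
    root = Path(model_dir)
    return {name: (root / name).is_file() for name in COMMON_TOKENIZER_FILES}


if __name__ == "__main__":
    # "./vinallama" is the --output-dir used in the first command of the issue.
    for name, present in report_tokenizer_files("./vinallama").items():
        print(f"{name}: {'found' if present else 'MISSING'}")
```

If a file is reported missing for ./vinallama, comparing against the file listing of imone/LLaMA2_7B_with_EOT_token (the model that works) may help narrow down the difference. This is only a starting point, not a confirmed fix.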