liyucheng09 / Selective_Context

Compress your input to ChatGPT or other LLMs so they can process 2x more content while saving 40% of memory and GPU time.


python main.py

lqcStar opened this issue · comments

Where do we get the six parameters: arxiv_path, news_path, conversation_path, save_to_path, num_articles, model_name = sys.argv[1:]?

arxiv_dir = /mnt/fast/nobackup/users/yl02706/llm_memorize/arxiv/500arxiv
news_dir = /mnt/fast/nobackup/users/yl02706/llm_memorize/news
conversation_dir = /mnt/fast/nobackup/users/yl02706/llm_memorize/conversation

These are the paths where the datasets are loaded and cached.

You can find download links for the three datasets in the README.

save_to_path is the path where you save your experimental results.
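For illustration, here is a hypothetical sketch of how main.py's six positional arguments line up with sys.argv. All paths below are placeholder values, not the actual dataset locations; num_articles and model_name use the values discussed in this thread:

```python
import sys

# Hypothetical invocation sketch: main.py unpacks six positional arguments.
# The paths are placeholders for illustration only.
sys.argv = [
    "main.py",
    "data/arxiv",           # arxiv_path: directory of arXiv JSON files
    "data/news",            # news_path: news dataset directory
    "data/conversation",    # conversation_path: conversation dataset directory
    "results",              # save_to_path: where experimental results are written
    "300",                  # num_articles (argv values are strings)
    "huggyllama/llama-7b",  # model_name: a Hugging Face model id
]

arxiv_path, news_path, conversation_path, save_to_path, num_articles, model_name = sys.argv[1:]
print(model_name)
```

Equivalently, from the shell this would be `python main.py data/arxiv data/news data/conversation results 300 huggyllama/llama-7b`.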

I will update main.py shortly to provide more implementation details. Stay tuned.

@lqcStar Check the updated reproduction guideline. Let me know if you have any further questions.

Thanks for your reply. However, when I unzip datasets_dumps.zip, it contains three directories: arxiv, conversion, and news. arxiv contains 500 JSON files, conversion contains an empty JSON file, and news is an empty directory.

Also, how should the last parameter, model_name, be specified? Does the model need to be downloaded to the server? And what value is appropriate for num_articles?

Try again with the same code.

The conversion directory is no longer empty now.

news is intended to be empty, so that's not a problem.

Set num_articles to 300.

model_name can be any Hugging Face model, e.g. huggyllama/llama-7b.

Thank you very much for your answer. My problem is solved.
Best wishes!