MemoChat

MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation

Environment

We provide core_requirement.txt for your convenience.
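A typical setup, assuming a fresh Python virtual environment (the environment name below is illustrative, not prescribed by the repo):

```shell
# Create and activate an isolated environment (name is illustrative)
python -m venv memochat-env
source memochat-env/bin/activate

# Install the pinned dependencies shipped with the repo
pip install -r core_requirement.txt
```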

Model Weights

The initial models we used are FastChat models (v1.3). Below are the model weights of our fine-tuned versions. Our models are built upon FastChat models, so we adopt the same cc-by-nc-sa-4.0 license.

| Name | Share Link |
| --- | --- |
| MemoChat-Fastchat-T5-3B | https://huggingface.co/Junrulu/MemoChat-Fastchat-T5-3B |
| MemoChat-Vicuna-7B | https://huggingface.co/Junrulu/MemoChat-Vicuna-7B |
| MemoChat-Vicuna-13B | https://huggingface.co/Junrulu/MemoChat-Vicuna-13B |
| MemoChat-Vicuna-33B | https://huggingface.co/Junrulu/MemoChat-Vicuna-33B |

Workflow

`RootPath` is the absolute path of this repo. Download the initial models and put them in the `model` folder.
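For example, the fine-tuned weights listed above can be fetched from the Hugging Face Hub with `git lfs`; this is one way to populate the `model` folder (the exact subdirectory layout the scripts expect is an assumption here):

```shell
# Run from the repo root; requires git-lfs to be installed
cd model
git lfs install
# Clone one of the fine-tuned checkpoints from the table above
git clone https://huggingface.co/Junrulu/MemoChat-Vicuna-7B
```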

Instruction Tuning

Run `bash code/scripts/tuning.sh RootPath`. Intermediate evaluations are included in this script as well.

MemoChat Testing

Run `bash code/scripts/memochat.sh RootPath` for pipeline testing with the fine-tuned models.
Run `bash code/scripts/memochat_gpt.sh RootPath` for pipeline testing with the GPT-3.5 API.
Run `bash code/scripts/llm_judge.sh RootPath` for GPT-4 judging (an OpenAI API key is required).
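The GPT-backed scripts need OpenAI credentials; a common convention is to export the key as an environment variable before running them. The variable name below is the OpenAI client library's default, but whether these scripts read it is an assumption:

```shell
# Assumption: the scripts pick up the standard OpenAI environment variable.
# Replace the placeholder with your own key.
export OPENAI_API_KEY="your-key-here"
bash code/scripts/memochat_gpt.sh RootPath
bash code/scripts/llm_judge.sh RootPath
```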

Our Results

We provide our prediction results here.

Acknowledgement

We thank the Vicuna project for their great work.

Citation

```bibtex
@misc{lu2023memochat,
      title={MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation},
      author={Junru Lu and Siyu An and Mingbao Lin and Gabriele Pergola and Yulan He and Di Yin and Xing Sun and Yunsheng Wu},
      year={2023},
      eprint={2308.08239},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
