artidoro/qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Home Page: https://arxiv.org/abs/2305.14314

Does finetuning need to follow the Llama 2 system prompt format?

Zheng392 opened this issue

The Llama 2 finetuning example uses the oasst1 dataset with the "### Human: ... ### Assistant: " prompt format. However, the Llama 2 chat models were trained with the following prompt format:

<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]
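
For concreteness, here is a minimal sketch (the function name is illustrative, not part of the qlora codebase) that renders this template as a single prompt string:

def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    # Render the single-turn Llama 2 chat template shown above.
    # Note: in practice the tokenizer usually adds the <s> BOS token
    # itself, so you may not want it in the raw string.
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

print(build_llama2_prompt("You are a helpful assistant.", "What is QLoRA?"))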

Would it be a problem to finetune with a different prompt format, as the oasst1 example does?

And what is the difference between finetuning from Llama-2-7b-hf versus Llama-2-7b-chat-hf?

The base model (Llama-2-7b-hf) hasn't been finetuned at all, so you can use whatever format you want. Llama-2-7b-chat-hf, by contrast, has already been tuned for dialogue using the template above.

However, keep in mind that if you finetune the base model on a different chat format, the library you use to generate completions needs to follow whichever format you trained on. This can be a problem, for instance, with the HF Transformers conversational pipeline. The conversational pipeline looks for a _build_conversation_input_ids method on the model's tokenizer and, if it finds one, uses it to format and tokenize the dialogue. The Llama tokenizer formats conversations in the template above. So if you finetune with a different format and then generate through HF's Llama tokenizer and conversational pipeline, the prompts at inference time won't match the ones you trained on.
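
As a concrete workaround, here is a minimal sketch of generating with the oasst1-style format by building the prompt string yourself and calling model.generate directly, bypassing the conversational pipeline. The model name and adapter path are placeholders for whatever you actually finetuned:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "meta-llama/Llama-2-7b-hf"      # placeholder: your base model
adapter_path = "path/to/your/qlora-adapter"  # placeholder: your QLoRA adapter

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_path)

# Build the prompt in the same "### Human / ### Assistant" format used
# during finetuning, rather than letting the pipeline apply the Llama 2
# chat template through the tokenizer.
prompt = "### Human: What is QLoRA?### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))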