Support llama-3
boixu opened this issue
Hi,
Please add support for llama-3.
Currently the prompt template is not compatible, since llama-3 uses a different style.
Ref: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3
As it stands, I was unable to use the llama-3 model.
Thanks in advance!
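For context, the format described in that model card wraps every turn in header tokens instead of llama-2's [INST] markers. A minimal sketch of a single-turn prompt (the system and user text here are just placeholders):

# Sketch of the llama-3 chat format, per the model card linked above.
# Each turn is framed by <|start_header_id|>role<|end_header_id|> and ends with <|eot_id|>.
LLAMA3_PROMPT = (
    "<|begin_of_text|>"
    "<|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful assistant.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "Hello!<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)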
Hi, I tried llama-3, and maybe you can use this setup.
The code is a little dirty.
First, add a template for llama3 in prompt_template_utils.py:
# in prompt_template_utils.py (PromptTemplate comes from langchain and is
# already imported at the top of the file)
def get_prompt_template(system_prompt=system_prompt, promptTemplate_type=None, history=False):
    if promptTemplate_type == "llama3":
        # llama-3 replaces llama-2's [INST]/<<SYS>> markers with header tokens
        if history:
            prompt = PromptTemplate(
                template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant, you will use the provided context to answer user questions.
Read the given context before answering questions and think step by step. If you cannot answer a user question based on
the provided context, inform the user. Do not use any other information for answering the user. Provide a detailed answer to the question. <|eot_id|><|start_header_id|>user<|end_header_id|>
Context: {history} \n {context}
User: {question}
Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
                input_variables=["history", "context", "question"],
            )
        else:
            prompt = PromptTemplate(
                template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant, you will use the provided context to answer user questions.
Read the given context before answering questions and think step by step. If you cannot answer a user question based on
the provided context, inform the user. Do not use any other information for answering the user. Provide a detailed answer to the question. <|eot_id|><|start_header_id|>user<|end_header_id|>
Context: {context}
User: {question}
Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
                input_variables=["context", "question"],
            )
    elif promptTemplate_type == "llama":
        # existing llama-2 branch, unchanged
        B_INST, E_INST = "[INST]", "[/INST]"
        B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
        SYSTEM_PROMPT = B_SYS + system_prompt + E_SYS
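A quick way to eyeball the rendered template (a sketch; note that get_prompt_template in localGPT may also return a memory object, so adjust the unpacking to match your copy of the file):

from prompt_template_utils import get_prompt_template

# Placeholder context/question values, just to inspect the final prompt string.
prompt = get_prompt_template(promptTemplate_type="llama3", history=False)
print(prompt.format(context="(retrieved chunks)", question="(user question)"))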
Then add an option for choosing llama3 in run_localGPT.py:
@click.option(
    "--model_type",
    default="llama",
    type=click.Choice(
        ["llama", "mistral", "non_llama", "llama3"],
    ),
    help="model type: llama, llama3, mistral, or non_llama",
)
You can now run with python run_localGPT.py --model_type llama3.
Here is the model I used for testing, in constants.py:
# LLAMA 3
MODEL_ID = "unsloth/llama-3-8b-bnb-4bit"
MODEL_BASENAME = None
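With MODEL_BASENAME = None, localGPT loads the model through transformers. For reference, loading this pre-quantized bnb-4bit checkpoint directly looks roughly like the sketch below (requires bitsandbytes and accelerate; an illustration, not localGPT's exact loading code):

from transformers import AutoModelForCausalLM, AutoTokenizer

# unsloth/llama-3-8b-bnb-4bit ships pre-quantized bitsandbytes weights,
# so from_pretrained can place it on the GPU directly.
model_id = "unsloth/llama-3-8b-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")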
@toomy0toons did you upgrade the llama cpp or transformers version to make this work with llama-3?
I did install llama-cpp following the readme docs. I have a CUDA GPU, so I installed the cuBLAS version:
# Example: cuBLAS
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir
I did not install or upgrade anything beyond the official instructions; it works out of the box. But since requirements.txt does not specify a version, and I installed yesterday, my versions might be more recent ones. My transformers is transformers==4.38.2 now.
@KerenK-EXRM is there a problem running llama3?
I think that since llama2 is probably not going to be used anymore, I will make the llama3 prompt template the default.
@toomy0toons I tried with another version (QuantFactory/Meta-Llama-3-8B-GGUF) and it didn't work.
Looks like the project was adjusted to support llama3.
Thank you! Can't wait to try :)
Hi, I have downloaded the llama3 70b model. Can someone provide me the steps to convert it into a Hugging Face model and then run it in localGPT? I have done the same for llama 70b and was able to run it, but I am not able to convert the full llama3 model files to .hf format. I would appreciate proper steps for how to do this. Thank you.
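One possible path, assuming you have the original Meta checkpoint files (consolidated .pth shards) rather than a Hugging Face repo, is the conversion script that ships with transformers; recent versions take a --llama_version flag, but check the arguments of the copy in your installed version (paths below are placeholders):

# Example: convert original Meta llama-3 weights to Hugging Face format
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/llama3-70b --model_size 70B \
    --output_dir /path/to/llama3-70b-hf --llama_version 3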
Hi @toomy0toons, I'm trying to do the same but having some issues, as per #793.
My understanding is that the instruct model (8b) has an extra set of tokens or a different prompt template.
Try 7b models?
There are no 7B models for llama3 (https://adithyask.medium.com/from-7b-to-8b-parameters-understanding-weight-matrix-changes-in-llama-transformer-models-31ea7ed5fd88).
Do you mean that none of the embedding models in constants.py are OK to run with any of the llama-3 8b models?
@toomy0toons I found the answer here: https://youtu.be/S6PdFPoteBU?si=pSsxCNFJsz_dxn8b&t=551
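For anyone hitting the same issue: one way to inspect exactly which prompt format an instruct checkpoint expects is to render the chat template bundled with its tokenizer (a sketch using the standard transformers API; the gated Meta repo id below is just an example, substitute whichever llama-3 instruct checkpoint you use):

from transformers import AutoTokenizer

# Render the bundled chat template to see the special tokens
# (<|start_header_id|>, <|eot_id|>, ...) the instruct model expects.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))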