Unable to reproduce LLaMA-Adapter V2 results
dmlpt opened this issue
Hi,
I am trying to reproduce the results of LLaMA-Adapter V2. I am fine-tuning the model on the "alpaca_gpt4_data" and "llava_instruct_150k" datasets, using the settings from https://github.com/OpenGVLab/LLaMA-Adapter/blob/a50befee3fdde8a08ca346b2ec70407e59ff6536/llama_adapter_v2_multimodal7b/exps/finetune.sh
As the starting point for fine-tuning, I used the pre-trained model from https://huggingface.co/Cxxs/ImageBind-LLM/resolve/main/7B-pretrained.pth
When I evaluate the fine-tuned model with https://github.com/OpenGVLab/LLaMA-Adapter/blob/a50befee3fdde8a08ca346b2ec70407e59ff6536/llama_adapter_v2_multimodal7b/util/evaluate_mme.py (with all three of w_bias, w_lora, and w_new_gate set to False; load_state_dict reports no missing keys after loading), I get the scores below, which are essentially random chance. A sketch of my loading code follows the scores.
=========== Perception ===========
total score: 497.44607843137254
existence score: 50.0
count score: 50.0
position score: 50.0
color score: 50.0
posters score: 66.66666666666667
celebrity score: 28.52941176470588
scene score: 50.0
landmark score: 50.0
artwork score: 52.25
OCR score: 50.0
=========== Cognition ===========
total score: 248.57142857142858
commonsense_reasoning score: 53.57142857142858
numerical_calculation score: 50.0
text_translation score: 95.0
code_reasoning score: 50.0
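For reference, here is roughly how I construct the model and load the checkpoint before evaluation. This is only a sketch: the LLaMA_adapter import path and constructor argument names are my reading of the llama_adapter_v2_multimodal7b code, the LLaMA/tokenizer paths are placeholders, and I am assuming the checkpoint stores its weights under a "model" key as the repo's own loading code does.

```python
# Sketch of my evaluation-side loading (import path and argument names
# assumed from llama_adapter_v2_multimodal7b; paths are placeholders).
import torch
from llama.llama_adapter import LLaMA_adapter  # assumed import path

llama_dir = "/path/to/llama-7b"              # placeholder: LLaMA weights dir
tokenizer_path = "/path/to/tokenizer.model"  # placeholder

model = LLaMA_adapter(
    llama_dir,
    tokenizer_path,
    w_bias=False,       # all three adapter flags disabled,
    w_lora=False,       # matching the settings described above
    w_new_gate=False,
)

ckpt = torch.load("7B-pretrained.pth", map_location="cpu")
# assumption: fine-tuned weights live under the "model" key
load_result = model.load_state_dict(ckpt["model"], strict=False)
print("missing keys:", load_result.missing_keys)  # empty in my runs
```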
Am I making a mistake in how I am evaluating the model?
Note: I do get the reported results when I use the released checkpoint from https://github.com/OpenGVLab/LLaMA-Adapter/releases/download/v.2.1.0/427dbc27bf62a3ef7a24ffd3ed2c3162_LORA-BIAS-7B-v21.pth
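In case it helps with debugging, this is a quick comparison I can run between the two checkpoints' key sets (again only a sketch, assuming both files keep their state dict under a "model" key):

```python
# Diff the key sets of the two checkpoints to see whether the released
# model carries extra tensors (e.g. bias/LoRA weights) that mine lacks.
# Assumption: both files store their state dict under the "model" key.
import torch

mine = torch.load("7B-pretrained.pth", map_location="cpu")["model"]
released = torch.load(
    "427dbc27bf62a3ef7a24ffd3ed2c3162_LORA-BIAS-7B-v21.pth",
    map_location="cpu",
)["model"]

only_in_released = sorted(set(released) - set(mine))
only_in_mine = sorted(set(mine) - set(released))
print("keys only in released checkpoint:", only_in_released[:20])
print("keys only in my checkpoint:", only_in_mine[:20])
```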
Thanks in advance!
Can you share your training log?