Strange Model Loading Issue: Inconsistency with Vision Tower Parameters
shidingz opened this issue
shidingz commented
I found something strange when loading the model. It seems that the vision_tower was unfrozen (trained) during training, but when the model is loaded, the gradient-updated vision_tower parameters are not restored; instead, the original vision tower is loaded (via the mm_vision_tower name in the config). Could you explain why this happens?
Feng Li commented
Which model are you using?
Zihao Zheng commented
@FengLi-ust I'm using llama3-llava-next-8b
Almost all released models have the same issue: the released config.json loads the vision_tower from the original OpenAI repo:

```json
"mm_vision_tower": "openai/clip-vit-large-patch14-336",
```

Doesn't that discard the changes to the well-trained vision_tower? To identify the problem, I changed the vision_tower path to point at the released checkpoint itself:

```json
"mm_vision_tower": "../pretrained_models/llama3-llava-next-8b",
```
```python
model_path = "../pretrained_models/llama3-llava-next-8b"
model = LlavaLlamaForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    device_map=device_map,
    attn_implementation="flash_attention_2",
    config=llava_cfg,
)
```
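Before debugging the loader, it may help to check whether the released checkpoint contains trained vision_tower tensors at all. Below is a minimal sketch that inspects the sharded-checkpoint index without loading any weights; the helper names are mine, and I am assuming the checkpoint uses the standard `model.safetensors.index.json` layout of recent Transformers releases:

```python
import json
from pathlib import Path

def load_weight_map(checkpoint_dir: str) -> dict:
    """Read the Hugging Face sharded-checkpoint index, which maps each
    parameter name to the shard file that stores it."""
    # Assumes a safetensors-sharded checkpoint; older checkpoints use
    # pytorch_model.bin.index.json instead.
    index = Path(checkpoint_dir) / "model.safetensors.index.json"
    return json.loads(index.read_text())["weight_map"]

def vision_tower_keys(weight_map: dict) -> list:
    """Return the parameter names that belong to the vision tower."""
    return sorted(k for k in weight_map if "vision_tower" in k)
```

If `vision_tower_keys(load_weight_map(model_path))` comes back empty, the released checkpoint simply never stored vision_tower weights, so loading them from `openai/clip-vit-large-patch14-336` is the only option; if it is non-empty, the trained weights do exist and the question becomes why the loader ignores them.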
I then saw warnings like:

```
Some weights of CLIPVisionModel were not initialized from the model checkpoint at ../pretrained_models/llama3-llava-next-8b and are newly initialized: ['vision_model.embeddings.class_embedding', 'vision_model.embeddings.patch_embedding.weight', 'vision_model.embeddings.position_embedding.weight', 'vision_model.encoder.layers.0.layer_norm1.bias', 'vision_model.encoder.layers.0.layer_norm1.weight', 'vision_model.en
...
```
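Those warnings are what I would expect when a bare CLIPVisionModel is pointed at the full LLaVA checkpoint: CLIPVisionModel looks for keys prefixed `vision_model.*`, while in the combined checkpoint the same tensors sit under a nested prefix (something like `model.vision_tower.vision_tower.vision_model.*`), so no keys match and the tensors are freshly initialized. A hedged sketch of extracting the nested weights by stripping that prefix; the exact prefix string is an assumption, so it should be checked against the real key names in the checkpoint:

```python
def strip_prefix(state_dict: dict, prefix: str) -> dict:
    """Keep only tensors stored under `prefix` and drop that prefix, so
    the remaining keys look like what a bare CLIPVisionModel expects
    ('vision_model.embeddings...', 'vision_model.encoder...')."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }
```

With the full checkpoint's state dict in hand, something like `clip_sd = strip_prefix(full_sd, "model.vision_tower.vision_tower.")` followed by `clip_vision_model.load_state_dict(clip_sd)` should load the trained tensors instead of reinitializing them.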