Encountered an error while running the demo: the data type (dtype) of the llm does not match the data type of the vision model.

Question

gaowei724 opened this issue a month ago

Thank you for your outstanding work and open-source code. I encountered an error while trying to run the demo following the instructions for "run application". The error message is as follows:

...
    final_embedding[image_to_overwrite] = image_features.contiguous().reshape(-1, embed_dim).to(target_device)
RuntimeError: Index put requires the source and destination dtypes match, got Half for the destination and BFloat16 for the source.
...

After checking, I found that the error is caused by a dtype mismatch between the LLM and the vision model: the destination embedding is Half (float16), while the image features are BFloat16. I am using the llava-hf/llava-v1.6-vicuna-7b-hf model you shared on Hugging Face, and I ran the command bash scripts/demo.sh MODELS/llava-v1.6-vicuna-7b-hf/ MODELS/llava-v1.6-vicuna-7b-hf/, which produced the error above. Could you advise how to adjust the settings so that the dtypes of the LLM and the vision model are consistent? Thank you.
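To illustrate what the error message means, here is a minimal sketch in plain PyTorch using toy tensors. It is not the repository's code or its recommended fix: the helper name overwrite_image_rows and the tensor shapes are hypothetical, and only the variable names final_embedding, image_to_overwrite, and image_features are borrowed from the traceback. An indexed assignment needs the source and destination to share a dtype, so the sketch casts the source to the destination's dtype and device first.

```python
import torch


def overwrite_image_rows(final_embedding, image_to_overwrite, image_features):
    """Write image features into the masked rows of final_embedding.

    Hypothetical helper, not part of the repository: it casts the source to
    the destination's dtype and device before the indexed assignment.
    """
    embed_dim = final_embedding.shape[-1]
    source = image_features.contiguous().reshape(-1, embed_dim)
    # Match the destination so the indexed assignment sees a single dtype.
    source = source.to(device=final_embedding.device, dtype=final_embedding.dtype)
    final_embedding[image_to_overwrite] = source
    return final_embedding


# Toy shapes (assumed for illustration): 4 token slots, 2 reserved for images.
final_embedding = torch.zeros(4, 3, dtype=torch.float16)       # "Half" destination
image_to_overwrite = torch.tensor([True, False, True, False])  # boolean row mask
image_features = torch.ones(2, 3, dtype=torch.bfloat16)        # "BFloat16" source

result = overwrite_image_rows(final_embedding, image_to_overwrite, image_features)
print(result.dtype)
```

A cast like this only hides the symptom. If the two halves of the model end up with different dtypes because the wrong directory was passed to the script, correcting the arguments is the cleaner route.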

ermu2001 · Answer 1 · Fri May 10 2024 21:26:03 GMT+0800 (China Standard Time)

For the second argument here, you should provide the directory containing the parameter weights.

Follow this to prepare the weights, then pass in MODELS/pllava-7b as the second argument instead.

gaowei · Answer 2 · Sat May 11 2024 09:59:02 GMT+0800 (China Standard Time)

Thank you!