Demo deployment issues

Question

pedrocolon93 opened this issue a month ago

Hi there! I cloned the tensors from here:
git clone https://huggingface.co/lmms-lab/llava-next-interleave-7b
I then followed the setup in the README (which also needs a gradio upgrade, pip install --upgrade gradio, and numpy==1.23.0). When I run inference in gradio with the provided examples, I get garbage output.
Is there anything I am missing?

Pedro · Answer 1 · Sat Jun 29 2024 07:23:41 GMT+0800 (China Standard Time)

As a side note, flash attention also needs to be installed: pip install flash-attn

Pedro · Answer 2 · Sat Jun 29 2024 07:24:01 GMT+0800 (China Standard Time)

As a second side note, the same thing happens with the -dpo model.

Pedro · Answer 3 · Sat Jun 29 2024 08:22:47 GMT+0800 (China Standard Time)

Fixed this by cloning the repo and adding -qwen- to the name of the repo. Otherwise it loads some other LLaVA architecture, which does not work.

Pedro · Answer 4 · Sat Jun 29 2024 08:26:50 GMT+0800 (China Standard Time)

If loading in 4 bit, the line in builder.py
kwargs["load_in_4bit"] = True
needs to be commented out.

Hao Zhang · Answer 5 · Tue Jul 02 2024 07:09:56 GMT+0800 (China Standard Time)

You should change the model path from llava-next-interleave-7b to llava-next-interleave-qwen-7b and try again.
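The reason the path matters can be illustrated with a minimal sketch, assuming (as Answer 3 suggests) that the loader decides which architecture to instantiate from the model name. The function guess_architecture and its return labels are hypothetical and are not taken from builder.py; only the two model paths come from this thread.

```python
import os


def guess_architecture(model_path: str) -> str:
    """Hypothetical stand-in for name-based dispatch in a model loader.

    Assumption: the loader looks at the last path component and picks the
    Qwen-based architecture only when "qwen" appears in it.
    """
    name = os.path.basename(os.path.normpath(model_path)).lower()
    if "qwen" in name:
        return "qwen"
    # Any other name falls through to a different LLaVA variant, which is
    # the mismatch that produced garbage output in this thread.
    return "other-llava"


for path in ("llava-next-interleave-7b", "llava-next-interleave-qwen-7b"):
    print(path, "->", guess_architecture(path))
```

Under that assumption, renaming the local checkout so that it contains -qwen- is enough to steer the loader to the matching architecture.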

Pedro · Answer 6 · Tue Jul 02 2024 07:39:23 GMT+0800 (China Standard Time)

Thanks. Please also double-check the following:
If loading in 4 bit, the line in builder.py
kwargs["load_in_4bit"] = True needs to be commented out,
and the flash-attn dependency needs to be added.
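
The dependency notes collected in this thread (a gradio upgrade, numpy==1.23.0, and flash-attn) can be checked before launching the demo. The following is a small sketch using only the Python standard library; the helper name installed_versions is hypothetical, and the script only reports what is installed rather than enforcing any version.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if the package is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages mentioned in this thread; numpy was reported to need 1.23.0.
for name, ver in installed_versions(["gradio", "numpy", "flash-attn"]).items():
    print(f"{name}: {ver or 'not installed'}")
```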