OpenBMB / MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone


No attention_mask is passed as input during training

double-fire-0 opened this issue

Looking at the Hugging Face MiniCPM-Llama3-V 2.5 code, I noticed that attention_mask is None when the Llama 3 forward function is called.

This works fine with a batch size of 1, but it does not seem suitable when the batch size is greater than 1, since padded positions would then be attended to without a mask.
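For batched training with padding, a mask is usually built from the pad token and passed through to the language model. Below is a minimal, hypothetical sketch of that idea; the pad-token convention, the helper name, and the commented model call are assumptions for illustration, not the repo's actual training code.

```python
import torch

def build_attention_mask(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    """Return a (batch, seq_len) mask: 1 for real tokens, 0 for padding.

    Assumes sequences are padded with `pad_token_id`; this is a sketch,
    not the MiniCPM-V training code itself.
    """
    return (input_ids != pad_token_id).long()

# Example: two sequences of different lengths padded to the same length.
pad_id = 0
input_ids = torch.tensor([
    [101, 2023, 2003, 102, pad_id, pad_id],  # shorter sequence, padded
    [101, 2023, 2003, 1037, 2742, 102],      # full-length sequence
])
attention_mask = build_attention_mask(input_ids, pad_id)

# Passing this mask (instead of None) to the LLM forward keeps padded
# positions from being attended to when batch size > 1, e.g.:
# outputs = llama_model(input_ids=input_ids, attention_mask=attention_mask)
```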