Multiple frames from video

Question

Tsardoz opened this issue a month ago

Does it work with multiple frames?
I tried reading sequential frames from a folder, converting them to base64, and appending them, but I get an error when calling chat_model.chat(inputs). Is this supported?
test_video.txt

Han Cheng · Answer 1 · Sat Jun 08 2024 21:17:21 GMT+0800 (China Standard Time)

I have the same issue. I tried to feed the model multiple images, and the answer I got was "image encoder error". I looked at the code in chat.py and found that the chat method of the MiniCPMV class only accepts a single image. I am also curious whether the model can read multiple images at once in a conversation, like GPT-4.

Cui Junbo · Answer 2 · Fri Jun 14 2024 08:31:20 GMT+0800 (China Standard Time)

Hi, this is a very good attempt. The model is capable of taking multiple images as input. However, it was not trained on video scenarios, so it may not perform very well on sequences of frames. You can give it a try.
Please refer to this link:
https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/discussions/2
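
For anyone preparing frames the way the original question describes, here is a minimal sketch of the folder-to-base64 step only. It is an assumption-laden illustration, not code from the repository: the helper names `sample_evenly` and `load_frames_as_base64`, the `max_frames` default, and the list of accepted file extensions are all hypothetical. It also assumes the frame files have zero-padded names so that a plain sort gives the right order. How the resulting list is passed to the model is deliberately left out, since the thread does not specify the expected input format; see the linked discussion for that.

```python
import base64
from pathlib import Path

# Hypothetical set of extensions treated as frames; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def sample_evenly(items, max_items):
    """Pick at most max_items elements, spread evenly and keeping order."""
    if max_items <= 0:
        raise ValueError("max_items must be positive")
    if len(items) <= max_items:
        return list(items)
    if max_items == 1:
        return [items[0]]
    # Always include the first and last element, spacing the rest between them.
    step = (len(items) - 1) / (max_items - 1)
    return [items[round(i * step)] for i in range(max_items)]


def load_frames_as_base64(folder, max_frames=8):
    """Read image files from a folder in sorted order and base64-encode them.

    Assumes zero-padded file names (frame_000.jpg, frame_001.jpg, ...),
    because the files are sorted lexicographically by path.
    """
    paths = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    chosen = sample_evenly(paths, max_frames)
    return [base64.b64encode(p.read_bytes()).decode("utf-8") for p in chosen]
```

Subsampling to a small number of frames is a design choice rather than a requirement: since the model was not trained on video, sending fewer, evenly spaced frames may be a more reasonable starting point than sending every frame.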