No grid configuration in PLLaVA

Question

farewellthree opened this issue a month ago

Hello, thanks for the great work! It seems that the PLLaVA code does not use the grid configuration of Dynamic High Resolution from LLaVA-1.6. Is this because omitting it improves video performance, or is it simply to shorten the sequence length and improve efficiency?

farewellthree · Answer 1 · Wed May 08 2024 14:53:53 GMT+0800 (China Standard Time)

Looking forward to the response 👀

ermu2001 · Answer 2 · Fri May 10 2024 21:38:26 GMT+0800 (China Standard Time)

Hi,

Sorry for the late response.

We are not sure whether the grid configuration would improve performance. We did not use dynamic high resolution in order to save computation budget. Also, we chose to fill the LLM's context length with more video frames rather than with higher per-frame resolution.
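
To make the frames-versus-resolution trade-off concrete, here is a rough sketch of the token arithmetic. All numbers are illustrative assumptions and not taken from the PLLaVA code: a 24x24 patch grid per image tile, a 2x2 grid plus one base tile for the high-resolution case, a pooling stride of 2, and a visual token budget of 2304. The helper names `tokens_per_frame` and `max_frames` are hypothetical.

```python
def tokens_per_frame(patches_per_side: int, tiles: int = 1, pool_stride: int = 1) -> int:
    """Visual tokens for one frame: tiles * (pooled side length)^2."""
    side = patches_per_side // pool_stride
    return tiles * side * side


def max_frames(visual_budget: int, per_frame: int) -> int:
    """How many whole frames fit into the visual token budget."""
    return visual_budget // per_frame


# Assumed values for illustration only.
PATCHES_PER_SIDE = 24   # e.g. a 336px input with 14px patches
VISUAL_BUDGET = 2304    # tokens reserved for visual input

single_tile = tokens_per_frame(PATCHES_PER_SIDE)                # one tile, no pooling
grid_tiles = tokens_per_frame(PATCHES_PER_SIDE, tiles=5)        # 2x2 grid + base tile
pooled = tokens_per_frame(PATCHES_PER_SIDE, pool_stride=2)      # one tile, pooled

for name, cost in [("single tile", single_tile),
                   ("grid (5 tiles)", grid_tiles),
                   ("pooled", pooled)]:
    print(f"{name}: {cost} tokens/frame, {max_frames(VISUAL_BUDGET, cost)} frames fit")
```

Under these assumptions, a gridded high-resolution frame costs more tokens than the whole budget, while pooled single-tile frames leave room for many more frames, which matches the reasoning given above.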