magic-research / PLLaVA

Official repository for the paper PLLaVA

No grid configuration in PLLaVA

farewellthree opened this issue · comments

Hello, thanks for the great work! It seems that the PLLaVA code does not use the grid configuration of Dynamic High Resolution from LLaVA-1.6. Is this because it would not improve video performance, or is it simply to shorten the sequence length and improve efficiency?

Looking forward to the response 👀

Hi,

Sorry for the late response.

We are not sure whether the grid configuration would improve performance. We did not use dynamic high resolution in order to save compute. We also chose to fill the LLM's context length with more video frames rather than with higher per-frame resolution.
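As a rough illustration of that trade-off, here is a minimal sketch of the token arithmetic. The function name `max_frames` and all the numbers (a visual-token budget of 2304, 144 tokens per view, 5 views per frame for a 2x2 grid plus one base view) are hypothetical values chosen for the example, not PLLaVA's actual settings.

```python
def max_frames(token_budget: int, tokens_per_view: int, views_per_frame: int = 1) -> int:
    """Return how many whole frames fit into a visual-token budget.

    token_budget     -- tokens of LLM context reserved for visual input (assumed)
    tokens_per_view  -- visual tokens produced for one image view (assumed)
    views_per_frame  -- 1 without a grid; more when each frame is split
                        into high-resolution tiles plus a base view (assumed)
    """
    if token_budget < 0 or tokens_per_view <= 0 or views_per_frame <= 0:
        raise ValueError("budget must be >= 0 and per-frame costs must be > 0")
    return token_budget // (tokens_per_view * views_per_frame)


# Hypothetical numbers, for illustration only.
budget = 2304          # visual tokens available in the context window
per_view = 144         # tokens per view after pooling

# One view per frame: the whole budget goes to temporal coverage.
print(max_frames(budget, per_view))                      # 16

# A 2x2 grid plus one base view costs 5 views per frame.
print(max_frames(budget, per_view, views_per_frame=5))   # 3
```

Under these assumed numbers, a grid configuration would cut the number of frames that fit from 16 to 3, which is the kind of cost the reply above refers to.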