Questions about the evals
CrossLee1 opened this issue · comments
Thanks for your great work!
I have some questions about the evaluation results.
- TextVQA: LLaVA-1.5-13B scores 61.3 in the paper, but in the sheet https://docs.google.com/spreadsheets/d/1a5ImfdKATDI8T7Cwh6eH-bEsnQFzanFraFUgcS9KHWc/edit#gid=0 the result is only 48.73. Why?
- After executing the command as in the README, the generated MMBench submission file seems wrong. When I upload it to the evaluation server, I get the log: "Your excel file should have a column named A, please double check and submit again".
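For reference, here is a quick pre-upload sanity check I use on the submission sheet's header row. The required column set below is an assumption based on the MMBench dev excel layout (`index`, `question`, the option columns `A`–`D`, and `prediction`); it is not confirmed by the server docs:

```python
# Assumed column set for an MMBench submission sheet (illustrative, unverified).
REQUIRED_COLUMNS = {"index", "question", "A", "B", "C", "D", "prediction"}

def missing_columns(sheet_columns):
    """Return the required columns absent from the sheet's header row, sorted."""
    return sorted(REQUIRED_COLUMNS - set(sheet_columns))

# A sheet lacking the option columns would trigger the server's complaint:
print(missing_columns(["index", "question", "prediction"]))  # → ['A', 'B', 'C', 'D']
```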
Hope to get your response, thanks ~
I think it's because LLaVA reports the test split result, while we are reporting the val split.
Running textvqa produces both the val metric result and a submission file for the test split.
Users can then submit that file to https://eval.ai/web/challenges/challenge-page/874/ to get the test results.
Maybe @pufanyi could address this more clearly.
Hello! Thank you for your interest in our work! Regarding question 1: as referenced in the LLaVA evaluation code here, the OCR tokens are used as part of the prompt during evaluation.
However, in our evaluation of LLaVA, we did not incorporate the OCR tokens.
To reproduce the results in the LLaVA paper, you can set this option to true.
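To illustrate the difference, here is a minimal sketch of how the OCR tokens change the prompt. The function name and flag are hypothetical, not the actual lmms-eval API; the "Reference OCR token" format mirrors the LLaVA-style prompt:

```python
def build_textvqa_prompt(question, ocr_tokens, ocr_token=False):
    """Build a TextVQA prompt, optionally appending reference OCR tokens
    in a LLaVA-style format. (Illustrative sketch, not the real API.)"""
    if ocr_token and ocr_tokens:
        return f"{question}\nReference OCR token: {', '.join(ocr_tokens)}"
    return question

# With the flag off (our default), the model sees only the question;
# with it on, the OCR hints are appended, as in the LLaVA evaluation.
print(build_textvqa_prompt("What brand is the laptop?", ["dell", "inspiron"]))
print(build_textvqa_prompt("What brand is the laptop?", ["dell", "inspiron"], ocr_token=True))
```

This is why the two settings yield different scores: the OCR hints effectively hand the model candidate answer strings, which boosts TextVQA accuracy.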