Recaption bug.
guyuchao opened this issue
Hi, I tried the recaption in eval.sh. At line 220, single_test, the model accurately outputs the caption, as shown in the figure. But when captioning Inter4K, at line 134, the model's output just repeats the system prompt. I did not modify any part of the given code; I only set print_res=True.
If I set conv_mode=plain instead of eval_recaption in infer_caption, the model outputs the answer correctly. It seems the system prompt is not functioning correctly. Do you have any idea how to fix this bug?
With print_res, we print out everything the Language Model outputs at LM OUTPUT TEXT (including the prompt and user query). This is to monitor the correctness of inference. The real answer should be at the end of the LM output text, probably after "ASSISTANT:". Can you see that answer?
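For reference, a minimal sketch of how the final answer can be isolated from the printed LM output text. The function name and separator handling here are illustrative, not taken from the repo's code:

```python
def extract_answer(lm_output_text: str, sep: str = "ASSISTANT:") -> str:
    """Return only the text after the last occurrence of the separator.

    If the separator is absent (e.g. a different conv_mode), the whole
    output is returned unchanged so nothing is silently dropped.
    """
    if sep in lm_output_text:
        return lm_output_text.rsplit(sep, 1)[1].strip()
    return lm_output_text.strip()
```

If the printed text ends right after "ASSISTANT:" with nothing following, the model genuinely produced no answer rather than it being hidden in the log.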
Correct output by single_test:
Incorrect output by infer_recaption:
There is no answer in the LM output text, only some repetition of the system prompt.
Another bug I found: in your eval.sh (7B), you don't pass the conv_mode argument, so the different settings all use the same default conv_mode="eval_videoqabench".
Hi,
Correct, we can pass in conv_mode here, but each script also has a default conv mode, so using the default conv_mode is fine for evaluating the 7B and 13B models. The 34B model uses a different prompt template, though, so for it you have to use the Yi prompting.
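To illustrate the fallback behavior described above, here is a hypothetical argparse sketch; the flag name and default value come from this thread, not from the repo's actual code:

```python
import argparse

# Hypothetical sketch: when eval.sh omits --conv_mode, the script
# silently falls back to its hard-coded default.
parser = argparse.ArgumentParser()
parser.add_argument("--conv_mode", type=str, default="eval_videoqabench")

args = parser.parse_args([])  # no --conv_mode passed, as in the 7B eval.sh
print(args.conv_mode)  # the default is used for every setting
```

This is why the 7B/13B runs still work (their default matches), while a model that needs a different prompt template, like the 34B, would need the flag passed explicitly.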
That might be a bug then; I'll look into it.