magic-research / PLLaVA

Official repository for the paper PLLaVA

Recaption bug.

guyuchao opened this issue · comments

Hi, I tried the recaption in eval.sh. At line 220 (single_test), the model accurately outputs the caption, as shown in the figure below. But when captioning Inter4K (line 134), the model's output just repeats the system prompt. I didn't modify any part of the given code; I only set print_res=True.

Correct output by single_test:
Screenshot 2024-05-02 at 9:38:14 PM

Incorrect output by infer_recaption:
Screenshot 2024-05-02 at 9:37:56 PM

This is my inference script:
Screenshot 2024-05-02 at 9:56:50 PM

If I set conv_mode=plain instead of eval_recaption in infer_caption, the model outputs the answer correctly. It seems the system prompt is not functioning correctly. Do you have any idea how to fix this bug?
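As a sanity check on where the system prompt enters, here is a minimal sketch of how LLaVA-style conversation templates typically assemble the prompt from a conv_mode. The Conversation class, conv_templates dict, and get_prompt method are assumptions for illustration, not the actual PLLaVA code; the point is just that switching eval_recaption for plain swaps the system string that gets prepended.

```python
from dataclasses import dataclass, field

# Hypothetical LLaVA-style conversation template; names are illustrative,
# not the actual PLLaVA API.
@dataclass
class Conversation:
    system: str                       # system prompt prepended to every query
    roles: tuple = ("USER", "ASSISTANT")
    messages: list = field(default_factory=list)
    sep: str = " "

    def append_message(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def get_prompt(self) -> str:
        # A malformed system string for a given mode can make the model
        # echo the prompt back instead of answering, as observed above.
        parts = [self.system] if self.system else []
        for role, text in self.messages:
            parts.append(f"{role}: {text}" if text else f"{role}:")
        return self.sep.join(parts)

conv_templates = {
    "plain": Conversation(system=""),
    "eval_recaption": Conversation(system="Describe the video in detail."),
}

conv = conv_templates["eval_recaption"]
conv.append_message("USER", "<video> Describe this video.")
conv.append_message("ASSISTANT", "")  # empty slot the model should complete
print(conv.get_prompt())
```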

Another bug I found is in your eval.sh (7B): you don't pass the conv_mode argument, so the different settings all use the same default conv_mode="eval_videoqabench".

Screenshot 2024-05-02 at 10:27:32 PM

With print_res, we print out everything the language model outputs at LM OUTPUT TEXT (including the prompt and the user query). This is to monitor the correctness of inference. The real answer should be at the end of the LM output text, probably after "ASSISTANT:". Can you see that answer?
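For reference, a minimal sketch of pulling the answer out of that full decoded text. Only the "ASSISTANT:" separator comes from the comment above; the helper name and everything else are illustrative assumptions, not the actual PLLaVA code.

```python
def extract_answer(lm_output_text: str, sep: str = "ASSISTANT:") -> str:
    """Return only the model's answer from the full decoded output.

    print_res dumps the whole decoded sequence (system prompt, user query,
    and answer); the answer, if present, follows the last "ASSISTANT:".
    """
    if sep not in lm_output_text:
        return ""  # the model never produced an answer turn
    # Split on the last occurrence in case the prompt itself contains the marker.
    return lm_output_text.rsplit(sep, 1)[-1].strip()

full = "SYSTEM ... USER: <video> Describe the video. ASSISTANT: A dog runs on a beach."
print(extract_answer(full))  # -> A dog runs on a beach.
```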

There is no answer in the LM output text, only some repetition of the system prompt.

Hi,

Regarding the second bug: correct, conv_mode can be passed in here, but each script also has its own default conv_mode, so using the default is fine when evaluating the 7B and 13B models. The 34B model, however, uses a different prompt format, so it has to use the Yi prompting.
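To make the default-argument point concrete, here is a hypothetical sketch of the interaction: the --conv_mode flag and the eval_videoqabench default are taken from this thread, everything else is assumed.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--conv_mode",
    type=str,
    default="eval_videoqabench",  # per-script default; fine for 7B/13B video QA
    help="conversation template, e.g. eval_recaption for recaptioning, "
         "or a Yi-style template for the 34B model",
)

# eval.sh (7B) passes no --conv_mode, so every task gets the default:
print(parser.parse_args([]).conv_mode)  # eval_videoqabench
# What the recaption task should receive instead:
print(parser.parse_args(["--conv_mode", "eval_recaption"]).conv_mode)
```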

That might be a bug then; I'll look into it.

Fixed in #15. Please check whether it correctly fixes this problem.

Fixed, thank you.