HaozheZhao / MIC

MMICL, a state-of-the-art VLM with in-context learning ability, from PKU



sp_token=32110

xie-qiang opened this issue

Hello, I found that SP_TOKEN should be set to 32110 in the demo; otherwise the image token cannot be replaced, which leads to poor results. Thank you!

outputs = model.generate(
        pixel_values=inputs['pixel_values'],
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        img_mask=inputs['img_mask'],
        do_sample=False,
        max_length=50,
        min_length=1,
        set_min_padding_size=False,
        sp_token=32110,
)
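
For reference, instead of hardcoding 32110 you can look the ID up from the tokenizer. A minimal sketch, assuming the standard HuggingFace tokenizer API; the exact placeholder string depends on how the MMICL processor registers its image token, so "<image>" below is only an assumption:

# Hypothetical lookup: "<image>" is an assumed placeholder string; use
# whatever special token the MMICL demo actually registers for images.
sp_token_id = processor.tokenizer.convert_tokens_to_ids("<image>")
print(sp_token_id)  # should print 32110 if the demo vocabulary matches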

Hello, I tried the method you mentioned, but encountered an error. Do you have any suggestions? Thank you very much!

Here is the full error message:

shape mismatch leads to truncate. insert embedding tensor of shape torch.Size([96, 4096]) cannot be broadcast to replace placeholder of shape torch.Size([0, 4096])

{
	"name": "RuntimeError",
	"message": "torch.cat(): expected a non-empty list of Tensors",
	"stack": "---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[10], line 17
     14 inputs['pixel_values'] = inputs['pixel_values'].unsqueeze(0)
     16 inputs = inputs.to('cuda:0')
---> 17 outputs = model.generate(
     18         pixel_values = inputs['pixel_values'],
     19         input_ids = inputs['input_ids'],
     20         attention_mask = inputs['attention_mask'],
     21         img_mask = inputs['img_mask'],
     22         do_sample=False,
     23         max_length=50,
     24         min_length=1,
     25         set_min_padding_size =False,
     26         sp_token = 32110
     27 )
     28 generated_text = processor.batch_decode(outputs, skip_special_tokens=True)[0].strip()
     29 print(generated_text)

File ~/anaconda3/envs/mmicl/lib/python3.8/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/hjz/harmful_meme_detection/mmicl/model/instructblip/modeling_instructblip.py:2129, in InstructBlipForConditionalGeneration.generate(self, pixel_values, qformer_input_ids, qformer_attention_mask, input_ids, attention_mask, img_mask, set_min_padding_size, sp_token, **generate_kwargs)
   2126         index+= i_count*img_token_szie
   2127     img_idx +=1
-> 2129 insert_embeds = torch.concat(insert_embeds_list, dim=0)
   2130 try:
   2131     inputs_embeds[image_embeds_index] = insert_embeds

RuntimeError: torch.cat(): expected a non-empty list of Tensors"
}
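
The warning above says the model built an insert tensor of 96 image-embedding rows but found 0 placeholder positions in input_ids to receive them, so insert_embeds_list stays empty and torch.cat() fails. A quick diagnostic sketch, assuming inputs is the same dict passed to generate (the 96 rows would correspond to, e.g., 3 images × 32 query tokens each, though that split is an assumption):

# Diagnostic sketch: count the positions in input_ids equal to sp_token.
# Zero matches means 32110 is not the placeholder ID this processor
# actually emits, so the image embeddings have nowhere to go.
sp_token = 32110
n_placeholders = (inputs['input_ids'] == sp_token).sum().item()
print(n_placeholders)  # the error implies this prints 0; 96 is expected

If this prints 0, the token ID and the processor vocabulary are out of sync, which usually means the demo code and the checkpoint expect different special tokens.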