TypeError, IndexError, ValueError occurred
UESENPAI opened this issue
Greetings, and thank you for your nice transformer visualization code.
I am trying to visualize my own model but am stuck on some errors.
model_name = 'vit_base_patch16_224'
model = timm.create_model(model_name, pretrained=True, num_classes=3)
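After fine-tuning on my 3 classes, I reload the weights roughly like this (a minimal sketch; the checkpoint path is a placeholder for my own file):
import torch

model.load_state_dict(torch.load('vit_finetuned.pth'))  # placeholder for my checkpoint
model.eval()  # inference mode for the visualization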
With the trained model, I used your library.
I have put the vit_explain.py, vit_grad_rollout.py, and vit_rollout.py files into the same directory as my main Python file,
and wrote the following:
from vit_grad_rollout import VITAttentionGradRollout
grad_rollout = VITAttentionGradRollout(model, discard_ratio=0.9, head_fusion='max')
mask = grad_rollout(input_tensor=inputs, category_index=243)
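For context, inputs comes from a standard preprocessing pipeline; here is a minimal sketch of how I build it (the image path is a placeholder, and I assumed the usual ImageNet normalization):
from PIL import Image
from torchvision import transforms

# Standard 224x224 ViT preprocessing (assumed ImageNet statistics)
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open('my_image.jpg').convert('RGB')  # placeholder path
inputs = preprocess(image)                         # shape: [3, 224, 224]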
Running the grad_rollout call above, I got the error below:
TypeError Traceback (most recent call last)
in <cell line: 4>()
2 grad_rollout = VITAttentionRollout(model, discard_ratio = 0.8, head_fusion='max')
3 new_image = class_images[0].unsqueeze(0)
----> 4 mask = grad_rollout(input_tensor = new_image, category_index=3)
TypeError: VITAttentionRollout.__call__() got an unexpected keyword argument 'category_index'
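To understand the message itself, I reproduced the same kind of TypeError with a toy class (a minimal sketch, unrelated to your code):
class Toy:
    def __call__(self, input_tensor):
        return input_tensor

toy = Toy()
toy(input_tensor=1, category_index=3)
# TypeError: Toy.__call__() got an unexpected keyword argument 'category_index'
So the error seems to mean that the object I am calling has a __call__ that does not accept category_index at all.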
I have checked VITAttentionRollout.__call__(), but it is well defined and I didn't change anything...
def __call__(self, input_tensor, category_index):
I don't know why this error happens.
To work around it, I overrode your VITAttentionRollout.__call__() like below:
def __call__(self, input_tensor, category_index=3):
and changed the mask call accordingly:
mask = grad_rollout(input_tensor=inputs)
Then the TypeError was gone, but a ValueError came out:
ValueError Traceback (most recent call last)
in <cell line: 4>()
2 grad_rollout = VITAttentionRollout(model, discard_ratio = 0.8, head_fusion='max')
3 new_image = class_images[0]
----> 4 mask = grad_rollout(input_tensor = new_image)
5 frames
in __call__(self, input_tensor)
17 self.attentions = []
18 with torch.no_grad():
---> 19 output = self.model(input_tensor)
20
21 return rollout(self.attentions, self.discard_ratio, self.head_fusion)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.10/dist-packages/timm/models/vision_transformer.py in forward(self, x)
650
651 def forward(self, x):
--> 652 x = self.forward_features(x)
653 x = self.forward_head(x)
654 return x
/usr/local/lib/python3.10/dist-packages/timm/models/vision_transformer.py in forward_features(self, x)
631
632 def forward_features(self, x):
--> 633 x = self.patch_embed(x)
634 x = self._pos_embed(x)
635 x = self.patch_drop(x)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.10/dist-packages/timm/layers/patch_embed.py in forward(self, x)
67
68 def forward(self, x):
---> 69 B, C, H, W = x.shape
70 if self.img_size is not None:
71 if self.strict_img_size:
ValueError: not enough values to unpack (expected 4, got 3)
My input's size is [3, 224, 224].
It seems to be a problem with the tensor shape, so to solve it I applied unsqueeze(0) to the input:
inputs = inputs.unsqueeze(0)
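To double-check the shape handling, I verified the change with a random tensor in place of my image (a minimal sketch):
import torch

inputs = torch.randn(3, 224, 224)  # a single image: [C, H, W]
batch = inputs.unsqueeze(0)        # add a batch dimension: [1, 3, 224, 224]
print(batch.shape)                 # torch.Size([1, 3, 224, 224])

# timm's patch_embed unpacks four values (B, C, H, W),
# so a 3-D tensor raises the ValueError above.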
With the batched input, I got the error below:
IndexError Traceback (most recent call last)
in <cell line: 4>()
2 grad_rollout = VITAttentionRollout(model, discard_ratio = 0.8, head_fusion='max')
3 new_image = class_images[0].unsqueeze(0)
----> 4 mask = grad_rollout(input_tensor = new_image)
1 frames
in __call__(self, input_tensor)
19 output = self.model(input_tensor)
20
---> 21 return rollout(self.attentions, self.discard_ratio, self.head_fusion)
in rollout(attentions, discard_ratio, head_fusion)
32 # Look at the total attention between the class token,
33 # and the image patches
---> 34 mask = result[0, 0 , 1 :] # [196,196]
35 # In case of 224x224 image, this brings us from 196 to 14
36 width = int(mask.size(-1)**0.5)
IndexError: too many indices for tensor of dimension 2
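For reference, here is my understanding of the shapes that the indexing in the traceback expects (a minimal sketch with a hypothetical identity result; for vit_base_patch16_224, 224 / 16 = 14, so 14 * 14 = 196 patches plus 1 class token):
import torch

# Hypothetical rollout result for a single image: [1, 197, 197]
result = torch.eye(197).unsqueeze(0)

mask = result[0, 0, 1:]            # class-token attention to the 196 patches
width = int(mask.size(-1) ** 0.5)  # sqrt(196) = 14
mask = mask.reshape(width, width)  # [14, 14]
print(mask.shape)                  # torch.Size([14, 14])

# In my run, result apparently had only 2 dimensions,
# which is why result[0, 0, 1:] raised the IndexError.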
What was the size of your inputs?