jacobgil / vit-explain

Explainability for Vision Transformers

Repository from Github https://github.com/jacobgil/vit-explain

Cannot use block.attn.fused_attn = False in another ViT model

kristosh opened this issue · comments

I am trying to run the code for another ViT model, and more specifically:

    import torch
    import torchvision

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Get pretrained weights for ViT-Base
    pretrained_vit_weights = torchvision.models.ViT_B_16_Weights.DEFAULT  # requires torchvision >= 0.13; "DEFAULT" means best available
    pretrained_vit = torchvision.models.vit_b_16(weights=pretrained_vit_weights).to(device)

    # pretrained_vit_1 = torch.hub.load('facebookresearch/deit:main', 'deit_tiny_patch16_224', pretrained=True).to(device)

In this model I have noticed that I cannot use the following code:

    for block in model.blocks:
        block.attn.fused_attn = False
            

This fails because the torchvision model does not have the same structure as the deit_tiny_patch16_224 one, and I am not sure how to make the same fused_attn change in this model. Can you explain what this code does and why it produces different results when I comment it out?
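For context on what the flag changes: in timm-style models, `fused_attn = True` routes attention through `torch.nn.functional.scaled_dot_product_attention`, which never materializes the attention matrix, so the hooks vit-explain uses to read attention maps see nothing; setting it to `False` falls back to the explicit `softmax(QK^T / sqrt(d)) @ V` path where the map exists as a tensor. A minimal sketch of the two paths (the `attention` helper here is illustrative, not timm's actual code):

```python
import torch
import torch.nn.functional as F

def attention(q, k, v, fused: bool):
    """Toy illustration of the two paths timm's `fused_attn` flag selects.

    fused=True  -> F.scaled_dot_product_attention: fast fused kernel, but the
                   attention matrix is never materialized, so forward hooks
                   cannot capture it.
    fused=False -> explicit softmax(q @ k^T / sqrt(d)) @ v: the attention map
                   exists as a tensor, which is what attention-rollout-style
                   explanations need to read out.
    """
    if fused:
        return F.scaled_dot_product_attention(q, k, v), None
    scale = q.shape[-1] ** -0.5
    attn = (q @ k.transpose(-2, -1) * scale).softmax(dim=-1)
    return attn @ v, attn

# (batch, heads, tokens, head_dim)
q = torch.randn(1, 4, 8, 16)
k = torch.randn(1, 4, 8, 16)
v = torch.randn(1, 4, 8, 16)

out_fused, _ = attention(q, k, v, fused=True)
out_manual, attn_map = attention(q, k, v, fused=False)

# Both paths compute the same output; only the unfused one exposes the map.
assert torch.allclose(out_fused, out_manual, atol=1e-5)
assert attn_map.shape == (1, 4, 8, 8)
```

Note that torchvision's `vit_b_16` has no `fused_attn` flag at all: its blocks live under `model.encoder.layers` and each wraps a standard `torch.nn.MultiheadAttention` (`self_attention`), so capturing attention maps there needs a different approach (e.g. calling it with `need_weights=True`).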