Why is the Swin-Transformer model only outputting 2nd features onwards?
sarmientoj24 opened this issue · comments
I saw this and I was wondering why it is not outputting all features.
Is there a way to use all four?
@sarmientoj24 outs includes [/4, /8, /16, /32]; [1:] means return the 2nd to the last
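To make the slicing concrete, here is a minimal sketch in plain Python. The strings are only stand-ins for the four stage tensors, labelled by stride; they are not the repo's actual objects.

```python
# Stand-ins for the four stage outputs, labelled by their downsampling stride.
outs = ["/4", "/8", "/16", "/32"]

# [1:] drops index 0 and keeps everything from index 1 to the end.
returned = outs[1:]

print(returned)       # ['/8', '/16', '/32']
print(len(returned))  # 3
```

So `outs[1:]` yields three feature maps (/8, /16, /32), not two.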
So it only outputs two feature pyramids, is that right?
When I tried it, it output this:
swin = swin_transformer(version='tiny')
_ = swin.to('cuda')
swin_x = swin(sample)
swin_x[0].shape, swin_x[1].shape, swin_x[2].shape
(torch.Size([1, 192, 80, 80]),
torch.Size([1, 384, 40, 40]),
torch.Size([1, 768, 20, 20]))
What!!!!! That's a big bug, I will check it.
thanks
Also, is it possible to use all of the feature pyramids it outputs for the neck and YOLO heads?
of course
outs of swin:
```
len: 4
torch.Size([24, 96, 160, 160])
torch.Size([24, 192, 80, 80])
torch.Size([24, 384, 40, 40])
torch.Size([24, 768, 20, 20])
```
There is no bug. The original outs is a list with four elements, [/4, /8, /16, /32]; I only use the 2nd to the last.
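The four shapes above follow from the strides. As a rough sketch, assuming a 640x640 input (inferred from 160 = 640 / 4, not stated in the thread) and that stage i has stride 4 * 2**i and 96 * 2**i channels, the shapes can be reproduced without running the model. `swin_stage_shapes` is a hypothetical helper written for illustration, not a function from this repo.

```python
def swin_stage_shapes(batch, height, width, embed_dim=96, num_stages=4):
    """Return (batch, channels, h, w) per stage.

    Assumes stage i downsamples by 4 * 2**i and has embed_dim * 2**i channels.
    """
    shapes = []
    for i in range(num_stages):
        stride = 4 * 2 ** i
        channels = embed_dim * 2 ** i
        shapes.append((batch, channels, height // stride, width // stride))
    return shapes


# Assumed input size: 640x640 with a batch of 24.
shapes = swin_stage_shapes(batch=24, height=640, width=640)
for shape in shapes:
    print(shape)
# (24, 96, 160, 160)
# (24, 192, 80, 80)
# (24, 384, 40, 40)
# (24, 768, 20, 20)
```

The last three entries are the ones `outs[1:]` returns.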
I think that is true, there are 4 elements there. But this
return outs[1:]
just returns three, which are these:
torch.Size([24, 192, 80, 80])
torch.Size([24, 384, 40, 40])
torch.Size([24, 768, 20, 20])
My follow-up question is why not output all of them on a forward pass? Is it because torch.Size([24, 96, 160, 160])
is a large feature map? From what I know, YOLOv5 has a P6 version that takes four feature maps. Is the fourth output meant for that?
In this repo I only use three detection layers, so I return three. Yes, you can think of 160x160 as P2. P6 needs four and P7 needs five. You need to implement it yourself; you can refer to this #126 (comment).
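One possible way to make the number of returned levels configurable is to index the stage outputs instead of hard-coding `outs[1:]`. This is only a sketch: `select_levels` and `out_indices` are hypothetical names, not part of this repo, and the strings stand in for the real tensors.

```python
def select_levels(outs, out_indices=(1, 2, 3)):
    """Pick backbone stages by index.

    (1, 2, 3) mimics outs[1:]; (0, 1, 2, 3) keeps all four stages.
    """
    return [outs[i] for i in out_indices]


# Stand-ins for the four stage outputs, labelled by stride.
outs = ["/4", "/8", "/16", "/32"]

three = select_levels(outs)                          # same as outs[1:]
four = select_levels(outs, out_indices=(0, 1, 2, 3))  # all four stages

print(three)  # ['/8', '/16', '/32']
print(four)   # ['/4', '/8', '/16', '/32']
```

Returning four levels from the backbone is not enough on its own. The neck and the detection heads would also need to be extended to accept the extra level, which is the part that has to be implemented separately.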