Bobo-y / flexible-yolov5

I saw this and I was wondering why it is not outputting all features.

flexible-yolov5/od/models/backbone/swin_transformer.py

Line 620 in 8f8719e

return outs[1:]

Is there a way to use all four?

@sarmientoj24 outs contains the features at strides [/4, /8, /16, /32]; [1:] means return the 2nd through the last.
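A minimal sketch of what that slice does, using plain strings as stand-ins for the four feature tensors (the labels are illustrative, not names from the repo):

```python
# Stand-ins for the four Swin stage outputs, labelled by downsampling stride.
outs = ["stride4", "stride8", "stride16", "stride32"]

# outs[1:] drops only the first element and keeps everything after it,
# so a four-element list yields three elements.
selected = outs[1:]

print(len(selected))  # 3
print(selected)       # ['stride8', 'stride16', 'stride32']
```

So the slice keeps three of the four maps; only the /4 map is discarded.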

so it only outputs two feature pyramids, is that what it is?

When I tried it, it output this:

swin = swin_transformer(version='tiny')
_ = swin.to('cuda')
swin_x = swin(sample)
swin_x[0].shape, swin_x[1].shape, swin_x[2].shape

(torch.Size([1, 192, 80, 80]),
 torch.Size([1, 384, 40, 40]),
 torch.Size([1, 768, 20, 20]))

What! That's a big bug, I will check it.

thanks

also, is it possible to use all of the feature pyramids that it is outputting for the neck and YOLO heads?

of course

#123 (comment)

Outputs of swin:

len: 4
torch.Size([24, 96, 160, 160])
torch.Size([24, 192, 80, 80])
torch.Size([24, 384, 40, 40])
torch.Size([24, 768, 20, 20])

There is no bug. The original outs is a list with four elements, [/4, /8, /16, /32]; I only use the 2nd through the last.
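The four shapes above can be reproduced with simple arithmetic. This is a sketch under two assumptions that the thread does not state: the input is 640 x 640 (inferred from 160 = 640 / 4), and the channel width starts at 96 and doubles at each stage, as in Swin-T. The function name `swin_stage_shapes` is hypothetical and is not part of the repo.

```python
def swin_stage_shapes(batch, image_size=640, embed_dim=96, strides=(4, 8, 16, 32)):
    """Return the expected (N, C, H, W) shape of each backbone stage output."""
    shapes = []
    for stage, stride in enumerate(strides):
        channels = embed_dim * 2 ** stage   # 96, 192, 384, 768
        side = image_size // stride         # 160, 80, 40, 20
        shapes.append((batch, channels, side, side))
    return shapes


all_shapes = swin_stage_shapes(batch=24)
for shape in all_shapes:
    print(shape)

# The three shapes that remain after outs[1:]
returned_shapes = all_shapes[1:]
```

Under those assumptions, `all_shapes` matches the four sizes printed for outs, and `returned_shapes` matches the three maps the backbone actually returns.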

I think that is true: there are 4 elements there. But this

 return outs[1:]

just returns three which are these

torch.Size([24, 192, 80, 80])

torch.Size([24, 384, 40, 40])

torch.Size([24, 768, 20, 20])

My follow-up question is: why not output all of them on a forward pass, given that torch.Size([24, 96, 160, 160]) is a large feature map? From what I know, YOLOv5 has a P6 version that takes four feature maps. Is the fourth output meant for that?

This repo only uses three detection layers, so I return three. Yes, you can think of the 160 x 160 map as P2. P6 needs four, and P7 needs five. You need to implement it yourself; you can refer to #126 (comment).
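One way to make the number of returned maps configurable, as a rough sketch only: the helper `select_pyramid_levels` below is hypothetical and not part of the repo, and the neck and detection head would also have to be adapted to accept the extra map. It assumes the backbone list is ordered from shallowest to deepest, as in [/4, /8, /16, /32].

```python
def select_pyramid_levels(outs, num_levels=3):
    """Return the num_levels deepest feature maps from a shallow-to-deep list."""
    if not 1 <= num_levels <= len(outs):
        raise ValueError(
            f"num_levels must be between 1 and {len(outs)}, got {num_levels}"
        )
    # Keep the tail of the list: num_levels=3 is equivalent to outs[1:]
    # when the backbone produces four maps.
    return list(outs[len(outs) - num_levels:])


# Strides stand in for the actual feature tensors.
strides = [4, 8, 16, 32]
three_levels = select_pyramid_levels(strides, num_levels=3)
four_levels = select_pyramid_levels(strides, num_levels=4)
print(three_levels)
print(four_levels)
```

With `num_levels=3` this reproduces the current behaviour; with `num_levels=4` the /4 map is included as well. Asking for five levels raises an error here because the backbone only produces four maps, so a fifth level would need an additional downsampling stage.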

Why is the Swin-Transformer model only outputting 2nd features onwards?