What's the output features shape for regression?

Question

becauseofAI opened this issue 5 years ago

Which of the following output feature tensor shapes is correct?
Take batch size N, input size 512, output stride R = 4, and C = 80 classes (COCO) as an example:
(N, 128, 128, 80, 2, 2) or (N, 128, 128, 80 + 2 + 2)?

xingyizhou/CenterNet#196

see-- · Answer 1 · Fri Jul 12 2019 19:15:09 GMT+0800 (China Standard Time)

Both are wrong. Read the code or the paper. You have 3 outputs. Centers are (N, 128, 128, 80).

becauseofAI · Answer 2 · Fri Jul 12 2019 22:02:25 GMT+0800 (China Standard Time)

@see--
Centers are (N, 128, 128, 80)
Widths and heights at the centers are (N, 128, 128, 2)
Offsets of the centers are (N, 128, 128, 2)
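
The three shapes listed above follow directly from the input size, the output stride R, and the class count C. The sketch below only does that arithmetic; it is not code from the CenterNet repository. The function name `centernet_head_shapes` and the batch size of 8 are made up for illustration, and the axis order is the channels-last order used in this thread (a given framework may order the axes differently).

```python
def centernet_head_shapes(batch, input_size, stride, num_classes):
    """Return the channels-last shape of each of the three output maps."""
    out = input_size // stride  # spatial size after downsampling by R
    return {
        # One center heatmap per class.
        "heatmap": (batch, out, out, num_classes),
        # Width and height at each location, with no class axis.
        "size": (batch, out, out, 2),
        # Sub-pixel x/y correction at each location, with no class axis.
        "offset": (batch, out, out, 2),
    }


# Hypothetical batch size of 8; the other values come from the question.
shapes = centernet_head_shapes(batch=8, input_size=512, stride=4, num_classes=80)
for name, shape in shapes.items():
    print(name, shape)
```

Only the heatmap has a class axis. The size and offset maps hold a single pair of values per location, which is what the follow-up question is about.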

So when the centers of two objects overlap, whether they belong to the same category or to different ones, only one box can be predicted?

For example, suppose a cat and an elephant share the same center but differ greatly in size. Can the model then predict only one of them, either the elephant or the cat, and not both at once?

see-- · Answer 3 · Fri Jul 12 2019 22:32:45 GMT+0800 (China Standard Time)

The paper has some great sections answering your questions. In short: you are right, but the important part is that this rarely happens. You get far fewer collisions (lost boxes) with CenterNet than with any other approach.
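
To make the collision concrete, here is a simplified sketch of target encoding, assuming the shapes quoted earlier in the thread: a per-class heatmap and a size map with no class axis. It is not the repository's implementation. It marks only the single center cell instead of drawing a Gaussian, keeps sizes in input pixels, and the helper `encode_targets`, the class ids, and the box values are all hypothetical.

```python
def encode_targets(objects, stride=4):
    """Encode (class_id, cx, cy, w, h) boxes, given in input pixels.

    Returns the set of heatmap peaks, the size map, and the number of
    size entries that were overwritten by a later object.
    """
    heatmap_peaks = set()  # (class_id, x, y): the heatmap has a class axis
    size_map = {}          # (x, y) -> (w, h): the size map has no class axis
    overwritten = 0
    for class_id, cx, cy, w, h in objects:
        cell = (int(cx // stride), int(cy // stride))
        heatmap_peaks.add((class_id, cell[0], cell[1]))
        if cell in size_map:
            overwritten += 1  # the earlier object's size is lost
        size_map[cell] = (w, h)
    return heatmap_peaks, size_map, overwritten


# Hypothetical boxes: a small "cat" (class 0) and a large "elephant" (class 1)
# whose centers fall into the same output cell after dividing by the stride.
cat = (0, 256, 256, 60, 40)
elephant = (1, 257, 258, 400, 300)

peaks, sizes, lost = encode_targets([cat, elephant])
print(sorted(peaks))  # two peaks, one per class, at cell (64, 64)
print(sizes)          # a single size entry, holding the elephant's size
print(lost)           # 1
```

In this sketch both classes still get a heatmap peak, but the cell can store only one width and height, so the two objects cannot both be recovered with their correct boxes. If the two objects were of the same class, the peaks would merge as well.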