dk-liang / TransCrowd

[SCIS] TransCrowd: Weakly-Supervised Crowd Counting with Transformers

Training token model.... where is the regression token?

JavierUrenaPhDProjects opened this issue

I'm playing around with this model to see exactly how it works at the code level. As far as I know, the 'token' model adds a regression token to the input sequence Z0 for counting, so the sequence has length HW/K² + 1 (K being the patch size and H, W the image height and width, so HW/K² is the number of patches). But I can't identify the explicit difference between the inputs to the 'token' and 'gap' regression heads in the code.

Could you explain in more detail how and where this "regression token" is created, and what exactly it is? The paper does not give enough information about it.

Please see ViT for more detail: the regression token is analogous to ViT's class token.
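For readers with the same question, here is a minimal NumPy sketch of the general ViT-style idea, not the repository's actual code. It assumes a 384×384 input, a patch size of 16, a tiny embedding dimension, and an identity stand-in for the Transformer encoder; the names `build_sequence` and `head_input` are hypothetical. It only illustrates how a prepended token changes the sequence length from HW/K² to HW/K² + 1 and how the two heads could pick their input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes for illustration only (not taken from the repository).
H, W, K, D = 384, 384, 16, 8       # image height/width, patch size, embedding dim
N = (H * W) // (K * K)             # number of patches: HW / K^2


def build_sequence(patch_embeddings, reg_token=None):
    """Return Z0: the patch sequence, optionally with one extra token prepended."""
    if reg_token is None:
        return patch_embeddings                      # shape (B, N, D)
    batch = patch_embeddings.shape[0]
    # The same token is shared by every image in the batch.
    token = np.broadcast_to(reg_token, (batch, 1, reg_token.shape[-1]))
    return np.concatenate([token, patch_embeddings], axis=1)  # shape (B, N + 1, D)


def head_input(encoded, mode):
    """Select what the regression head sees from the encoder output."""
    if mode == "token":
        return encoded[:, 0]           # output at the prepended token's position
    if mode == "gap":
        return encoded.mean(axis=1)    # global average pooling over the sequence
    raise ValueError(f"unknown mode: {mode}")


patches = rng.standard_normal((2, N, D))   # fake patch embeddings for 2 images
reg_token = np.zeros((1, 1, D))            # stands in for a learnable parameter

z0_token = build_sequence(patches, reg_token)
z0_gap = build_sequence(patches)

# The encoder is omitted here (identity), so shapes are all that is shown.
print(z0_token.shape, head_input(z0_token, "token").shape)
print(z0_gap.shape, head_input(z0_gap, "gap").shape)
```

In a real model the token would be a trainable parameter that attends to all patches inside the encoder, so its output position summarizes the whole image; in both variants the head receives one D-dimensional vector per image.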