facebookresearch / mvit

Code Release for MViTv2 on Image Recognition.

Repository from GitHub: https://github.com/facebookresearch/mvit

spatial-temporal relative positional embedding?

JunweiLiang opened this issue

Any suggestions on how to implement spatial-temporal relative positional embeddings? I'm trying to extend the cal_rel_pos_spatial function in attention.py.

Thanks,
Junwei

Hi, you can follow cal_rel_pos_spatial to add a separate temporal rel pos embedding.
Our official spatial-temporal rel pos embedding will be released in PySlowFast (our video codebase) soon together with the MViTv2 video models.
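
For readers who want to try this before the official release, here is a minimal sketch of what a separate temporal term could look like. It is not the PySlowFast implementation. The function name cal_rel_pos_temporal, the rel_pos_t table and its (2 * max(q_t, k_t) - 1, dim) layout are assumptions made for illustration, and the argument order is only modeled on cal_rel_pos_spatial. It also assumes tokens are flattened in (t, h, w) order with time as the slowest axis.

```python
import torch


def cal_rel_pos_temporal(attn, q, has_cls_embed, q_shape, k_shape, rel_pos_t):
    """Add a temporal relative positional bias to attention logits (sketch).

    attn:      (B, n_head, q_N, k_N) attention logits, modified in place
    q:         (B, n_head, q_N, dim) queries
    q_shape:   (q_t, q_h, q_w) token grid of the queries
    k_shape:   (k_t, k_h, k_w) token grid of the keys
    rel_pos_t: (2 * max(q_t, k_t) - 1, dim) learnable table (assumed layout)
    """
    sp_idx = 1 if has_cls_embed else 0  # skip the class token if present
    q_t, q_h, q_w = q_shape
    k_t, k_h, k_w = k_shape

    # Rescale indices when queries and keys have different temporal lengths.
    q_t_ratio = max(k_t / q_t, 1.0)
    k_t_ratio = max(q_t / k_t, 1.0)
    dist_t = (
        torch.arange(q_t)[:, None] * q_t_ratio
        - torch.arange(k_t)[None, :] * k_t_ratio
    )
    dist_t += (k_t - 1) * k_t_ratio  # shift so the smallest index is 0
    Rt = rel_pos_t[dist_t.long()]  # (q_t, k_t, dim)

    B, n_head, _, dim = q.shape
    r_q = q[:, :, sp_idx:].reshape(B, n_head, q_t, q_h, q_w, dim)
    # Dot product of each query with the embedding of every temporal offset.
    rel_t = torch.einsum("bythwc,tkc->bythwk", r_q, Rt)

    # Broadcast the temporal bias over the key's spatial positions.
    attn[:, :, sp_idx:, sp_idx:] = (
        attn[:, :, sp_idx:, sp_idx:].reshape(
            B, n_head, q_t, q_h, q_w, k_t, k_h, k_w
        )
        + rel_t[:, :, :, :, :, :, None, None]
    ).reshape(B, n_head, q_t * q_h * q_w, k_t * k_h * k_w)
    return attn
```

In a video attention block this would presumably be called alongside the spatial term, with rel_pos_t registered as an nn.Parameter next to the height and width tables. The spatial function would need the same extra time axis in its reshapes before the two can share one attention tensor.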