rayguan97 / M3DETR

Code base for M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers

Questions on positional embeddings

DianCh opened this issue · comments

Hi! Can you help with a few questions on positional embeddings:

  1. Did you apply any positional embedding in the Transformer attention?
  2. If so, how did you design the positional embeddings for the different types of representations and the multi-scale features, and how did you apply them (i.e., how should the formula be written)?
  3. If not, what was the consideration behind that choice? Why is it not needed?

Thank you! I look forward to your reply and the code release.

Thanks for your interest in our work. We apologize for the late reply.

We don't use positional encoding in our attention module. We believe that positional embeddings would very likely improve performance, but our work mainly focuses on using the transformer as a fusion method and on making the whole architecture work for the 3D object detection task.
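
For illustration only, below is a minimal PyTorch sketch of how a learned per-representation embedding could be added to the fused tokens before the attention step. This is not taken from the M3DETR code; the module name `FusionAttentionWithPE` and parameters such as `n_sources` and `source_embed` are made up for this example, and it simply uses `nn.MultiheadAttention` as a stand-in for the fusion transformer.

```python
import torch
import torch.nn as nn


class FusionAttentionWithPE(nn.Module):
    """Hypothetical sketch: self-attention over fused tokens from several
    representations/scales, with an optional learned embedding per token source.
    Not the M3DETR implementation."""

    def __init__(self, d_model=256, n_heads=4, n_sources=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # One learned embedding per feature source (e.g., raw points, voxels, BEV).
        self.source_embed = nn.Embedding(n_sources, d_model)

    def forward(self, tokens, source_ids, use_pe=True):
        # tokens:     (B, N, d_model) fused features from all representations/scales
        # source_ids: (N,) integer id of the representation each token came from
        if use_pe:
            tokens = tokens + self.source_embed(source_ids)  # broadcast over batch
        out, _ = self.attn(tokens, tokens, tokens)
        return out


if __name__ == "__main__":
    B, N, d = 2, 6, 256
    x = torch.randn(B, N, d)
    ids = torch.tensor([0, 0, 1, 1, 2, 2])  # which representation each token belongs to
    block = FusionAttentionWithPE(d_model=d, n_heads=4, n_sources=3)
    y_pe = block(x, ids)                    # with the extra source embedding
    y_plain = block(x, ids, use_pe=False)   # as in the paper: no positional encoding
    print(y_pe.shape, y_plain.shape)
```

With `use_pe=False` the block reduces to plain attention over the fused features, which matches the setting described above; the embedding path is only a guess at one way positional or source information could be injected.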