yfeng95 / PoseGPT

Clarification on obtaining the embedding related to the <POSE> token

AndrejHafner opened this issue · comments

Hello! First of all, thank you for the great article. I have a question about how you obtain the embedding related to the <POSE> token, which is then projected and used for human pose reconstruction. If I understand correctly, when the model outputs a <POSE> token, you take the logits from the last layer of the LLM (to which softmax was applied and from whose resulting distribution the token was sampled) and use those as the embedding?

commented

I think it's the last-layer hidden state (hidden_states, before the logits) corresponding to the <POSE> token, not the logits themselves. You can reference LISA: https://github.com/dvlab-research/LISA.
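For illustration, here is a minimal sketch of that pattern using HuggingFace transformers, in the spirit of LISA. The model name is a placeholder, and the projection head at the end is hypothetical; this is not the authors' code, just what "take the last-layer hidden state at the <POSE> token position" typically looks like:

```python
# Minimal sketch (not the PoseGPT implementation): extract the final-layer
# hidden state for a special <POSE> token from a HuggingFace causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "some/llm"  # placeholder, not the actual PoseGPT backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.add_special_tokens({"additional_special_tokens": ["<POSE>"]})
model = AutoModelForCausalLM.from_pretrained(model_name)
model.resize_token_embeddings(len(tokenizer))  # account for the new token

pose_token_id = tokenizer.convert_tokens_to_ids("<POSE>")

inputs = tokenizer("Describe the person's pose: <POSE>", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states[-1] is the output of the final transformer layer, i.e.
# before the LM head / logits / softmax: shape (batch, seq_len, hidden_dim).
last_hidden = outputs.hidden_states[-1]

# Select the position(s) where the <POSE> token appears.
pose_mask = inputs["input_ids"] == pose_token_id
pose_embedding = last_hidden[pose_mask]  # (num_pose_tokens, hidden_dim)

# A hypothetical projection head (as in LISA-style pipelines) would then
# map this embedding into the pose-regression space:
# pose_params = projection_mlp(pose_embedding)
```

Note the distinction from the question above: the logits are the output of the LM head and are only used for sampling the next token; the embedding fed to the projection is the hidden state one step earlier.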