Small naming error - masking generation
lilygeorgescu opened this issue · comments
First, congratulations on this solid work!
I have been working with your code for a while (it is well written, thank you), and while looking into the masking strategy I ran into some confusion, because the variable at this line:
https://github.com/facebookresearch/mae/blob/efb2a8062c206524e35e47d04501ed4f544c0ae8/models_mae.py#L140C9-L140C18
x_masked = torch.gather(x, dim=1, index=ids_keep.unsqueeze(-1).repeat(1, 1, D))
is misleading: those are actually the unmasked tokens (they are the ones forwarded through the encoder, and their number matches the number of kept tokens).
Am I the only one who thinks so? Or did I misunderstand something?
Thanks,
Lili
Here, x_masked means "what's left of the image x after masking out some fraction of tokens." I can see why it's a bit confusing!
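To make the naming question concrete, here is a minimal, self-contained sketch of the random-masking pattern being discussed (toy shapes and seed are assumptions, not the repo's values). It shows that the `torch.gather` line produces the *kept* tokens, even though the variable is called `x_masked`:

```python
import torch

torch.manual_seed(0)
N, L, D = 2, 8, 4                      # batch, sequence length, embed dim (toy values)
mask_ratio = 0.75
len_keep = int(L * (1 - mask_ratio))   # number of tokens that survive masking

x = torch.randn(N, L, D)
noise = torch.rand(N, L)                   # one random score per token
ids_shuffle = torch.argsort(noise, dim=1)  # ascending: smallest noise = kept
ids_keep = ids_shuffle[:, :len_keep]       # indices of the tokens we keep

# The line under discussion: despite its name, x_masked holds the KEPT
# (unmasked) tokens -- the subset of x that is fed to the encoder.
x_masked = torch.gather(x, dim=1, index=ids_keep.unsqueeze(-1).repeat(1, 1, D))

print(x_masked.shape)  # (N, len_keep, D): only the unmasked tokens remain
```

With `mask_ratio = 0.75` and `L = 8`, only 2 of the 8 tokens per sample remain, so the encoder sees a much shorter sequence. Reading the name as "x after masking" (i.e. what is left) rather than "the masked-out tokens" resolves the ambiguity.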