[todo] BERT loss on MSA
lucidrains opened this issue
Phil Wang commented
needs to be added
Phil Wang commented
need to make the masking noise match the paper (contiguous spans?)
Phil Wang commented
used at inference time.. 🤔
Siqi commented
> used at inference time.. 🤔
Yep, I also noticed this part, which seems really weird to me...