Further distillation papers to consider
begab opened this issue
Thanks for the great repo! These additional papers on masked latent semantic modeling (in which pre-training is achieved by recovering latent semantic information extracted from a teacher model) might fit the scope of the survey as well:
Great work! Thanks! We have added them to the repo :)