Further distillation papers to consider
begab opened this issue
Thanks for the great repo! These additional papers on masked latent semantic modeling (in which pre-training is achieved by recovering latent semantic information extracted from a teacher model) might fit the scope of the survey as well:
Great work! Thanks! We have added them to the repo :)