These are knowledge distillation papers; a minimal sketch of the vanilla KD objective follows the list.
- DaFKD: Domain-aware Federated Knowledge Distillation
- Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
- DisWOT: Student Architecture Search for Distillation WithOut Training
- Generic-to-Specific Distillation of Masked Autoencoders
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
- Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation
- Lion: Adversarial Distillation of Closed-Source Large Language Model
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
- On Distillation of Guided Diffusion Models
- VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale
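
For reference, several of these papers (e.g. VanillaKD) build on the classic soft-target distillation loss of Hinton et al.: the student matches temperature-softened teacher logits via KL divergence, blended with the hard-label cross-entropy. Below is a minimal PyTorch sketch; the temperature `T` and mixing weight `alpha` are illustrative placeholders, not values taken from any listed paper.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Vanilla knowledge distillation loss (Hinton et al. style).

    Combines a KL term between temperature-softened teacher and student
    distributions with the standard hard-label cross-entropy.
    T and alpha are illustrative hyperparameters.
    """
    # Soft-target term: KL(teacher || student) at temperature T,
    # rescaled by T^2 so gradient magnitudes stay comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits for a batch of 8 over 10 classes.
student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = kd_loss(student, teacher, labels)
loss.backward()
```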