These are knowledge distillation papers; a minimal sketch of the vanilla KD objective follows the list.
- DaFKD: Domain-aware Federated Knowledge Distillation
- Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
- DisWOT: Student Architecture Search for Distillation WithOut Training
- Generic-to-Specific Distillation of Masked Autoencoders
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
- Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation
- Lion: Adversarial Distillation of Closed-Source Large Language Model
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
- On Distillation of Guided Diffusion Models
- VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale
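
For reference, several of these papers (e.g. VanillaKD) build on the classic soft-target distillation loss of Hinton et al.: the student matches temperature-softened teacher logits via KL divergence, blended with the hard-label cross-entropy. Below is a minimal PyTorch sketch; the temperature `T` and mixing weight `alpha` are illustrative placeholders, not values taken from any listed paper.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Vanilla knowledge distillation loss (Hinton et al. style).

    Combines a KL term between temperature-softened teacher and student
    distributions with the standard hard-label cross-entropy.
    T and alpha are illustrative hyperparameters.
    """
    # Soft-target term: KL(teacher || student) at temperature T,
    # rescaled by T^2 so gradient magnitudes stay comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits for a batch of 8 over 10 classes.
student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = kd_loss(student, teacher, labels)
loss.backward()
```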