This is the accompanying code for the paper "Zero-shot Knowledge Transfer via Adversarial Belief Matching" (see arXiv: https://arxiv.org/abs/1905.09768).
Our task is to compress a large neural network (the teacher) into a smaller one (the student), assuming that the data used to train the teacher is no longer available. We therefore generate pseudo data points adversarially (yellow markers above) and use them to match the student (right) to the teacher (left).
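
As a rough sketch of the objective, assuming the KL-divergence formulation used in the paper (the symbols $T$, $S$, $G$ and $z$ are notation introduced here, not names from this code): a generator $G$ maps noise $z$ to pseudo points $x_p = G(z)$, and the two networks are updated in alternation on

$$
\mathcal{L}(x_p) = D_{\mathrm{KL}}\big(T(x_p)\,\|\,S(x_p)\big), \qquad x_p = G(z),
$$

where the generator takes gradient steps that increase $\mathcal{L}$ (searching for inputs on which teacher and student disagree) and the student takes steps that decrease it. The student loss in the paper may contain additional terms, so treat this as an illustration rather than the exact training objective.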

Requirements:

- Python 3.6
- PyTorch 1.0.0 (both CPU and GPU versions tested)
- tensorboard 1.7.0 (for logging; also requires tensorflow)

To train a zero-shot student:

- Pretrain a teacher for the dataset/architecture you want, or download some of mine here
- Make sure you have the same folder structure as in the link above, i.e. Pretrained/{dataset}/{architecture}/last.pth.tar
- Edit the paths in scripts/ZeroShot/main0.sh and run it

To compute transition curves:

- Pretrain a zero-shot student or a student with knowledge distillation plus attention transfer (KD+AT)
- Edit the paths in scripts/TransitionCurves/transition_curves0.sh and run it
- This saves a .pickle file containing all the transition curves
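
The steps above depend on two file conventions: the checkpoint layout Pretrained/{dataset}/{architecture}/last.pth.tar and the .pickle output of the transition curves script. The following is a minimal sketch for checking the first and reading the second before or after running the shell scripts. The helper names `checkpoint_path`, `missing_checkpoints` and `load_transition_curves` are hypothetical and are not part of this repository, and the structure of the pickled object is not documented here, so the loader simply returns whatever was saved.

```python
import pickle
from pathlib import Path


def checkpoint_path(root, dataset, architecture):
    """Build <root>/Pretrained/{dataset}/{architecture}/last.pth.tar."""
    return Path(root) / "Pretrained" / dataset / architecture / "last.pth.tar"


def missing_checkpoints(root, pairs):
    """Return the expected checkpoint paths that are not present on disk.

    `pairs` is an iterable of (dataset, architecture) tuples.
    """
    expected = [checkpoint_path(root, d, a) for d, a in pairs]
    return [p for p in expected if not p.is_file()]


def load_transition_curves(pickle_path):
    """Load the object saved by the transition curves script.

    Assumption: the file is a standard pickle. Its internal layout is
    unknown here, so inspect the returned object before relying on it.
    Only unpickle files you produced yourself or otherwise trust.
    """
    with open(pickle_path, "rb") as f:
        return pickle.load(f)
```

For example, `missing_checkpoints(".", [("my_dataset", "my_architecture")])` returns an empty list once the corresponding teacher checkpoint is in place; the dataset and architecture strings here are placeholders and must match the folder names you actually use.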

If you build on this method or use this code, please consider citing:
@article{Micaelli2019ZeroShotKT,
  author = {Paul Micaelli and Amos Storkey},
  title  = {Zero-shot Knowledge Transfer via Adversarial Belief Matching},
  url    = {https://arxiv.org/abs/1905.09768}
}