This repository contains the code to replicate the results in the paper: "GULP: a prediction-based metric between representations".
The experiments study distances between network representations on MNIST, ImageNet, CIFAR, and pretrained BERT language models. For all experiments, the distances between network representations are presaved in this repository. However, users who wish to recompute these distances, or to access the network representations themselves (which are memory intensive), will need to load/train the corresponding network architectures and recompute the distances between them.
- Distances between fully-connected ReLU networks of varying widths and depths are saved in `distances/widthdepth/`.
- To recompute these distances, first clear the folder `distances/widthdepth/` and load the MNIST dataset into the `data/MNIST/` placeholder folder. Then run the slurm script `fit_loop.sh`, which trains fully-connected ReLU networks of varying width and depth on MNIST and saves these architectures to `models/widthdepth/`. Finally, run the slurm script `dist_loop.sh` to compute all pairwise distances between the final-layer representations of these networks; the distances are saved in `distances/widthdepth/`.
- All visualizations of MNIST networks in Figures 5 and 12 of the paper can be reproduced in the notebook `width_depth_embedding.ipynb`.
- The relationships between distances on MNIST models in Figure 9 can be reproduced in the notebook `Compare_other_distances_to_GULP.ipynb`.
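For orientation, the pairwise distance computed by the scripts above is the plug-in GULP estimate. A minimal NumPy sketch is below; it assumes two representations of the same `n` inputs and writes the regularized inverse as `(Σ + λI)^{-1}`. The function name `gulp_distance_sq` and the default `lam` are ours, not the repository's — see the distance scripts for the actual implementation.

```python
import numpy as np

def gulp_distance_sq(X, Y, lam=1e-2):
    """Plug-in estimate of the squared GULP distance between two
    representations X (n, p) and Y (n, q) of the same n inputs."""
    n = X.shape[0]
    # Center each representation across the sample.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Empirical covariances and cross-covariance.
    Sx = X.T @ X / n
    Sy = Y.T @ Y / n
    Sxy = X.T @ Y / n
    # Regularized inverses (Σ + λI)^{-1}.
    Rx = np.linalg.inv(Sx + lam * np.eye(Sx.shape[0]))
    Ry = np.linalg.inv(Sy + lam * np.eye(Sy.shape[0]))
    A = Rx @ Sx  # ridge "prediction operator" for X
    B = Ry @ Sy  # ridge "prediction operator" for Y
    C = Rx @ Sxy @ Ry @ Sxy.T
    return np.trace(A @ A) + np.trace(B @ B) - 2.0 * np.trace(C)
```

Identical representations give a distance of zero, and the estimate is nonnegative up to floating-point error.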
- Distances between pretrained and untrained state-of-the-art ImageNet networks taken from https://pytorch.org/vision/stable/models.html#classification are saved in `distances/train/pretrained/` and `distances/train/untrained/`, respectively.
- To recompute these distances, first clear the folders `distances/train/pretrained` and `distances/train/untrained`. Load all untrained and pretrained PyTorch ImageNet models by running `load_models.py`. Then load the ImageNet dataset into a local folder and paste its path into the file `compute_reps.py`. Run the slurm script `rep_loop.sh`, which saves the final-layer representations of all pretrained and untrained ImageNet networks loaded from PyTorch. Finally, run the slurm script `dist_loop.sh` to compute all pairwise distances between these final-layer representations; the distances are saved in `distances/train/pretrained` and `distances/train/untrained`, respectively.
- All visualizations of ImageNet networks in Figures 1, 6, 14, 15, 16, 17, and 18 of the paper can be reproduced in the notebook `embed_models.ipynb`.
- The relationships between distances on ImageNet models in Figures 2 and 8 can be reproduced in the notebook `Compare_other_distances_to_GULP.ipynb`.
- The convergence of the plug-in estimator in Figures 3 and 10 can be reproduced in the notebook `Convergence_of_the_plug_in_estimator.ipynb`.
- How GULP captures the generalization performance of linear predictors in Figure 4 (random labels) can be reproduced in the notebook `GULP_versus_linear_predictor_generalization.ipynb`. This requires loading the ImageNet representations (see above).
- How GULP captures the generalization performance of linear predictors in Figure 11 (age labels on the UTKFace dataset) can be reproduced in the notebook `GULP_versus_linear_predictor_generalization_age_dataset.ipynb`. This requires loading the UTKFace dataset (see instructions in the notebook) and the ImageNet models (see above).
- The GULP distance versus the generalization performance of logistic predictors in Figure 24 can be reproduced in the notebook `GULP_versus_logistic_predictor_generalization.ipynb`. This requires loading the ImageNet representations (see above).
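The linear-predictor experiments above compare ridge predictors fit on two different representations of the same inputs. The comparison can be sketched as follows; this is a self-contained NumPy sketch on synthetic data, and all names, shapes, and the regularization value are illustrative rather than taken from the notebooks.

```python
import numpy as np

def ridge_predictions(X_train, y_train, X_test, lam=1e-1):
    """Closed-form ridge regression: beta = (X'X/n + lam*I)^{-1} X'y/n."""
    n, d = X_train.shape
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    yc = y_train - y_train.mean()
    beta = np.linalg.solve(Xc.T @ Xc / n + lam * np.eye(d), Xc.T @ yc / n)
    return (X_test - mu) @ beta + y_train.mean()

# Two hypothetical representations of the same 200 inputs.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 10))                                       # shared inputs
X = Z @ rng.normal(size=(10, 8))                                     # representation A
Y = Z @ rng.normal(size=(10, 8)) + 0.1 * rng.normal(size=(200, 8))   # representation B
y = Z[:, 0] + 0.05 * rng.normal(size=200)                            # labels

pred_X = ridge_predictions(X[:150], y[:150], X[150:])
pred_Y = ridge_predictions(Y[:150], y[:150], Y[150:])
# Representations that are close induce similar ridge predictions;
# the mean squared prediction gap is the quantity GULP controls.
gap = np.mean((pred_X - pred_Y) ** 2)
```

The notebooks carry out the same kind of comparison with the saved ImageNet representations in place of the synthetic `X` and `Y`.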
- The GULP distance between networks during training in Figures 7 and 19 of the paper can be reproduced in the notebook `plot_CIFAR_distances_during_training.ipynb`. This notebook uses distances between ResNet18 networks trained on CIFAR10, computed at each of the 50 training epochs; these are stored in the subdirectory `distances_during_training`. We do not include code to retrain the networks, but refer to the FFCV package and/or any standard training code for training ResNet18 architectures on CIFAR10 and saving the final representations. We used precisely the training code for CIFAR ResNets available at https://github.com/MadryLab/failure-directions. Given representations saved in a hierarchy such as `cifar1/test/epoch3/latents.pkl`, where `1` is the index of the model, `3` is the epoch, `test` is the split of the data on which the representations were generated, and `latents.pkl` contains a dictionary with the key `'last'`, the script `compute_CIFAR_distances_from_saved_representations.py` recomputes the provided distances.
- The CKA, Procrustes, and GULP distances between fully-connected networks of varying widths and depths in Figure 13 of the paper can be reproduced in the notebook `width_depth_embedding.ipynb`. This uses the computed distance data in the subdirectory `distances`. If you wish to recompute these distances, first clear the folder `distances/10000`. Then train the fully-connected networks by running the `train_fc_nets.ipynb` notebook (this will download CIFAR10 into the `data` subdirectory). Then run `Load_fc_nets_on_cifar_representations.ipynb` to compute the CIFAR10 representations and save them in the `reps` subdirectory. Now compute the pairwise distances between representations by running the slurm script `dist_loop.sh`. These will be saved in `distances/10000`.
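The on-disk layout expected for the saved CIFAR representations can be sketched as a small loader. The path pattern and the `'last'` key follow the description above; the function name and arguments are ours, for illustration only.

```python
import pickle
from pathlib import Path

def load_last_layer(root, model_idx, split, epoch):
    """Load the final-layer representations saved at
    <root>/cifar<model_idx>/<split>/epoch<epoch>/latents.pkl,
    a pickled dictionary with the key 'last'."""
    path = Path(root) / f"cifar{model_idx}" / split / f"epoch{epoch}" / "latents.pkl"
    with open(path, "rb") as f:
        latents = pickle.load(f)
    return latents["last"]
```

For example, `load_last_layer(".", 1, "test", 3)` would read the `cifar1/test/epoch3/latents.pkl` file mentioned above.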
- Distances between representations at intermediate layers of 10 fully-trained BERT base NLP models, each with a different random initialization, are saved in `full_df_self_computed.csv`. We obtained these pretrained BERT models from the code of Ding et al. (https://github.com/js-d/sim_metric) and followed the steps in their README under "Adding a new metric" to compute our GULP distance between intermediate layers of these networks.
- The GULP distance between intermediate layers of 10 pretrained BERT networks in Figure 20 can be reproduced in the notebook `layer_embedding.ipynb`.
All experiments in the appendix corresponding to Figures 21-23 were generated using the code of Ding et al. (https://github.com/js-d/sim_metric). In addition to the existing metrics in that codebase (CKA, Procrustes, etc.), we added our GULP distance following the steps in their README under "Adding a new metric".