thinh-re / mae

Train MAE on Kaggle with 2 GPUs (T4 x2), logging to Wandb


Masked Autoencoders (MAE): Train on Kaggle

This repository is an example of training MAE on Kaggle.

Environments

  • Kaggle: GPU T4 x2 enabled
  • Python: 3.10
  • Datasets: mae-v1 (8,025 images) or ImageNet1K (1,281,167 images); a loading sketch follows this list
  • Logging: wandb.ai
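
For reference, here is a minimal sketch (not taken from the notebook) of how one of these datasets could be loaded inside a Kaggle session. The /kaggle/input mount path, the folder name, and the ImageFolder-style layout are assumptions; adjust them to the dataset you attach.

# Minimal sketch, assuming the dataset is attached to the Kaggle notebook and
# arranged in an ImageFolder-compatible layout (class subfolders of images).
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),  # MAE-style augmentation
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical mount point; Kaggle exposes attached datasets under /kaggle/input
data_dir = "/kaggle/input/mae-v1"
dataset = datasets.ImageFolder(data_dir, transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True, num_workers=2)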

How to run

  • Import the notebook mae-kaggle.ipynb into Kaggle.
  • Choose one of these datasets: mae-v1 (8,025 images) or ImageNet1K (1,281,167 images).
  • Use your own Wandb API token (see the login sketch after this list).
  • Update the hyperparameters or leave them at their defaults.
  • Run all cells and enjoy!
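
The sketch below shows one way to wire up the Wandb token and hyperparameters on Kaggle; it is not copied from mae-kaggle.ipynb. The secret name "wandb_api_key", the project name, and the config values are assumptions (the batch size per GPU matches the observation in Key Takeaways; the mask ratio and base learning rate follow the MAE paper defaults).

# Minimal sketch, assuming the Wandb token is stored via Kaggle's
# "Add-ons > Secrets" menu under the (hypothetical) name "wandb_api_key".
import wandb
from kaggle_secrets import UserSecretsClient  # Kaggle's secret storage client

wandb_key = UserSecretsClient().get_secret("wandb_api_key")
wandb.login(key=wandb_key)

run = wandb.init(
    project="mae-kaggle",              # hypothetical project name
    config={
        "model": "mae_vit_base_patch16",
        "batch_size_per_gpu": 130,     # see Key Takeaways below
        "mask_ratio": 0.75,            # MAE paper default
        "base_lr": 1.5e-4,             # MAE paper default
    },
)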

Key Takeaways

  • I do not guarantee reproduction of the results reported in the paper; pretraining on the full ImageNet1K takes too long on this setup.
  • When training the base model (mae_vit_base_patch16), the batch size can be set as high as 130 per GPU (260 in total across the two T4s); a sketch of this setup follows.
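
A minimal sketch of that two-GPU setup, assuming the upstream MAE code (models_mae.py from the original repository) is importable. It uses torch.nn.DataParallel, so a global batch of 260 images is split as 130 per T4; the actual notebook may use a different parallelism strategy.

# Minimal sketch of one training step on 2 GPUs with 130 images per GPU.
import torch
import models_mae  # from the MAE codebase; defines mae_vit_base_patch16

model = models_mae.mae_vit_base_patch16()
model = torch.nn.DataParallel(model).cuda()     # replicate across both T4 GPUs

optimizer = torch.optim.AdamW(model.parameters(), lr=1.5e-4, weight_decay=0.05)

images = torch.randn(260, 3, 224, 224).cuda()   # 2 GPUs x 130 images each
loss, pred, mask = model(images, mask_ratio=0.75)
loss = loss.mean()                              # DataParallel returns one loss per GPU
loss.backward()
optimizer.step()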

Acknowledgement

This repository is inspired by MAE (original paper: Masked Autoencoders Are Scalable Vision Learners):

@inproceedings{he2022masked,
  title={Masked autoencoders are scalable vision learners},
  author={He, Kaiming and Chen, Xinlei and Xie, Saining and Li, Yanghao and Doll{\'a}r, Piotr and Girshick, Ross},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={16000--16009},
  year={2022}
}
