osiriszjq / impulse_init

Convolutional Initialization for Data-Efficient Vision Transformers

License: MIT

Jianqiao Zheng, Xueqian Li, Simon Lucey
The University of Adelaide

This is the official implementation of the paper "Convolutional Initialization for Data-Efficient Vision Transformers". It includes modified versions of ConvMixer and Simple ViT, with experiments on CIFAR-10, CIFAR-100, SVHN and Tiny ImageNet. The code is based on vision-transformers-cifar10.

Illustration of different methods to extend 1D encoding

Google Colab

To try out our new initialization for ViT, check this Colab for a quick tour.
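
For readers who want some intuition before opening the notebook, here is a minimal sketch of how a convolution with an impulse filter (a kernel with a single 1 at one offset) can be written as an attention-like matrix over a grid of patches. It is an illustration only and is not taken from this repository: the function names `impulse_attention` and `apply_attention`, the zero-padding at the borders, and the assumption that this is the kind of structure "convolutional initialization" aims for are all hypothetical.

```python
def impulse_attention(height, width, dy, dx):
    """Build an (N x N) matrix, N = height * width, for a grid of patches.

    Row i is one-hot at the patch offset by (dy, dx) from patch i.
    Rows whose target falls outside the grid stay all-zero (zero padding).
    Hypothetical helper for illustration, not the repository's API.
    """
    n = height * width
    attn = [[0.0] * n for _ in range(n)]
    for y in range(height):
        for x in range(width):
            ty, tx = y + dy, x + dx
            if 0 <= ty < height and 0 <= tx < width:
                attn[y * width + x][ty * width + tx] = 1.0
    return attn


def apply_attention(attn, values):
    """Multiply the attention matrix by a flattened list of patch values."""
    return [sum(w * v for w, v in zip(row, values)) for row in attn]


# A 3x3 grid of patches, flattened row by row: 0..8.
patches = [float(i) for i in range(9)]

# Offset (0, 1): every patch reads from its right-hand neighbour,
# which is what convolving with an impulse kernel at that offset does.
shifted = apply_attention(impulse_attention(3, 3, 0, 1), patches)
```

With offset (0, 0) the matrix is the identity, so each patch attends only to itself; other offsets shift the whole grid by that amount.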

Usage

First edit convmixer.sh or vit_pex.sh to set the data path and choose which experiments to run, then run

bash convmixer.sh

or

bash vit_pex.sh

Citation

@article{zheng2024convolutional,
  title={Convolutional Initialization for Data-Efficient Vision Transformers},
  author={Zheng, Jianqiao and Li, Xueqian and Lucey, Simon},
  journal={arXiv preprint arXiv:2401.12511},
  year={2024}
}

Languages

Jupyter Notebook 86.4%, Python 13.2%, Shell 0.4%