yun-liu / HAT-Net

Vision Transformers with Hierarchical Attention

Installation

This repository exactly follows the code and training settings of PVT, so environment setup, data preparation, and training commands follow the PVT instructions.
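
After setting up a PVT-style environment, a quick sanity check is to build a model and run a forward pass on a random input. The following is a minimal sketch, assuming the repository registers its variants with timm in the same way PVT does; the module name `hat_net` and the model name `'hat_net_small'` are illustrative assumptions, not names taken from this README.

```python
# Minimal smoke test. Assumptions (not confirmed by this README): the repo
# registers its variants with timm the way PVT does, the model file imports
# as `hat_net`, and the small variant is named 'hat_net_small'.
import torch
import timm

import hat_net  # hypothetical module; importing it registers the HAT-Net variants

model = timm.create_model('hat_net_small', pretrained=False, num_classes=1000)
model.eval()

x = torch.randn(1, 3, 224, 224)  # ImageNet-sized input, matching the table below
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # expected: torch.Size([1, 1000])
```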

Image classification on the ImageNet-1K dataset

| Methods | Input Size | #Params | #FLOPs | Acc@1 (%) | Pretrained Models |
| --- | --- | --- | --- | --- | --- |
| HAT-Net-Tiny | 224 x 224 | 12.7M | 2.0G | 79.8 | Google / Github |
| HAT-Net-Small | 224 x 224 | 25.7M | 4.3G | 82.6 | Google / Github |
| HAT-Net-Medium | 224 x 224 | 42.9M | 8.3G | 84.0 | Google / Github |
| HAT-Net-Large | 224 x 224 | 63.1M | 11.5G | 84.2 | Google / Github |
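
A released checkpoint can then be loaded for single-image inference. The sketch below is hedged in the same way: the checkpoint filename, the state-dict layout, and the evaluation resize are assumptions rather than documented details, while the 224 x 224 crop and ImageNet normalization match standard practice for the models in the table above.

```python
# Hedged single-image inference sketch. The checkpoint filename and its
# state-dict layout are assumptions; some releases wrap weights under a
# 'model' key, so we unwrap defensively before loading.
import torch
import timm
from PIL import Image
from torchvision import transforms

import hat_net  # hypothetical module, as in the Installation sketch

preprocess = transforms.Compose([
    transforms.Resize(248),      # assumed eval resize (crop ratio ~0.9)
    transforms.CenterCrop(224),  # matches the 224 x 224 input size in the table
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

model = timm.create_model('hat_net_small', pretrained=False)
checkpoint = torch.load('hat_net_small.pth', map_location='cpu')  # hypothetical filename
state_dict = checkpoint.get('model', checkpoint)  # unwrap if nested
model.load_state_dict(state_dict)
model.eval()

img = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0)
with torch.no_grad():
    pred = model(img).argmax(dim=1)
print(pred.item())  # predicted ImageNet-1K class index
```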

Citation

If you use the code/models provided here in a publication, please consider citing:

```
@article{liu2024vision,
  title={Vision Transformers with Hierarchical Attention},
  author={Liu, Yun and Wu, Yu-Huan and Sun, Guolei and Zhang, Le and Chhatkuli, Ajad and Van Gool, Luc},
  journal={Machine Intelligence Research},
  year={2024}
}

@article{liu2021transformer,
  title={Transformer in Convolutional Neural Networks},
  author={Liu, Yun and Sun, Guolei and Qiu, Yu and Zhang, Le and Chhatkuli, Ajad and Van Gool, Luc},
  journal={arXiv preprint arXiv:2106.03180},
  year={2021}
}
```
