# Awesome ViT Quantization and Acceleration
Dive into the cutting edge with this curated list of papers on Vision Transformer (ViT) quantization and hardware acceleration, featured in top-tier AI conferences and journals. The collection is organized around, and draws upon insights from, our comprehensive survey:
[Arxiv] **Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey**
## Activation Quantization Optimization

| Date | Title | Paper | Code |
|------|-------|-------|------|
| 2021.11 | "PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization" | [ECCV'22] | [code] |
| 2021.11 | "FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer" | [IJCAI'22] | [code] |
| 2022.12 | "RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers" | [ICCV'23] | [code] |
| 2023.03 | "Towards Accurate Post-Training Quantization for Vision Transformer" | [MM'22] | - |
| 2023.05 | "TSPTQ-ViT: Two-scaled post-training quantization for vision transformer" | [ICASSP'23] | - |
| 2023.11 | "I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization" | [Arxiv] | [code] |
| 2024.01 | "MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer" | [Arxiv] | - |
| 2024.01 | "LRP-QViT: Mixed-Precision Vision Transformer Quantization via Layer-wise Relevance Propagation" | [Arxiv] | - |
| 2024.02 | "RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization" | [Arxiv] | - |
| 2024.04 | "Instance-Aware Group Quantization for Vision Transformers" | [Arxiv] | - |
| 2024.05 | "P^2-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer" | [Arxiv] | [code] |
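Most of the methods above refine variants of the same basic primitive: uniform fake-quantization of an activation tensor. As a rough, framework-agnostic sketch (my own illustration with a naive min/max range estimate, not any listed paper's method):

```python
import numpy as np

def fake_quant_uniform(x: np.ndarray, n_bits: int = 8) -> np.ndarray:
    """Asymmetric uniform fake-quantization: map x to n_bits integers
    via a min/max-derived scale and zero-point, then dequantize.
    Illustrative only -- the papers above replace this naive range
    estimate with smarter (e.g. twin-scale or reparameterized) schemes."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = np.round(qmin - x.min() / scale)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return ((q - zero_point) * scale).astype(x.dtype)

rng = np.random.default_rng(0)
act = rng.standard_normal((4, 16)).astype(np.float32)  # stand-in activations
act_q = fake_quant_uniform(act)
```

The per-element error is bounded by roughly half a quantization step; the twin-uniform and scale-reparameterization papers exist precisely because this single-scale scheme handles the skewed post-Softmax/post-GELU distributions in ViTs poorly.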
## Calibration Optimization For PTQ

| Date | Title | Paper | Code |
|------|-------|-------|------|
| 2021.06 | "Post-Training Quantization for Vision Transformer" | [NIPS'21] | [code] |
| 2021.11 | "PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization" | [ECCV'22] | [code] |
| 2022.03 | "Patch Similarity Aware Data-Free Quantization for Vision Transformers" | [ECCV'22] | [code] |
| 2022.09 | "PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers" | [TNNLS'23] | [code] |
| 2022.11 | "NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers" | [CVPR'23] | - |
| 2023.03 | "Towards Accurate Post-Training Quantization for Vision Transformer" | [MM'22] | - |
| 2023.05 | "Finding Optimal Numerical Format for Sub-8-Bit Post-Training Quantization of Vision Transformers" | [ICASSP'23] | - |
| 2023.08 | "Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers" | [ICCV'23] | [code] |
| 2023.10 | "LLM-FP4: 4-Bit Floating-Point Quantized Transformers" | [EMNLP'23] | [code] |
| 2024.05 | "P^2-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer" | [Arxiv] | [code] |
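What these calibration methods mainly differ in is the objective used to pick quantization parameters from a small calibration set. A minimal stand-in for such an objective (a plain MSE grid search over clipping ranges; the papers above use more sophisticated Hessian-, ranking-, or similarity-based criteria) might look like:

```python
import numpy as np

def calibrate_symmetric_scale(x: np.ndarray, n_bits: int = 8,
                              n_candidates: int = 80) -> float:
    """Grid-search a symmetric quantization scale that minimizes MSE on
    calibration data. Simplified illustration, not any specific paper."""
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = float(np.abs(x).max())
    best_scale, best_err = max_abs / qmax, np.inf
    for frac in np.linspace(0.2, 1.0, n_candidates):
        scale = frac * max_abs / qmax            # candidate clipping range
        q = np.clip(np.round(x / scale), -qmax - 1, qmax)
        err = float(np.mean((x - q * scale) ** 2))
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale

rng = np.random.default_rng(0)
calib = rng.standard_normal((32, 64))            # stand-in calibration batch
scale = calibrate_symmetric_scale(calib)
```

Clipping slightly inside the full range usually beats the naive max-abs scale, because a few outliers otherwise inflate the step size for the whole tensor.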
## Gradient-based Optimization For QAT

| Date | Title | Paper | Code |
|------|-------|-------|------|
| 2022.01 | "TerViT: An Efficient Ternary Vision Transformer" | [Arxiv] | - |
| 2022.10 | "Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer" | [NIPS'22] | [code] |
| 2022.12 | "Quantformer: Learning Extremely Low-Precision Vision Transformers" | [TPAMI'22] | - |
| 2023.02 | "Oscillation-free Quantization for Low-bit Vision Transformers" | [PMLR'23] | [code] |
| 2023.05 | "Boost Vision Transformer with GPU-Friendly Sparsity and Quantization" | [CVPR'23] | - |
| 2023.06 | "Bit-Shrinking: Limiting Instantaneous Sharpness for Improving Post-Training Quantization" | [CVPR'23] | - |
| 2023.07 | "Variation-aware Vision Transformer Quantization" | [Arxiv] | [code] |
| 2023.12 | "PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile" | [NIPS'23] | - |
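Common to these QAT methods is training through a non-differentiable quantizer, typically with a straight-through estimator (STE) that treats rounding as the identity in the backward pass. A tiny self-contained sketch on a toy linear-regression problem (illustrating only the STE mechanics, not any paper's training recipe):

```python
import numpy as np

def fake_quant(w: np.ndarray, scale: float, n_bits: int = 8) -> np.ndarray:
    """Symmetric weight fake-quantization (round-to-nearest, then clip)."""
    qmax = 2 ** (n_bits - 1) - 1
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 8))
w_true = rng.standard_normal(8)
y = x @ w_true                          # synthetic regression target
w, scale, lr = np.zeros(8), 0.05, 0.1

loss0 = float(np.mean((x @ fake_quant(w, scale) - y) ** 2))
for _ in range(200):
    err = x @ fake_quant(w, scale) - y  # forward pass uses quantized weights
    grad = x.T @ err / len(x)           # STE: d(fake_quant)/dw treated as 1
    w -= lr * grad
loss = float(np.mean((x @ fake_quant(w, scale) - y) ** 2))
```

Because the latent weights hover near quantization-bin boundaries, plain STE training can oscillate, which is exactly the failure mode the oscillation-free quantization paper above targets.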
## Binarization

| Date | Title | Paper | Code |
|------|-------|-------|------|
| 2022.11 | "BiViT: Extremely Compressed Binary Vision Transformer" | [ICCV'23] | - |
| 2023.05 | "BinaryViT: Towards Efficient and Accurate Binary Vision Transformers" | [Arxiv] | - |
| 2023.06 | "BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models" | [CVPR'23] | [code] |
| 2024.05 | "BinaryFormer: A Hierarchical-Adaptive Binary Vision Transformer (ViT) for Efficient Computing" | [TII] | - |
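The binary ViTs above push weights to a single bit; the shared primitive is sign binarization with an L1-derived scaling factor alpha = mean(|w|) per output channel (the classic XNOR-Net rule, which minimizes the L2 reconstruction error of alpha * sign(w)). A minimal sketch, not a full binary ViT:

```python
import numpy as np

def binarize_rows(w: np.ndarray) -> np.ndarray:
    """Binarize each row of a weight matrix to {-alpha, +alpha}, where
    alpha = mean(|w|) minimizes ||w - alpha * sign(w)||_2 per row."""
    alpha = np.abs(w).mean(axis=1, keepdims=True)
    return alpha * np.sign(w)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 64))        # stand-in weight matrix
w_bin = binarize_rows(w)
```

At inference the sign pattern is stored as bits and the per-row alpha is applied once after an XNOR/popcount dot product, which is where the compute savings come from.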
## Non-linear Operations Acceleration

| Date | Title | Paper | Code |
|------|-------|-------|------|
| 2021.11 | "FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer" | [IJCAI'22] | [code] |
| 2022.07 | "I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference" | [ICCV'23] | [code] |
| 2023.06 | "Practical Edge Kernels for Integer-Only Vision Transformers Under Post-training Quantization" | [MLSYS'23] | - |
| 2023.10 | "SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference" | [ICCAD'23] | - |
| 2023.12 | "PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile" | [NIPS'23] | - |
| 2024.05 | "P^2-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer" | [Arxiv] | [code] |
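A recurring trick in this line of work is replacing floating-point non-linearities (Softmax, GELU, LayerNorm) with integer-only primitives. For instance, the float sqrt in LayerNorm's denominator can be computed with an integer Newton iteration, in the spirit of integer-only inference kernels (a generic sketch, not any listed paper's exact kernel):

```python
def isqrt(n: int) -> int:
    """Floor integer square root via Newton's method -- an integer-only
    stand-in for the sqrt inside LayerNorm's variance normalization."""
    if n < 2:
        return n
    x = n
    y = (x + 1) // 2
    while y < x:            # converges monotonically to floor(sqrt(n))
        x = y
        y = (x + n // x) // 2
    return x

var_int = 123456            # hypothetical integer variance accumulator
std_int = isqrt(var_int)    # downstream rescaling can then use bit shifts
```

The iteration needs only integer adds, divides, and compares, so it maps onto edge hardware that lacks a floating-point unit.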
## Hardware Accelerator Design

| Date | Title | Paper | Code |
|------|-------|-------|------|
| 2022.01 | "VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer" | [Arxiv] | - |
| 2022.08 | "Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization" | [FPL'22] | - |
| 2023.10 | "An Integer-Only and Group-Vector Systolic Accelerator for Efficiently Mapping Vision Transformer on Edge" | [TCAS-I'23] | - |
| 2023.10 | "SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference" | [ICCAD'23] | - |
| 2024.05 | "P^2-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer" | [Arxiv] | [code] |
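On the hardware side, several of these designs (notably the power-of-two quantization line) constrain scales to powers of two so that rescaling becomes a bit shift instead of a multiplier. A rough sketch of the idea (my own simplification, not a specific accelerator's datapath):

```python
import numpy as np

def quantize_po2(x: np.ndarray, n_bits: int = 8):
    """Symmetric quantization with the scale rounded UP to the next
    power of two (so nothing clips); dequantization is then an
    arithmetic shift by |k| instead of a floating-point multiply."""
    qmax = 2 ** (n_bits - 1) - 1
    ideal = np.abs(x).max() / qmax
    k = int(np.ceil(np.log2(ideal)))      # scale = 2**k >= ideal
    q = np.round(x / 2.0 ** k).astype(np.int32)
    return q, k

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
q, k = quantize_po2(x)
x_hat = q.astype(np.float64) * 2.0 ** k   # in hardware: shift, not multiply
```

Rounding the scale up wastes at most half the integer range but guarantees no clipping; trading that slack against precision is exactly the kind of choice these co-design papers optimize.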
If you find our survey useful or relevant to your research, please cite our paper:

```bibtex
@misc{du2024model,
      title={Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey},
      author={Dayou Du and Gu Gong and Xiaowen Chu},
      year={2024},
      eprint={2405.00314},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```