Repositories under the vit topic:
pix2tex: Using a ViT to convert images of equations into LaTeX code.
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
[ICCV 2021] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
A paper list of some recent Transformer-based CV works.
A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, ResNet features.
SimpleAICV: PyTorch training and testing examples.
Open-source evaluation toolkit for large vision-language models (LVLMs); supports GPT-4V, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks.
FFCS course registration made hassle-free for VITians. Search courses and visualize the timetable on the go!
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
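The title states the core ViT idea: split an image into fixed-size (e.g. 16x16) patches and treat each flattened patch as a token, the way words are treated in NLP. A minimal numpy sketch of that patchification step (function name and shapes are illustrative, not from any listed repo):

```python
import numpy as np

def image_to_patches(img, patch=16):
    """Split an image of shape (H, W, C) into non-overlapping patch tokens.

    Returns an array of shape (num_patches, patch * patch * C): the flat
    "words" that a ViT then linearly projects into patch embeddings.
    """
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    # Carve the grid, then bring the two grid axes to the front and flatten
    patches = img.reshape(gh, patch, gw, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * c)
    return patches

img = np.random.rand(224, 224, 3)          # standard ImageNet-sized input
tokens = image_to_patches(img)
print(tokens.shape)                        # (196, 768): 14*14 patches of 16*16*3 values
```

For a 224x224 RGB image this yields the 196 tokens of dimension 768 used by the original ViT-Base configuration; a learned linear projection and position embeddings are applied afterwards.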
PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as fundamental vision models such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.
Official Code of Paper "Reversible Column Networks" "RevColv2"
A practical application of Transformers (ViT) to 2-D physiological signal (EEG) classification tasks; can also be tried with EMG, EOG, ECG, etc. Includes attention over the spatial dimension (channel attention) and the temporal dimension. Common spatial pattern (CSP), an efficient feature-enhancement method, is implemented in Python.
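For context on the CSP step mentioned above: two-class CSP finds spatial filters that maximize the variance of one class while minimizing the other, via a generalized eigendecomposition of the class covariance matrices. A minimal numpy-only sketch (not the repo's actual code; the function name and interface are illustrative):

```python
import numpy as np

def csp_filters(X1, X2, n_components=4):
    """Common Spatial Patterns for two-class EEG trials.

    X1, X2: arrays of shape (trials, channels, samples), one per class.
    Returns spatial filters of shape (channels, n_components) whose
    projections have extreme class-1 vs class-2 variance ratios.
    """
    def avg_cov(X):
        # Average per-trial channel covariance over all trials of a class
        return np.mean([np.cov(trial) for trial in X], axis=0)

    C1, C2 = avg_cov(X1), avg_cov(X2)
    # Whiten the composite covariance C1 + C2 ...
    d, U = np.linalg.eigh(C1 + C2)
    P = U * (d ** -0.5) @ U.T          # symmetric whitening matrix
    # ... then diagonalize class 1 in the whitened space
    vals, vecs = np.linalg.eigh(P @ C1 @ P.T)
    W = P.T @ vecs                     # spatial filters as columns
    # Keep filters from both ends of the eigenvalue spectrum
    half = n_components // 2
    idx = np.r_[np.arange(half),
                np.arange(W.shape[1] - (n_components - half), W.shape[1])]
    return W[:, idx]
```

Projecting each trial through these filters and taking log-variance of the filtered signals gives the compact CSP feature vector typically fed to a classifier.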
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision.
Reproduction of semantic segmentation using a masked autoencoder (MAE).
Paddle large-scale classification tools; supports ArcFace, CosFace, PartialFC, and data parallel + model parallel training. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.
Mimix: A Text Generation Tool and Pretrained Chinese Models
[MedIA Journal] An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
A ViT based transformer applied on multi-channel time-series EEG data for motor imagery classification
Vision Transformer using TensorFlow 2.0
An unofficial implementation of ViTPose [Y. Xu et al., 2022]
This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog post.
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
Training ImageNet / CIFAR models with SOTA strategies and techniques such as ViT, KD, Rep, etc.