Shashank Krishna Vempati's repositories
ShashankKrishnaV
Welcome to my profile.
COL-783-Digital-Image-Processing-2023
All assignments along with reports
awesome-vlm-architectures
Famous Vision Language Models and Their Architectures
Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
indic-gen-bench
IndicGenBench is a high-quality, multilingual, multi-way parallel benchmark for evaluating Large Language Models (LLMs) on 4 user-facing generation tasks across a diverse set of 29 Indic languages covering 13 scripts and 4 language families.
Book-Understanding-Deep-Learning
Understanding Deep Learning - Simon J.D. Prince
open_clip
An open source implementation of CLIP.
All-Language-OCRs
Model checkpoints are uploaded here
Group-Chat-Video-and-Audio-call
This application integrates three main features: group chat, video calling, and audio calling. Go through the documentation before executing the files.
TEXTRON
Data Programming for Text Detection in Documents using SPEAR
hiertext
The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word-, line-, and paragraph-level annotations.
urdu-synth
High-quality synthetic text data generation for Urdu Text Recognition
PlotNeuralNet
LaTeX code for making neural network diagrams
UTRNet-High-Resolution-Urdu-Text-Recognition
UTRNet: High Resolution Multi-scale Feature Maps For Accurate Recognition Of Printed Urdu Text (ICDAR'23)
vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Scene-Text-Recognition-Recommendations
Papers, Datasets, Algorithms, SOTA for STR. Maintained long-term
ml-papers
My collection of machine learning papers
Awesome-Transformer-Attention
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
OCR-V4-IIITH
Indian Language OCR
New_York_CitiBike-Tableau-challenge
New York CitiBike Tableau
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
HLExt-via-IS-LineDetection
Line Extraction in Handwritten Documents via Instance Segmentation
EasyOCR-Reference
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.