So Uchida's starred repositories
VizWiz2024-VQA-AnswerTherapy
[2024VizWiz] Vision-Language Model-based PolyFormer for Recognizing Visual Questions with Multiple Answer Groundings
content-debiased-fvd
[CVPR 2024] On the Content Bias in Fréchet Video Distance
segment-caption-anything
[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks and a Gradio demo that show how to use the model.
ESTextSpotter
(ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
efficient-kan
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
extension-cpp
C++ extensions in PyTorch
awesome-document-understanding
A curated list of resources on Document Understanding (DU)
SwinTextSpotter
PyTorch re-implementation of the paper "SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition" (CVPR 2022)
Bridging-Text-Spotting
(CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting.
torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
manga-image-translator
Translate manga/image (one-click translation of text in all kinds of images) https://cotrans.touhou.ai/
DPText-DETR
[AAAI'23 Oral] DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer
schedule_free
Schedule-Free Optimization in PyTorch
tao_pytorch_backend
TAO Toolkit deep learning networks with PyTorch backend
doc3D-dataset
A hybrid dataset for document unwarping (Paper: https://www3.cs.stonybrook.edu/~cvl/projects/dewarpnet/storage/paper.pdf)