Xinyu Wang's repositories
OCRDatasets
A collection of OCR-related datasets
LVLM-Playground
[ICLR2025] Are Large Vision Language Models Good Game Players?
AdelaiDet
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
EvalAI-Starters
How to create a challenge on EvalAI?
latr
Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answering (STVQA)
minimal-mistakes
📐 Jekyll theme for building a personal site, blog, project documentation, or portfolio.
mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
Multi-Modal-ML
This repo collects Multi-modal Machine Learning papers.
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
project-tools
Useful tools for building AI/ML projects