Nguyễn Hoàng Long's starred repositories
GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
build-nanogpt
Video+code lecture on building nanoGPT from scratch
segment-caption-anything
[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks and a Gradio demo that show how to use the model.
torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
EEG-Conformer
EEG Transformer 2.0. i. Convolutional Transformer for EEG Decoding. ii. Novel visualization - Class Activation Topography.
lightning-asr
Modular and extensible speech recognition library leveraging pytorch-lightning and hydra.
conformer_ocr
Transformer OCR is an Optical Character Recognition toolkit built for researchers working on OCR for both Vietnamese and English. This project focuses only on variants of the vanilla Transformer (Conformer) and on feature extraction (CNN-based approach).
x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
speechbrain
A PyTorch-based Speech Toolkit
chat-with-mlx
An all-in-one LLMs Chat UI for Apple Silicon Mac using MLX Framework.
clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
screenshot-to-code
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
SwinTextSpotter
PyTorch re-implementation of the paper "SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition" (CVPR 2022)