cuongngm

Cuong Ng's starred repositories

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | License: AGPL-3.0 | 133033 1038 7441

gpt-engineer

Specify what you want it to build, the AI asks for clarification, and then builds it.

Language: Python | License: MIT | 50972 501 459

DocsGPT

GPT-powered chat for documentation, chat with your documents

Language: Python | License: MIT | 14314 87 346

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | 11135 48 123

nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents

Language: Python | License: MIT | 8283 69 190

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python | License: Apache-2.0 | 7190 98 1415

Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

Language: JavaScript | License: Apache-2.0 | 5682 82 162

donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Language: Python | License: MIT | 5435 45 288

OutfitAnyone

Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person

5183 209 51

CompreFace

Leading free and open-source face recognition system

Language: Java | License: Apache-2.0 | 4683 78 278

img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Language: Python | License: MIT | 3356 30 249

anylabeling

Effortless AI-assisted data labeling with AI support from YOLO, Segment Anything, MobileSAM!!

Language: Python | License: GPL-3.0 | 1935 20 119

Caption-Anything

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

Language: Python | License: BSD-3-Clause | 1617 15 21

AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

Language: C++ | License: Apache-2.0 | 1062 26 138

salt

Segment Anything Labelling Tool

Language: Python | License: MIT | 993 9 36

parseq

Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)

Language: Python | License: Apache-2.0 | 511 13 133

OCR-SAM

Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting

Language: Python | 470 5 22