王鹤男's repositories
table_structure_recognition
Table detection (TD) and table structure recognition (TSR) using YOLOv5/YOLOv8, achieving results on par with (or even better than) Table Transformer (TATR) with smaller models.
CodeFormer
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
Easy-Wav2Lip
Colab for making Wav2Lip high quality and easy to use
EPLB
Expert Parallelism Load Balancer
facefusion
Next generation face swapper and enhancer
flash-attention
Fast and memory-efficient exact attention
FlashMLA
FlashMLA: Efficient MLA decoding kernels
GPT-SoVITS
Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
iTransformer
Official implementation for "iTransformer: Inverted Transformers Are Effective for Time Series Forecasting".
MiniCPM-o
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
OpenCastKit
The open-source solutions of FourCastNet and GraphCast
pangu-pytorch
Weather forecasting at 1/3/6/24-hour horizons
table-transformer
Model training and evaluation code for our dataset PubTables-1M, developed to support the task of table extraction from unstructured documents.
torchrec
PyTorch domain library for recommendation systems
TriplaneGaussian
TriplaneGaussian: A new hybrid representation for single-view 3D reconstruction.
whisper.cpp
Port of OpenAI's Whisper model in C/C++
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection