leung's repositories
3DDFA
The PyTorch improved version of TPAMI 2017 paper: Face Alignment in Full Pose Range: A 3D Total Solution.
bark
🔊 Text-Prompted Generative Audio Model
cat-sam
The official implementation of "CAT-SAM: Conditional Tuning Network for Few-Shot Adaptation of Segmentation Anything Model".
CogVLM
a state-of-the-art-level open visual language model | multimodal pre-trained model
donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
EagleEye
(ECCV'2020 Oral) EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning
ECCV20-STDN
Source code for ECCV 2020 paper: On Disentangling Spoof Trace for Generic Face Anti-Spoofing
EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
face-alignment
🔥 2D and 3D face alignment library built using PyTorch
face_toolbox_keras
A collection of deep learning frameworks ported to Keras for face analysis.
mediapipe
MediaPipe is the simplest way for researchers and developers to build world-class ML solutions and applications for mobile, edge, cloud and the web.
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
MobileVLM
Strong and Open Vision Language Assistant for Mobile Devices
open_clip
An open source implementation of CLIP.
unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities