Cong Liang's starred repositories
Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
so-vits-svc
SoftVC VITS Singing Voice Conversion
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
T2I-Adapter
Lightweight adapters that add extra conditioning (e.g. sketch, depth, pose) to pretrained text-to-image diffusion models.
sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, WebSocket server/client, and C/C++, Python, Kotlin, C#, Go, Node.js, Java, Swift, Dart, JavaScript, and Flutter.
Deep3DFaceRecon_pytorch
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019). A PyTorch implementation.
emoca
Official repository accompanying the CVPR 2022 paper "EMOCA: Emotion Driven Monocular Face Capture and Animation". EMOCA takes a single image of a face as input and produces a 3D reconstruction, setting a new standard for reconstructing highly emotional in-the-wild images.
Awesome-Talking-Head-Synthesis
💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩
syncnet_python
Out of time: automated lip sync in the wild
VAD-python
Voice Activity Detector in Python
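As a generic illustration of the simplest approach to voice activity detection (energy thresholding per frame; this is a hypothetical sketch, not VAD-python's actual algorithm or API):

```python
# Minimal energy-based voice activity detector (stdlib only).
# A frame is labeled "speech" when its RMS energy exceeds a multiple
# of the quietest frame, which serves as a rough noise-floor estimate.
import math

def frame_rms(samples):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def energy_vad(signal, frame_len=160, threshold=4.0):
    """Return one boolean per frame: True where energy suggests speech."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    energies = [frame_rms(f) for f in frames]
    noise_floor = min(energies) + 1e-9  # quietest frame approximates noise
    return [e / noise_floor > threshold for e in energies]

# Synthetic check: quiet noise-like signal followed by a loud tone.
silence = [0.001 * math.sin(0.1 * n) for n in range(800)]
tone = [0.5 * math.sin(0.2 * n) for n in range(800)]
decisions = energy_vad(silence + tone)
```

Real VADs (including WebRTC's and neural ones like pyannote's) use spectral features and smoothing rather than a single energy ratio, but the frame/threshold structure above is the common skeleton.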
awesome-faceReenactment
papers about Face Reenactment/Talking Face Generation
EmoTalk_release
This is the official source for our ICCV 2023 paper "EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation"
DiffGesture
[CVPR 2023] Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
sugar-wifi-conf
A BLE service on Raspberry Pi for Wi-Fi configuration and wireless control. Use a WeChat mini program to set up the Raspberry Pi's Wi-Fi connection and control it from anywhere.
DiffSpeaker
This is the official repository for DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer
Face_Landmark_Link
Creates Live Link app blendshape data, formatted as CSV, from video for facial motion capture.
AvatarWebKit
Web-first SDK that provides real-time ARKit-compatible 52 blend shapes from a camera feed, video or image at 60 FPS using ML models.