Thomas Chambon's starred repositories
Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
video2dataset
Easily create large video datasets from video URLs
HalfedgeCatmullClark
Supplemental source code for "A Halfedge Refinement Rule for Catmull Clark Subdivision"
PhotoMaker
PhotoMaker
GPTQ-triton
GPTQ inference Triton kernel
improved_edm
Implementation of "Analyzing and Improving the Training Dynamics of Diffusion Models"
llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama3 for WhatsApp & Messenger.
T2I-Adapter-for-Diffusers
Transfer T2I-Adapter to any base model in diffusers🔥
ControlNet-for-Diffusers
Transfer ControlNet to any base model in diffusers🔥
Lora-for-Diffusers
The most easy-to-understand tutorial for using LoRA (Low-Rank Adaptation) within the diffusers framework, for AI generation researchers🔥
RVC-Studio
The best-looking and most functional web UI for RVC-related tasks. See website for UI demo:
wav2lip-hq-updated-ESRGAN
Updated fork of wav2lip-hq allowing for the use of current ESRGAN models
SadTalker-Video-Lip-Sync
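This project implements Wav2lip video lip synthesis based on SadTalkers. It generates lip shapes driven by speech, taking a video file as input, and applies a configurable enhancement method to the facial region to enhance the synthesized lip (face) area, improving the clarity of the generated lips. It uses the DAIN deep-learning frame interpolation algorithm to add frames to the generated video, filling in the motion transitions of the synthesized lips between frames so that they look smoother, more realistic, and more natural.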