Thomas Chambon's starred repositories

Depth-Anything-V2

Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 2537 | Issues: 0

hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Language: Python | License: MIT | Stargazers: 7177 | Issues: 0

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 14 | Issues: 0

Bend

A massively parallel, high-level programming language

Language: Rust | License: Apache-2.0 | Stargazers: 16826 | Issues: 0

FastUI

Build better UIs faster.

Language: Python | License: MIT | Stargazers: 7863 | Issues: 0

video2dataset

Easily create large video datasets from video URLs

Language: Python | License: MIT | Stargazers: 505 | Issues: 0

MoVQGAN

MoVQGAN: a model for image encoding and reconstruction

Language: Jupyter Notebook | Stargazers: 110 | Issues: 0

dj_fft

Header-only FFT library

Language: C++ | License: NOASSERTION | Stargazers: 166 | Issues: 0

HalfedgeCatmullClark

Supplemental source code for "A Halfedge Refinement Rule for Catmull Clark Subdivision"

Language: C | License: NOASSERTION | Stargazers: 30 | Issues: 0

Language: Python | Stargazers: 114 | Issues: 0

PhotoMaker

PhotoMaker

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 8681 | Issues: 0

DiffiT

[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation

Stargazers: 406 | Issues: 0

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | Stargazers: 27344 | Issues: 0

TinyGPT-V

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Language: Python | License: BSD-3-Clause | Stargazers: 1223 | Issues: 0

AnyText

Official implementation code of the paper "AnyText: Multilingual Visual Text Generation And Editing"

Language: Python | License: Apache-2.0 | Stargazers: 4052 | Issues: 0

GPTQ-triton

GPTQ inference Triton kernel

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 270 | Issues: 0

Skeleton

Skeleton: A Dead Simple, Responsive Boilerplate for Mobile-Friendly Development

Language: CSS | License: MIT | Stargazers: 19043 | Issues: 0

improved_edm

Implementation of "Analyzing and Improving the Training Dynamics of Diffusion Models"

Language: Python | License: MIT | Stargazers: 85 | Issues: 0

llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps to showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | Stargazers: 10520 | Issues: 0

minSDXL

Hugging Face-compatible SDXL UNet implementation that is readily hackable

Language: Jupyter Notebook | Stargazers: 366 | Issues: 0

T2I-Adapter-for-Diffusers

Transfer the T2I-Adapter to any base model in diffusers 🔥

License: MIT | Stargazers: 131 | Issues: 0

ControlNet-for-Diffusers

Transfer the ControlNet to any base model in diffusers 🔥

Language: Python | License: MIT | Stargazers: 784 | Issues: 0

Lora-for-Diffusers

The most easy-to-understand tutorial for using LoRA (Low-Rank Adaptation) within the diffusers framework, for AI generation researchers 🔥

Language: Python | License: MIT | Stargazers: 736 | Issues: 0

exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Language: Python | License: MIT | Stargazers: 3266 | Issues: 0

piper

A fast, local neural text-to-speech system

Language: C++ | License: MIT | Stargazers: 5188 | Issues: 0

RVC-Studio

The best-looking and most functional web UI for RVC-related tasks. See website for UI demo:

Language: Python | License: MIT | Stargazers: 164 | Issues: 0

mlc-llm

Universal LLM Deployment Engine with ML Compilation

Language: Python | License: Apache-2.0 | Stargazers: 17822 | Issues: 0

wav2lip-hq-updated-ESRGAN

Updated fork of wav2lip-hq allowing for the use of current ESRGAN models

Language: Python | Stargazers: 46 | Issues: 0

SadTalker-Video-Lip-Sync

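This project implements Wav2lip-style video lip synthesis based on SadTalkers. It takes a video file as input and generates audio-driven lip shapes, then applies a configurable enhancement to the facial region to sharpen the synthesized lip (face) area and improve its clarity. It uses the DAIN deep-learning frame interpolation algorithm to fill in frames of the generated video, smoothing the motion transitions between synthesized lip frames so that the result looks more fluid, realistic, and natural.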

Language: Python | Stargazers: 1739 | Issues: 0

SadTalker

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Language: Python | License: NOASSERTION | Stargazers: 11250 | Issues: 0