fseasy

surreal

Shenzhen

https://blog.fseasy.top

Wei Xu's starred repositories

latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook · MIT · 11204 96 337

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python · MIT · 11040 163 224

InstantID

InstantID : Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python · Apache-2.0 · 10615 122 207

demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Language: Python · MIT · 7977 150 533

jukebox

Code for the paper "Jukebox: A Generative Model for Music"

Language: Python · NOASSERTION · 7721 302 260

denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Language: Python · MIT · 7648 32 284

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python · Apache-2.0 · 7444 97 1502

point-e

Point cloud diffusion for 3D model synthesis

Language: Python · MIT · 6424 224 85

guided-diffusion

Language: Python · MIT · 5937 143 137

OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Language: Python · NOASSERTION · 5210 74 194

dust3r

DUSt3R: Geometric 3D Vision Made Easy

Language: Python · NOASSERTION · 4780 54 131

audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

Language: Python · NOASSERTION · 2634 30 52

aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Language: Python · AGPL-3.0 · 2451 73 209

Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Language: Python · MIT · 1266 36 703

SyncTalk

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"

Language: Python · NOASSERTION · 1126 63 195

deepl-python

Official Python library for the DeepL language translation API.

Language: Python · MIT · 1076 21 100

VideoGPT

Language: Jupyter Notebook · MIT · 946 23 38

PIA

[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by text prompt, combining with Dreambooth, achieving stunning videos.

Language: Python · Apache-2.0 · 833 19 40

PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

Language: Python · MIT · 327 20 29

espnet_model_zoo

ESPnet Model Zoo

Language: Python · Apache-2.0 · 243 13 29

URLExtract

URLExtract is a Python class for collecting (extracting) URLs from given text based on locating TLDs.

Language: Python · MIT · 239 9 94

LIQE

[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Language: Python · MIT · 164 1 20

uroman

Universal Romanizer that can convert any unicode script to roman (latin) script

Language: Perl · NOASSERTION · 132 12 12

a3t

Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

Language: Python · Apache-2.0 · 83 4 8

DeepForcedAligner

Language: Python · MIT · 75 9 3

NeMo-speech-data-processor

A toolkit for processing speech data and creating speech datasets

Language: Python · Apache-2.0 · 67 7 1

audio-inpainting-diffusion

Language: Jupyter Notebook · MIT · 61 7 2

3aransia

Transliteration for languages and dialects

Language: Python · Apache-2.0 · 40 6 42

CQT_pytorch

Pytorch implementation of the invertible CQT based on Non-stationary Gabor filters

Language: Jupyter Notebook · 27 3 6

TimeStretching

Pytorch implementation of Time Stretching in Music using an Autoencoder Network

Language: Jupyter Notebook · 1701