zhongshijun

happlydata's repositories

SyncTalk

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"

NOASSERTION

SEMamba

This is the official implementation of the SEMamba paper.

GaussianTalker

NOASSERTION

midifile

C++ classes for reading/writing Standard MIDI Files

BSD-2-Clause

MIDI-BERT

This is the official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding.

MIT

Awesome-Talking-Face

📖 A curated list of resources dedicated to talking face.

MIT

DiffSpeaker

This is the official repository for DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer

inferno

🔥🔥🔥 Set the world of 3D faces on fire with INFERNO 🔥🔥🔥

NOASSERTION

voxangeles

VoxAngeles Corpus

BYOC

[IEEE-VR 2024] Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters

NKF_train

NKF training

TCN-beat-tracker-pytorch

PyTorch implementation of "Temporal convolutional networks for musical audio beat tracking"

pretty-midi

Utility functions for handling MIDI data in a nice/intuitive way.

MIT

ai-audio-startups

Community list of startups working with AI in audio and music technology

Apache-2.0

awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation

MIT

FaceDiffuser

NOASSERTION

resemble-enhance

AI-powered speech denoising and enhancement

MIT

real-time-lyrics-alignment

Codebase for 'A Real-Time Lyrics Alignment System Using Chroma And Phonetic Features For Classical Vocal Performance', ICASSP 2024

NOASSERTION

RMVPE

Apache-2.0

pesto

Self-supervised learning for fast pitch estimation

LGPL-3.0

gtcrn

An official implementation of GTCRN, an ultra-lite speech enhancement model.

RUI_SE

The official repo of "A Refining Underlying Information Framework for Speech Enhancement"

deepvqe

An unofficial implementation of DeepVQE proposed by Microsoft Corp.

DJCM

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

MIT

facialanimation

Source code for: Expressive Speech-driven Facial Animation with controllable emotions

Apache-2.0

BEAT

BEAT Huawei 3D dataset

CoMoSVC

CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone

NeuCoSVC

pitch-detection

Autocorrelation-based O(N log N) pitch detection

MIT