chenchy

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Language: Python

FineDance

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation (ICCV 2023)

License: NOASSERTION

goct_ismir2023

Code for "Beat-Aligned Spectrogram-to-Sequence Generation of Rhythm-Game Charts" (ISMIR 2023)

Language: Jupyter Notebook

GPT-SoVITS

One minute of voice data is enough to train a good TTS model (few-shot voice cloning).

Language: Python | License: MIT

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: NOASSERTION

JEN-1-pytorch

Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.04729)

Language: Python

languagecodec

Official code repository of Language-Codec

License: MIT

LLVC

Language: Python | License: MIT

LODGE

The code for the CVPR 2024 paper Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives


M2UGen

This is the official repository for M2UGen

Language: Jupyter Notebook | License: MIT

MahaTTS

License: Apache-2.0

MP-SENet

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

License: MIT

MQTTS

Language: Python | License: MIT

NeuCoSVC

Language: Python

PAM

PAM is a no-reference audio quality metric for audio generation tasks

License: MIT

PhotoMaker

Language: Jupyter Notebook | License: NOASSERTION

pinyin-to-ipa

Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.

License: MIT

rule-guided-music

Language: Python

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: C | License: NOASSERTION

snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

License: MIT

song-describer-dataset

The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings.

Language: Jupyter Notebook | License: MIT

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Python | License: NOASSERTION