Beast code in Giters

saber5433's repositories

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python

The Ultimate Guide to Making Money with AI Side Hustles: a collection of AI side-hustle ideas that shows how to use AI for side projects and earn extra income. Check out the English version for more insights.


Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

License: MIT

audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

License: MIT

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

License: MIT

AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

Language: Python; License: NOASSERTION

Bert-VITS2

vits2 backbone with multilingual-bert

Language: Python; License: AGPL-3.0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

License: NOASSERTION

edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

License: GPL-3.0

ER-NeRF

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

License: MIT

GeneFacePlusPlus

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code

Language: Python

Genshin_Datasets

Genshin Datasets For SVC/SVS/TTS


hifi-gan-bwe

Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al.

License: MIT

libriheavy

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context

License: Apache-2.0

Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

License: MIT

megatts2

Unofficial implementation of Megatts2

License: MIT

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

License: MIT

so-vits-svc-5.0

Core Engine of Singing Voice Conversion & Singing Voice Clone

License: MIT

SpeechTokenizer

Code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on

License: Apache-2.0