
Liujingxiu23's repositories

ai-audio-datasets-list

This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications. It is mainly used for speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, etc.

License: MIT | 1 0 0

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

1 0 0

jukebox-diffusion

Language: Python | License: MIT | 1 0 0

MP-SENet

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

Language: Python | License: MIT | 1 0 0

audio-pipeline

Language: Python | License: Apache-2.0 | 0 0 0

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python | License: MIT | 0 0 0

awesome_LLMs_interview_notes

LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers

License: MIT | 0 0 0

Bert-VITS2

vits2 backbone with multilingual-bert

Language: Python | License: AGPL-3.0 | 0 0 0

DeepMIR

Teaching material for the course "Deep Learning for Music Analysis and Generation" I taught at National Taiwan University (2023 Fall)

License: NOASSERTION | 0 0 0

Diff-BGM

official code for CVPR'24 paper Diff-BGM

0 0 0

diffiner

Language: Python | License: MIT | 0 0 0

HeyGenClone

A simple and open-source analogue of the HeyGen system

Language: Python | 0 0 0

lina-speech

lina-speech : linear attention based text-to-speech

License: NOASSERTION | 0 0 0

lp-music-caps

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

Language: Python | 0 0 0

Make-An-Audio-3

Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers

Language: Python | 0 0 0

NAST-S2x

0 0 0

open-tts-tracker

0 0 0

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | 0 0 0

Qwen-7B

The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 0 0 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Python | License: NOASSERTION | 0 0 0

speech-dataset-generator

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Language: Python | License: MIT | 0 0 0

supervoice-gpt

GPT-style network for phonemization with durations of text

Language: Python | 0 0 0

supervoice-hybrid

My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one

0 0 0

TTS-arxiv-daily

Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Language: Python | License: Apache-2.0 | 0 0 0

tts-generation-webui

TTS Generation Web UI (Bark, MusicGen, Tortoise)

Language: Python | License: MIT | 0 0 0

VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

Language: Python | License: MIT | 0 1 0

vampnet

music generation with masked transformers!

Language: Python | License: MIT | 0 0 0

vits2_pytorch

Unofficial VITS2 TTS implementation in PyTorch

Language: Jupyter Notebook | License: MIT | 0 0 0

voicebox-pytorch

Implementation of Voicebox, a new SOTA text-to-speech network from MetaAI, in PyTorch

Language: Python | License: MIT | 0 0 0

WavJourney

WavJourney: Compositional Audio Creation with LLMs

0 0 0