Liujingxiu23's repositories
ai-audio-datasets-list
This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications. It is mainly used for speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, etc.
awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
AudioSep
Official implementation of "Separate Anything You Describe"
awesome_LLMs_interview_notes
LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers
Bert-VITS2
vits2 backbone with multilingual-bert
DeepMIR
Teaching material for the course "Deep Learning for Music Analysis and Generation" I taught at National Taiwan University (2023 Fall)
Diff-BGM
official code for CVPR'24 paper Diff-BGM
HeyGenClone
A simple and open-source analogue of the HeyGen system
INTERSPEECH-2023-Papers
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
lina-speech
lina-speech: linear-attention-based text-to-speech
lp-music-caps
LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]
Make-An-Audio-3
Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers
parler-tts
Inference and training library for high-quality TTS models.
Qwen-7B
The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud.
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
supervoice-gpt
GPT-style network for phonemization with durations of text
TTS-arxiv-daily
Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
tts-generation-webui
TTS Generation Web UI (Bark, MusicGen, Tortoise)
vampnet
music generation with masked transformers!
vits2_pytorch
unofficial vits2-TTS implementation in pytorch
voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
WavJourney
WavJourney: Compositional Audio Creation with LLMs