Beast code in Giters

ChenWang's starred repositories

Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

49100

audiocaps

🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps

Language: Python | MIT | 12900

vocalsound

Dataset and baseline code for the VocalSound dataset (ICASSP2022).

Language: Jupyter Notebook | 9600

lp-music-caps

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

Language: Python | 25700

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

MIT | 40900

Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Language: Shell | 656400

WavCaps

This repository contains metadata for the WavCaps dataset and code for downstream tasks.

Language: Python | 19000

ChatTTS

A generative speech model for daily dialogue.

Language: Python | AGPL-3.0 | 2837100

NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python | BSD-3-Clause | 311400

emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Language: Python | 52800