syang1993

Shan Yang's starred repositories

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | 1043400

LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

Language: Python | License: Apache-2.0 | 2423100

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | 3524000

Scripts for fine-tuning Meta Llama3 with composable FSDP and PEFT methods, covering single-node and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama3 for WhatsApp and Messenger.

Language: Jupyter Notebook | 1011200

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | 121600

Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Language: Python | License: Apache-2.0 | 145100

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | 1998400

fma

FMA: A Dataset For Music Analysis

Language: Jupyter Notebook | License: MIT | 216300

SpeechGPT

SpeechGPT Series: Speech Large Language Models

Language: Python | License: Apache-2.0 | 98700

Fengshenbang-LM

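Fengshenbang-LM (封神榜大模型) is an open-source system of large models led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.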

Language: Python | License: Apache-2.0 | 393900

MSMC-TTS

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Language: Python | License: MIT | 15700

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | 1038800