whatissimondoing

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language:PythonMIT4492 58 152

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language:PythonApache-2.03799 33 515

V-Express

V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.

Language:Python2199 39 49

OS-Copilot

An self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.

Language:PythonMIT1471 21 29

distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Language:PythonApache-2.01452 13 413

Bunny

A family of lightweight multimodal models.

Language:PythonApache-2.0892 19 117

bm25s

Fast lexical search library implementing BM25 in Python using Numpy and Scipy

Language:PythonMIT792 4 24

EmoLLM

心理健康大模型、LLM、The Big Model of Mental Health、Finetune、InternLM2、InternLM2.5、Qwen、ChatGLM、Baichuan、DeepSeek、Mixtral、LLama3、GLM4、Qwen2、LLama3.1

Language:PythonMIT775 4 42

UMOE-Scaling-Unified-Multimodal-LLMs

The codes about "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"

Language:Python756 11 9

MInference

To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which reduces inference latency by up to 10x for pre-filling on an A100 while maintaining accuracy.

Language:PythonMIT716 6 53

emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Language:Python588 15 41