mingjie chen's starred repositories

LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 33872 | Watchers: 209 | Issues: 5185

RTranslator

Open source real-time translation app for Android that runs locally

Language: C++ | License: Apache-2.0 | Stargazers: 6770 | Watchers: 50 | Issues: 64

CosyVoice

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Language: Python | License: Apache-2.0 | Stargazers: 6119 | Watchers: 58 | Issues: 485

SenseVoice

Multilingual Voice Understanding Model

Language: Python | License: NOASSERTION | Stargazers: 3339 | Watchers: 38 | Issues: 132

mini-omni

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Language: Python | License: MIT | Stargazers: 3063 | Watchers: 97 | Issues: 110

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python | License: NOASSERTION | Stargazers: 2717 | Watchers: 37 | Issues: 137

LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Language: Python | License: Apache-2.0 | Stargazers: 2527 | Watchers: 28 | Issues: 46

MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

Language: Jupyter Notebook | License: AGPL-3.0 | Stargazers: 2523 | Watchers: 34 | Issues: 47

whisper_streaming

Whisper real-time streaming for long speech-to-text transcription and translation

Language: Python | License: MIT | Stargazers: 2037 | Watchers: 37 | Issues: 106
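
The core trick behind this kind of streaming transcription is a LocalAgreement policy: re-run Whisper on a growing audio buffer and commit only the prefix that consecutive hypotheses agree on, since that part is unlikely to be revised as more audio arrives. Below is a minimal, self-contained sketch of that policy; the class and method names are illustrative, not the repo's actual API.

```python
# Minimal sketch of a LocalAgreement-style commit policy (illustrative
# names, not the repo's API): emit only the longest common prefix shared
# by the last two hypotheses.

class LocalAgreementPolicy:
    def __init__(self):
        self.committed = []        # tokens already emitted to the user
        self.prev_hypothesis = []  # full hypothesis from the previous pass

    def update(self, hypothesis):
        """Feed the latest full hypothesis; return newly committed tokens."""
        # Longest common prefix of the last two hypotheses.
        agreed = []
        for a, b in zip(self.prev_hypothesis, hypothesis):
            if a != b:
                break
            agreed.append(a)
        self.prev_hypothesis = hypothesis
        # Emit only the agreed tokens that were not committed before.
        new_tokens = agreed[len(self.committed):]
        self.committed.extend(new_tokens)
        return new_tokens

policy = LocalAgreementPolicy()
policy.update(["the", "cat", "sat"])             # -> [] (no agreement yet)
policy.update(["the", "cat", "sat", "on"])       # -> ["the", "cat", "sat"]
policy.update(["the", "cat", "sat", "on", "a"])  # -> ["on"]
```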

conversational-datasets

Large datasets for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 1294 | Watchers: 74 | Issues: 30

Qwen2-Audio

The official repo of the Qwen2-Audio chat and pretrained large audio-language models proposed by Alibaba Cloud.

mar

PyTorch implementation of MAR+DiffLoss (https://arxiv.org/abs/2406.11838)

Language: Python | License: MIT | Stargazers: 981 | Watchers: 18 | Issues: 68
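
The paper's central idea is to replace the categorical cross-entropy used with discrete image tokens by a "diffusion loss": a small denoising network models each continuous token conditioned on the autoregressive backbone's output. The sketch below illustrates that loss under simplified assumptions (a linear alpha-bar schedule, an MLP denoiser, made-up dimensions); it is not the repository's implementation.

```python
# Hedged sketch of a per-token diffusion loss: noise the continuous token,
# ask a small MLP to predict the noise given the condition z, and train
# with MSE. All dimensions and the schedule are illustrative.
import torch
import torch.nn as nn

class DiffusionLossHead(nn.Module):
    def __init__(self, token_dim=16, cond_dim=256, hidden=512, steps=1000):
        super().__init__()
        self.steps = steps
        # alpha_bar[t]: how much of the clean signal survives at step t.
        self.register_buffer("alpha_bar", torch.linspace(0.9999, 0.01, steps))
        self.net = nn.Sequential(
            nn.Linear(token_dim + cond_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, token_dim),
        )

    def forward(self, x, z):
        """x: (B, token_dim) continuous tokens; z: (B, cond_dim) conditions."""
        t = torch.randint(0, self.steps, (x.shape[0],), device=x.device)
        a = self.alpha_bar[t].unsqueeze(-1)                   # (B, 1)
        noise = torch.randn_like(x)
        x_t = a.sqrt() * x + (1 - a).sqrt() * noise           # noised token
        t_feat = (t.float() / self.steps).unsqueeze(-1)       # timestep feature
        pred = self.net(torch.cat([x_t, z, t_feat], dim=-1))  # predict the noise
        return nn.functional.mse_loss(pred, noise)

head = DiffusionLossHead()
loss = head(torch.randn(8, 16), torch.randn(8, 256))  # scalar training loss
```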

STAR-Adapt

Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"

bc-omni

Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊

MooER

MooER: Moore Threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering, but not limited to, end-to-end speech interaction, end-to-end speech translation, and speech recognition.

Language: Python | License: NOASSERTION | Stargazers: 147 | Watchers: 5 | Issues: 13

EmoBox

[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Parameter-Efficient-MoE

Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks

Language: Python | License: Apache-2.0 | Stargazers: 129 | Watchers: 4 | Issues: 8
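
The title describes turning a dense checkpoint into a Mixture-of-Experts model while adding few trainable parameters. As a generic illustration of that dense-to-MoE upcycling idea (not necessarily the paper's exact construction), the sketch below shares one frozen dense FFN across all experts, gives each expert only a small low-rank adapter, and mixes the top-2 experts per token with a learned router.

```python
# Generic parameter-efficient MoE sketch: frozen shared FFN + per-expert
# low-rank adapters + top-2 routing. Illustrative, not the paper's code.
import torch
import torch.nn as nn

class ParameterEfficientMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=4, rank=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Shared dense FFN, frozen: the pretrained weights stay untouched.
        self.shared = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                    nn.Linear(d_ff, d_model))
        for p in self.shared.parameters():
            p.requires_grad = False
        # Per-expert low-rank adapters: the only new trainable parameters.
        self.down = nn.ModuleList(nn.Linear(d_model, rank, bias=False)
                                  for _ in range(n_experts))
        self.up = nn.ModuleList(nn.Linear(rank, d_model, bias=False)
                                for _ in range(n_experts))
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):                                # x: (B, T, d_model)
        probs = self.router(x).softmax(-1)               # (B, T, n_experts)
        topw, topi = probs.topk(self.top_k, dim=-1)
        # Sparse gate: keep only each token's top-k expert weights.
        gate = torch.zeros_like(probs).scatter(-1, topi, topw)
        out = self.shared(x)
        # For clarity every expert runs on all tokens; real MoE layers
        # dispatch tokens to their selected experts only.
        for e, (down, up) in enumerate(zip(self.down, self.up)):
            out = out + gate[..., e:e+1] * up(down(x))
        return out

layer = ParameterEfficientMoE()
y = layer(torch.randn(2, 10, 512))
```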

AudioLLM

Audio Large Language Models

RSTnet

Real-time Speech-Text Foundation Model Toolkit (wip)

GigaSpeech2

An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement

Language: Python | License: Apache-2.0 | Stargazers: 114 | Watchers: 6 | Issues: 8

SummaryMixing

This repository implements SummaryMixing, a simpler, faster, and much cheaper replacement for self-attention in automatic speech recognition (see https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit.

Language: Python | License: NOASSERTION | Stargazers: 111 | Watchers: 10 | Issues: 3
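
The paper replaces self-attention's O(T²) pairwise scores with a single per-utterance summary vector: each frame gets a local transformation, the summary is the time average of another transformation, and a linear layer combines the two, giving linear-time mixing. A minimal PyTorch sketch of that idea with illustrative layer sizes (this is not the SpeechBrain implementation):

```python
# Minimal SummaryMixing sketch: local per-frame transform + time-averaged
# global summary, concatenated and projected back. Sizes are illustrative.
import torch
import torch.nn as nn

class SummaryMixing(nn.Module):
    def __init__(self, d_model=256, d_hidden=256):
        super().__init__()
        self.local = nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU())
        self.summary = nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU())
        self.combine = nn.Linear(2 * d_hidden, d_model)

    def forward(self, x):                                # x: (B, T, d_model)
        f = self.local(x)                                # per-frame, (B, T, H)
        s = self.summary(x).mean(dim=1, keepdim=True)    # summary, (B, 1, H)
        s = s.expand(-1, x.shape[1], -1)                 # broadcast over time
        return self.combine(torch.cat([f, s], dim=-1))   # O(T), not O(T^2)

layer = SummaryMixing()
y = layer(torch.randn(2, 100, 256))
```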

Emotion-LLaMA

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

Language: Python | License: BSD-3-Clause | Stargazers: 102 | Watchers: 5 | Issues: 19

simul_whisper

Code for our INTERSPEECH paper "Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection"

Dasheng

Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"

Language: Python | License: Apache-2.0 | Stargazers: 43 | Watchers: 3 | Issues: 3
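
Masked audio encoder learning follows the MAE recipe: split the spectrogram into patches, mask most of them, encode only the visible patches, and train a decoder to reconstruct the rest. The snippet below sketches just the random masking step under assumed shapes; it is not Dasheng's code.

```python
# Generic MAE-style masking step for audio: keep a random subset of
# spectrogram patches per example. Shapes and mask ratio are assumptions.
import torch

def random_patch_mask(patches, mask_ratio=0.75):
    """patches: (B, N, D). Returns visible patches and their kept indices."""
    b, n, d = patches.shape
    n_keep = int(n * (1 - mask_ratio))
    # Independent random permutation per example; keep the first n_keep.
    perm = torch.rand(b, n).argsort(dim=1)
    keep = perm[:, :n_keep]                               # (B, n_keep)
    visible = patches.gather(1, keep.unsqueeze(-1).expand(-1, -1, d))
    return visible, keep

spec_patches = torch.randn(4, 128, 768)  # e.g. 128 patches of a mel spectrogram
visible, keep = random_patch_mask(spec_patches)           # visible: (4, 32, 768)
```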

emotional-speech-annotations

This repository contains prompts and best practices for annotating audio clips in very high detail using audio-language models

License: Apache-2.0 | Stargazers: 28 | Watchers: 4 | Issues: 0

speech-to-speech

Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"

Language: Python | Stargazers: 28 | Watchers: 2 | Issues: 0

ConversationalDataset

All benchmarks related to conversations

Language: Jupyter Notebook | Stargazers: 4 | Watchers: 0 | Issues: 0