chenchen's starred repositories

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | 34520 343 2692

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | 14060 696 1641

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | 11382 200 2212

AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python | License: NOASSERTION | 9948 131 48

mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox

Language: Python | License: Apache-2.0 | 4256 58 896

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | 4045 90 1019

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | 3955 48 841

EasyLM

Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.

Language: Python | License: Apache-2.0 | 2353 42 88

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | 1763 21 179

WhisperFusion

WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.

Language: Python | 1495 17 36

YoutubePlaylistDownloader

A tool to download whole playlists, channels, or single videos from YouTube and optionally convert them to almost any format you like.

Language: C# | License: Apache-2.0 | 1440 27 218

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | License: Apache-2.0 | 1413 23 60

SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Language: Python | License: MIT | 1142 26 80

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda | License: Apache-2.0 | 1105 77 377

INTERSPEECH-2023-Papers

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

License: MIT | 616 87 4

ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Language: Python | License: MIT | 341 28 3

zyjcsf

chenchen's starred repositories

DeepSpeed

kaldi

NeMo

AudioGPT

LWM

mmocr

wenet

FunASR

EasyLM

OpenRLHF

WhisperFusion

YoutubePlaylistDownloader

OpenDiT

SpeechT5

k2

INTERSPEECH-2023-Papers

ICASSP-2023-24-Papers

cyrillic-transliteration

DPHuBERT

transfusion-asr

awesome-asr-contextualization

clairaudience

xlm_to_xlsr

Contextual-Biasing-Dataset

dual_cross_modality-AVSR