wzy's starred repositories

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 28195 | Issues: 168 | Issues: 408

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | Stargazers: 27574 | Issues: 209 | Issues: 212

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | Stargazers: 11052 | Issues: 202 | Issues: 2163

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10583 | Issues: 141 | Issues: 338

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 9498 | Issues: 160 | Issues: 617

Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model

Language: C++ | License: MPL-2.0 | Stargazers: 7809 | Issues: 84 | Issues: 218

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7663 | Issues: 89 | Issues: 1627

Language: Python | License: Apache-2.0 | Stargazers: 7029 | Issues: 66 | Issues: 68

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6383 | Issues: 60 | Issues: 78

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | License: MIT | Stargazers: 6178 | Issues: 37 | Issues: 884

agentscope

Start building LLM-empowered multi-agent applications in an easier way.

Language: Python | License: Apache-2.0 | Stargazers: 3738 | Issues: 28 | Issues: 101

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook | License: BSD-2-Clause | Stargazers: 2514 | Issues: 46 | Issues: 154

whisper_real_time

Real time transcription with OpenAI Whisper.

whisper-asr-webservice

OpenAI Whisper ASR Webservice API

Language: Python | License: MIT | Stargazers: 1901 | Issues: 27 | Issues: 149

yarn

YaRN: Efficient Context Window Extension of Large Language Models

Language: Python | License: MIT | Stargazers: 1268 | Issues: 14 | Issues: 55

transcriptionstream

Turnkey self-hosted offline transcription and diarization service with LLM summary

Language: Python | License: GPL-3.0 | Stargazers: 640 | Issues: 7 | Issues: 13

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python | License: Apache-2.0 | Stargazers: 553 | Issues: 9 | Issues: 36

sql-eval

Evaluate the accuracy of LLM generated outputs

Language: Python | License: Apache-2.0 | Stargazers: 477 | Issues: 9 | Issues: 17

SwiftInfer

Efficient AI Inference & Serving

Language: Python | License: Apache-2.0 | Stargazers: 447 | Issues: 5 | Issues: 6

WeTextProcessing

Text Normalization & Inverse Text Normalization

Language: Python | License: Apache-2.0 | Stargazers: 426 | Issues: 11 | Issues: 105

ContextualSP

Open-source code for multiple papers from the Microsoft Research Asia DKI group

Language: Python | License: MIT | Stargazers: 369 | Issues: 16 | Issues: 33

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 354 | Issues: 5 | Issues: 2

ModelCenter

Efficient, Low-Resource, Distributed transformer implementation based on BMTrain

Language: Python | License: Apache-2.0 | Stargazers: 225 | Issues: 7 | Issues: 19

Language: TypeScript | License: Apache-2.0 | Stargazers: 212 | Issues: 6 | Issues: 14

MAC-SQL

MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL

KeSpeech

The repo provides information about the KeSpeech dataset.

tagger_rewriter

An introductory article on dialogue rewriting

pythaiasr

Python Thai Automatic Speech Recognition

Language: Python | License: Apache-2.0 | Stargazers: 59 | Issues: 6 | Issues: 11

keyword-spot

An end-to-end voice wake-up (keyword spotting) toolkit, from model training to model inference.

Language: Python | License: MIT | Stargazers: 52 | Issues: 0 | Issues: 0

pai

A minimalist RPA framework, including Server, Agent, Web, Schedule, DB, and other components

Language: Python | License: MIT | Stargazers: 3 | Issues: 1 | Issues: 0