wzy's starred repositories
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
Megatron-LM
Ongoing research training transformer models at scale
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
agentscope
Start building LLM-empowered multi-agent applications in an easier way.
whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
whisper_real_time
Real-time transcription with OpenAI Whisper.
whisper-asr-webservice
OpenAI Whisper ASR Webservice API
transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
SwiftInfer
Efficient AI Inference & Serving
WeTextProcessing
Text Normalization & Inverse Text Normalization
ContextualSP
Open-source code for multiple papers from the Microsoft Research Asia DKI group
ModelCenter
Efficient, Low-Resource, Distributed transformer implementation based on BMTrain
tagger_rewriter
Introductory article on dialogue rewriting
keyword-spot
End-to-end wake-word (keyword spotting) toolkit, from model training to model inference.