view1234567's starred repositories
LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
DAMO-ConvAI
DAMO-ConvAI: The official repository containing the codebase for Alibaba DAMO Conversational AI.
CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
MambaInLlama
Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
GitHubDaily
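Regularly sharing high-quality, interesting, and practical open-source technical tutorials, developer tools, programming websites, and tech news from GitHub. A list of cool, interesting projects on GitHub.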
VideoLingo
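Netflix-level subtitle segmentation, translation, alignment, and even dubbing: a one-click, fully automated AI subtitle team for reposting videos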
ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
GPT-SoVITS-Inference
Inference Specialization
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
whisper-medusa
Whisper with Medusa heads
llama-cpp-python
Python bindings for llama.cpp
vosk-android-demo
Offline speech recognition for Android with Vosk library.