stt

There are 18 repositories under stt topic.

khoj
khoj-ai / khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
semantic-search emacs obsidian-md chat chatgpt ai llm productivity agent self-hosted rag whatsapp-ai offline-llm llamacpp obsidian llama3 image-generation stt assistant research
Language:Python 30972
alphacep / vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
speech-recognition asr voice-recognition speech-to-text android ios raspberry-pi deep-learning deep-neural-networks speech-to-text-android speaker-identification speaker-verification python offline privacy kaldi deepspeech google-speech-to-text vosk stt
Language:Jupyter Notebook 13196
snakers4 / silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
speech-recognition speech-to-text stt asr pretrained-models english german spanish stt-benchmark pytorch colab onnx torch-hub text-to-speech tts-models speech speech-synthesis tts repunctuation capitalization
Language:Jupyter Notebook 5481
jianchang512 / stt
Voice Recognition to Text Tool / 一个离线运行的本地音视频转字幕工具，输出json、srt字幕、纯文字格式
speech speech-recognition speech-to-text stt
Language:Python 3823
pluja / whishper
Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!
ai audio-to-text golang subtitles sveltekit transcription whisper ui webapp speech-recognition speech-to-text stt web web-whisper
Language:Svelte 2653
coqui-ai / STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
stt speech-to-text tensorflow deep-learning automatic-speech-recognition asr voice-recognition speech-recognition speech-recognizer speech-recognition-api
Language:C++ 2514
pannous / tensorflow-speech-recognition
🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks
tensorflow speech-recognition neural-network deep-learning stt speech-to-text
Language:Python 2173
coqui-ai / open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
tts stt speech-to-text text-to-speech speech-recognition speech-synthesis speech-processing voice-recognition voice-activity-detection voice-cloning speech-emotion-recognition speech-separation
1357
Speech-AI-Forge
lenML / Speech-AI-Forge
🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.
chattts ssml tts chattts-forge agent gpt llm text-to-speech colab llama chinese english fish-speech cosyvoice cosy-voice asr stt firered whisper fireredtts
Language:Python 1342
gp.nvim
Robitx / gp.nvim
Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic, ..]
copilot neovim nvim speech-to-text whisper vim codeium lua voice llm ollama claude gpt4o gpt-4o sonnet gemini mistral perplexity stt parrot
Language:Lua 1264
R3gm / SoniTranslate
Synchronized Translation for Videos. Video dubbing
audio-processing diarization translation translate-audio translate-video video-dubbing asr automatic-dubbing document-translator dubbing speech-to-text stt subtitle-to-speech text-to-speech tts
Language:Python 1232
neural-maze / ava-whatsapp-agent-course
Meet Ava, the WhatsApp Agent
agent agent-based agentic-workflow agents stt tts vector-database
Language:Python 1170
mkiol / dsnote
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
asr sailfishos stt tts flatpak-applications linux-desktop nmt offline translator machine-translation speech-recognition speech-synthesis speech-to-text text-to-speech translation
Language:C++ 1126
snakers4 / open_stt
Open STT
asr automatic-speech-recognition dataset russian speech-to-text stt
Language:Python 802
VRCWizard / TTS-Voice-Wizard
Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
tts speech-to-text speech-recognition vrchat osc discord free voice vtuber chatbox heart-rate spotify stt text-to-speech
Language:C# 712
evancohen / sonus
:speech_balloon: /so.nus/ STT (speech to text) for Node with offline hotword detection
speech speech-recognition speech-to-text voice-control stt node hotword-detection keyword-spotting alexa voice-recognition
Language:JavaScript 639
Picovoice / cheetah
On-device streaming speech-to-text engine powered by deep learning
asr automatic-speech-recognition online-speech-recognition speech-recognition speech-to-text streaming-speech-to-text stt transcription voice-recognition
Language:Python 637
bbc / react-transcript-editor
A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress
bbc-news-labs news-labs transcript transcription transcript-editor stt kaldi react textav
Language:JavaScript 596
lobe-tts
lobehub / lobe-tts
🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser
lobehub tts auzre edge microsoft-speech-api opeanai speech-recognition speech-to-text stt text-to-speech bun nodejs react
Language:TypeScript 587
madroidmaq / mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
function-calling genai mlx openai openai-api structured-output stt tools tts
Language:Python 547
Macoron / whisper.unity
Running speech to text model (whisper.cpp) in Unity3d on your local machine.
asr stt speech-to-text openai speech-recognition whisper unity3d
Language:C# 511
StarmoonAI / Starmoon
A conversational, AI device + software framework for companionship, entertainment, education, healthcare, IoT applications, and DIY robotics. Built with Python, NextJS, Arduino, ESP32, LLMs (GPT-4o), Deepgram STT and Azure TTS 🤖
llm voice-assistant esp32 gpt iot robotics stt tts openai
Language:TypeScript 504
ccoreilly / vosk-browser
A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
kaldi vosk wasm webassembly asr stt speech-recognition speech-to-text typescript
Language:JavaScript 487
make-a-smart-speaker
voice-engine / make-a-smart-speaker
A collection of resources to make a smart speaker
voice-assistant stt tts nlu kws aec beamforming
466
Picovoice / leopard
On-device speech-to-text engine powered by deep learning
asr automatic-speech-recognition on-device speech-recognition speech-to-text stt transcription voice-recognition voice-to-text
Language:Python 458
OpenNewsLabs / autoEdit_2
Fast text based video editing, node Electron Os X desktop app, with Backbone front end.
video-editing dmg edl watson speech-to-text stt gentle gentle-stt osx electron backbone ibm-watson ibm-watson-speech mac speechmatics video-sequences autoedit transcription desktop backbonejs
Language:JavaScript 431
gia-guar / JARVIS-ChatGPT
A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.
ai chat-gpt-3 ibm-watson jarvis-ai openai python tacotron speech-recognition tts chatgpt chatgpt-api elevenlabs pytorch stt
Language:Python 407
deepgram-devs / deepgram-ai-agent-demo
Deepgram Conversational AI demo
asr deepgram nextjs react stt tts vercel
Language:TypeScript 384
Ikaros-521 / RealtimeSTT_LLM_TTS
实时STT，连接OpenAI接口/智谱AI（流式LLM）和GPT-SOVITS/Edge-TTS，通过网页的方式，进行跨网络的服务调用，实现实时对话的效果
llm python stt tts
Language:Python 354
NsLearning / LangHelper
Striving to create a great Application with full functions of learning languages by ChatGPT, TTS, STT and other awesome AI models, supports talking, speaking assessment, memorizing words with contexts, Listening test, so on.
ai assessment chatgpt speech speech-recognition speech-to-text ielts language-learning toefl words asr stt tts
Language:Rust 338
Open-Speech-EkStep / vakyansh-models
Open source speech to text models for Indic Languages
stt speech-recognition speech-recognition-model indic-languages
297
nikdanilov / whisper-obsidian-plugin
Speech-to-text in Obsidian using OpenAI Whisper
obsidian whisper openai-whisper speech-to-text stt transcribe voice
Language:TypeScript 260
algolia / voice-overlay-android
🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI
voice overlay input speech-to-text stt voice-recognition speech-recognition voice-assistant conversation conversational-ui conversational-interface conversational-bots chatbots permission permissions permissions-android instant-search instantsearch search android
Language:Kotlin 258
livekit-examples / kitt
Talk to ChatGPT in real time using LiveKit
gpt tts stt chatgpt webrtc ai openai assistant voice transcription translation
Language:Go 254
deepgram-starters / nextjs-live-transcription
Get started using Deepgram's Live Transcription with this Next.js demo app
deepgram speech-to-text live stt transcription websocket real-time
Language:TypeScript 194
gaborvecsei / whisper-live-transcription
Live-Transcription (STT) with Whisper PoC
ai applied-machine-learning gradio machine-learning python speach-to-text stt whisper
Language:Python 175

stt

khoj-ai / khoj

alphacep / vosk-api

snakers4 / silero-models

jianchang512 / stt

pluja / whishper

coqui-ai / STT

pannous / tensorflow-speech-recognition

coqui-ai / open-speech-corpora

lenML / Speech-AI-Forge

Robitx / gp.nvim

R3gm / SoniTranslate

neural-maze / ava-whatsapp-agent-course

mkiol / dsnote

snakers4 / open_stt

VRCWizard / TTS-Voice-Wizard

evancohen / sonus

Picovoice / cheetah

bbc / react-transcript-editor

lobehub / lobe-tts

madroidmaq / mlx-omni-server

Macoron / whisper.unity

StarmoonAI / Starmoon

ccoreilly / vosk-browser

voice-engine / make-a-smart-speaker

Picovoice / leopard

OpenNewsLabs / autoEdit_2

gia-guar / JARVIS-ChatGPT

deepgram-devs / deepgram-ai-agent-demo

Ikaros-521 / RealtimeSTT_LLM_TTS

NsLearning / LangHelper

Open-Speech-EkStep / vakyansh-models

nikdanilov / whisper-obsidian-plugin

algolia / voice-overlay-android

livekit-examples / kitt

deepgram-starters / nextjs-live-transcription

gaborvecsei / whisper-live-transcription