There are 12 repositories under the stt topic.
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks
Voice Recognition to Text Tool: a local speech-to-text service that runs offline and outputs JSON, SRT subtitles with timestamps, and plain text
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress
Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
Fast text-based video editing: a Node/Electron OS X desktop app with a Backbone front end.
A collection of resources to make a smart speaker
A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.
A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
Run a speech-to-text model (whisper.cpp) in Unity3d on your local machine.
An application aiming to offer full language-learning features using ChatGPT, TTS, STT and other AI models. Supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.
Open source speech to text models for Indic Languages
Deepgram Conversational AI demo
🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI
Talk to ChatGPT in real time using LiveKit
Speech-to-text in Obsidian using OpenAI Whisper
Synchronized Translation for Videos. Video dubbing
RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:
Speech-to-text and keyboard input captions for OBS.
Live-Transcription (STT) with Whisper PoC
Jarvis Home Automation
VietASR - Vietnamese Automatic Speech Recognition
Use Home Assistant Assist on the desktop. Compatible with Windows, macOS, and Linux
An MXNet implementation of Baidu's DeepSpeech architecture