There are 5 repositories under the vad topic.
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
faster_whisper GUI with PySide6
Voice Activity Detection based on Deep Learning & TensorFlow
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
ICASSP 2023 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023 conference. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
A statistical model-based Voice Activity Detection
Python bindings of WebRTC Audio Processing
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021
A TensorFlow re-implementation of CVPR 2019 "Object-centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video"
Extraction of the APM-related code from WebRTC, including AEC/NS/AGC/VAD; also includes MP3/AAC encoders and SoundTouch
A Python library for voice activity detection (VAD) for speech/non-speech segmentation.
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Voice Activity Detection LSTM-RNN learning model
Karaoke Player / Editor with automatic clip creation from any song file using vocals and lyrics extraction (Speech-to-Text)
Spokestack: give your iOS app a voice interface!
PyTorch implementation of automatic speech recognition models.
A voice activity detection (VAD) library for Unity.
This is a FreeSWITCH module that can do VAD and ASR with the iFLYTEK WebSocket API.
HadreamAssistant, your smart home / custom voice assistant, with support for Raspberry Pi and Linux
Voice activity detection (VAD) library, based on WebRTC's VAD engine built to WASM with Emscripten to run in browsers, Node, and NativeScript
silero-vad + whisper.cpp for ROS 2 (speech-to-text for ROS 2)
OpenVoiceOS Voice Satellite