vad

There are 7 repositories under vad topic.

modelscope / FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
conformer pytorch speech-recognition paraformer punctuation speaker-diarization rnnt audio-visual-speech-recognition pretrained-model voice-activity-detection whisper dfsmn vad speechgpt speechllm
Language:Python 12594
smacke / ffsubsync
Automagically synchronize subtitles with video.
subtitles video audio ffmpeg vad fft synchronization sync subtitle captions vlc vlc-media-player srt srt-subtitles voice-activity-detection speech-detection fast-fourier-transform alignment string-alignment caption
Language:Python 7342
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
voice-detection voice-recognition voice-commands pytorch onnx voice-activity-detection voice-control onnx-runtime onnxruntime speech speech-processing vad
Language:Python 6837
CheshireCC / faster-whisper-GUI
faster_whisper GUI with PySide6
faster-whisper openai transcribe vad voice-transcription whisper whisperx asr
Language:Python 2652
k2-fsa / sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A etc.
python speech-recognition cpp asr c csharp go kotlin vad voice-activity-detection
Language:C++ 1487
jtkim-kaist / VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
vad dnn lstm bdnn acam attention speech data voice-detection speech-recognition voice-activity-detection speech-activity-detection
Language:MATLAB 863
amsehili / auditok
An audio/acoustic activity detection and audio segmentation tool
audio-activities audio-data audio-segmentation voice-detection vad voice-activity-detection
Language:Python 803
DmitryRyumin / ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
asr denoising domain-adaptation face-recognition icassp icassp2023 keyword-spotting language-modeling self-supervised-learning semantic-segmentation signal-processing signal-restoration speech-recognition vad generative-models image-generation music-generation spoken-language-understanding multimodal-learning icassp2024
Language:Python 503
shashikg / WhisperS2T
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
asr deep-learning speech-recognition speech-to-text whisper tensorrt-llm tensorrt vad voice-activity-detection
Language:Jupyter Notebook 468
gkonovalov / android-vad
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
vad offline real-time audio-processing gmm webrtc android dnn on-device-ai silero-vad neural-networks speech-detection voice-detection silero deep-neural-networks onnx-models speech-recoginition voice-activity-detector voice-activity-detection yamnet
Language:C 408
gtreshchev / RuntimeAudioImporter
Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.
unreal-engine audio-converter audio-files audio-formats mp3 blueprints audio plugin ue4-plugin audio-player ue4 ue5 ue5-plugin bink unreal-engine-4 unreal-engine-5 mp3-player vad voice-activity-detection unrealengine
Language:C++ 393
filippogiruzzi / voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
voice-activity-detection deep-learning speech tensorflow time-series time-series-classification resnet speech-recognition speech-detection python mfcc-features machine-learning vad deeplearning artificial-intelligence deep-neural-networks librispeech librispeech-dataset
Language:Python 369
Baidu-AIP / speech-vad-demo
Integrates WebRTC's VAD to split audio files
webrtc vad webrtc-vad speech
Language:C 343
EtienneAb3d / WhisperHallu
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
asr audio-processing noise-removal sound-processing text-to-speech vad vocals whisper
Language:Python 319
Picovoice / cobra
On-device voice activity detection (VAD) powered by deep learning
voice-activity-detection speech-recognition vad on-device voice-activity voice-activity-detector
Language:Python 228
eesungkim / Voice_Activity_Detector
A statistical model-based Voice Activity Detection
vad voice-detection voice-activity-detection
Language:Jupyter Notebook 192
xiongyihui / python-webrtc-audio-processing
Python bindings of WebRTC Audio Processing
python vad agc ns webrtc-audio-processing
Language:C++ 188
voithru / voice-activity-detection
PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021
voice-activity-detection vad
Language:Python 157
0vercl0k / sic
Enumerate user mode shared memory mappings on Windows.
driver vad prototype-pte shm ntoskrnl shared-memory windows-kernel windows-10
Language:C 118
xia-chu / webrtc_apm
Extraction of the APM-related code from WebRTC, including AEC/NS/AGC/VAD, plus mp3/aac encoders and SoundTouch
aac aec agc jni mp3 ns soundtouch vad webrtc
Language:C 99
fjchange / object_centric_VAD
A TensorFlow re-implementation of CVPR 2019 "Object-centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video"
vad cvpr2019 anomaly
Language:Python 98
asiff00 / On-Device-Speech-to-Speech-Conversational-AI
This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming architecture for fluid conversations with immediate responses and natural interruption handling.
asr audio-processing conversational-ai kokoro-tts ollama speech-to-speech tts vad voice-assistant
Language:Python 93
NickWilkinson37 / voxseg
A python library for voice activity detection (VAD) for speech/non-speech segmentation.
speech-processing voice-activity-detection speech-segmentation speech vad python-library python
Language:Python 89
mgonzs13 / whisper_ros
Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2
ros2 speech-to-text vad voice-activity-detection whisper-cpp speech-recognition ggml asr automatic-speech-recognition whisper
Language:C++ 73
spokestack / spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
speech-recognition android voice-assistant wakeword asr voice-activity-detection text-to-speech nlu vad speech voice voice-recognition voice-as-an-interface voice-synthesis natural-language-understanding wakeword-activation speech-synthesis speech-api tts
Language:Java 73
EtienneAb3d / karaok-AI
Karaoke Player / Editor with automatic clip creation from any song file using vocals and lyrics extraction (Speech-to-Text)
karaoke mp3-player speech-to-text djing lyrics music party-apps sound-processing srt-subtitles subtitles vad whisper karaoke-maker
Language:Java 70
lef-fan / aria
A local and uncensored AI entity.
ai assistant bot llm python pytorch speech speech-to-text text-to-speech vad deep-learning voice-assistant large-language-models kokoro-tts llamacpp-python localllama tts xttsv2
Language:Python 70
mochi-neko / voice-activity-detection-unity
A voice activity detection (VAD) library for Unity.
unity vad
Language:C# 60
HadreamOrg / HadreamAssistant
HadreamAssistant, your smart-home/custom voice assistant, supporting Raspberry Pi/Linux
snowboy vad bot talk notion python3 ai
Language:C++ 51
mounalab / LSTM-RNN-VAD
Voice Activity Detection LSTM-RNN learning model
tensorflow lstm-neural-network lstm rnn-tensorflow rnn vad nlp-machine-learning
Language:Python 50
bigcash / awesome-vad
A curated list of awesome voice activity detection
sad speech vad awesome list speech-activity-detection voice-activity-detection
48
spokestack / spokestack-ios
Spokestack: give your iOS app a voice interface!
wakeword wakeword-activation asr voice-activity-detection text-to-speech vad natural-language-understanding speech-recognition speech-to-text speech-synthesis speech-processing speech-api hacktoberfest voice-recognition voice-assistant voice-synthesis tensorflow swift ios
Language:Swift 43
shanghaimoon888 / mod_vadasr
This is a FreeSWITCH module that can do VAD and ASR with the IFLYTEK WebSocket API.
freeswitch asr vad freeswitch-esl freeswitch-plugin
Language:C 38
sooftware / End-to-End-Speech-Recognition-Models
PyTorch implementation of automatic speech recognition models.
deepspeech2 asr las vad voice-activity-detection listen-attend-and-spell transformer e2e end-to-end acoustic-model pytorch
Language:Python 38
baabaaox / go-webrtcvad
WebRTC Voice Activity Detection for Golang
cgo go golang vad webrtc webrtcvad
Language:C 29
OzymandiasTheGreat / libfvad-wasm
Voice activity detection (VAD) library, based on WebRTC's VAD engine built to WASM with Emscripten to run in browsers, Node, and NativeScript
vad wasm webrtc
Language:C 29
