VasyaITOne / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector

License: CC BY-NC 4.0


Silero VAD


Silero VAD - pre-trained enterprise-grade Voice Activity Detector (also see our STT models).

This repository also includes Number Detector and Language Classifier models.


Real Time Example
real-time-example.mp4

Key Features


  • High accuracy

    Silero VAD has excellent results on speech detection tasks.

  • Fast

    One audio chunk (30+ ms) takes around 1 ms to process on a single CPU thread. Batching or running on a GPU can improve performance considerably.

  • Lightweight

    JIT model is less than one megabyte in size.

  • General

    Silero VAD was trained on huge corpora covering over 100 languages, and it performs well on audio from different domains with varying background noise and quality levels.

  • Flexible sampling rate

    Silero VAD supports 8000 Hz and 16000 Hz sampling rates.

  • Flexible chunk size

    The model was trained on audio chunks of different lengths. Chunks of 30 ms, 60 ms and 100 ms are supported directly; others may work as well.
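To make the sampling-rate, chunk-size and speed figures above concrete, here is a minimal sketch in plain Python that converts chunk durations into sample counts, splits a signal into fixed-size chunks, and estimates the real-time factor implied by "around 1 ms per 30 ms chunk". The helper names (`samples_per_chunk`, `split_into_chunks`, `real_time_factor`) are hypothetical and not part of this repository, and zero-padding the last chunk is an assumption made for illustration. It does not load or call the model.

```python
# Values taken from the feature list above.
SUPPORTED_SAMPLE_RATES = (8000, 16000)  # Hz
CHUNK_DURATIONS_MS = (30, 60, 100)      # chunk lengths listed as directly supported


def samples_per_chunk(sample_rate: int, chunk_ms: int) -> int:
    """Return the number of samples in a chunk lasting chunk_ms milliseconds."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    return sample_rate * chunk_ms // 1000


def split_into_chunks(samples, sample_rate: int, chunk_ms: int, pad_value: float = 0.0):
    """Split a sequence of samples into equal-length chunks.

    The last chunk is padded with pad_value (an assumption of this sketch)
    so that every chunk has the same length.
    """
    size = samples_per_chunk(sample_rate, chunk_ms)
    chunks = []
    for start in range(0, len(samples), size):
        chunk = list(samples[start:start + size])
        chunk.extend([pad_value] * (size - len(chunk)))
        chunks.append(chunk)
    return chunks


def real_time_factor(chunk_ms: float, processing_ms: float) -> float:
    """How many times faster than real time one chunk is processed."""
    return chunk_ms / processing_ms


if __name__ == "__main__":
    one_second = [0.0] * 16000  # one second of silence at 16000 Hz
    chunks = split_into_chunks(one_second, sample_rate=16000, chunk_ms=30)
    print(len(chunks), len(chunks[0]))   # 34 chunks of 480 samples each
    print(real_time_factor(30, 1.0))     # 30.0
```

Under these assumptions, a 30 ms chunk processed in about 1 ms corresponds to roughly 30 times faster than real time on a single CPU thread.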


Typical Use Cases


  • Voice activity detection for IoT / edge / mobile use cases
  • Data cleaning and preparation, voice detection in general
  • Telephony and call-center automation, voice bots
  • Voice interfaces

Get In Touch


Try our models, create an issue, start a discussion, join our Telegram chat, email us, read our news.

Please see our wiki and tiers for relevant information and email us directly.

Citations

@misc{Silero VAD,
  author = {Silero Team},
  title = {Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/snakers4/silero-vad}},
  commit = {insert_some_commit_here},
  email = {hello@silero.ai}
}

License: MIT License

