Beast code in Giters

holdurhorses's starred repositories

MockingBird

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | NOASSERTION | 34977 309 876

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | MPL-2.0 | 33606 282 1097

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | Apache-2.0 | 8597 133 1080

espnet

End-to-End Speech Processing Toolkit

Language: Python | Apache-2.0 | 8317 182 2351

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | MIT | 4011 50 228

TensorFlowTTS

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Language: Python | Apache-2.0 | 3806 78 684

STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Language: C++ | MPL-2.0 | 2240 62 183

athena

An open-source implementation of a sequence-to-sequence-based speech processing engine

Language: C++ | Apache-2.0 | 949 37 137

wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Language: Python | Apache-2.0 | 677 18 109

chinese_text_normalization

Chinese text normalization for speech processing

Language: Python | MIT | 620 15 13

CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

Language: C++ | MIT | 555 19 68

ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

MIT | 463 13 1

wekws

Production First and Production Ready End-to-End Keyword Spotting Toolkit

Language: Python | Apache-2.0 | 440 17 72

g2pC

g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese

Language: Python | Apache-2.0 | 237 9 9

awesome-keyword-spotting

This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).

MIT | 237 110

prosody

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text

Language: Python | MIT | 229 12 4

Listen-Attend-Spell

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.

Language: Python | 200 6 18

voice-activity-detection

PyTorch implementation of Self-Attentive VAD (ICASSP 2021)

Language: Python | MIT | 148 5 5

torch-mfcc

A librosa-style STFT/Fbank/MFCC feature extraction written in PyTorch using 1D convolutions.

Language: Python | MIT | 72 2 2

Prosody_Prediction

Predict prosody labels for Chinese sentences.

Language: Python | 41 5 3

audio_data_augmentation

Language: Python | 26 30

WFST-decoder-for-phoneme-posterior

Language: Shell | 22 2 1

E2E_ASR_Confidence_Estimation

Implementation of the paper "Confidence estimation for attention based sequence to sequence models for speech recognition"

Language: Python | 15 3 2

Chinese_PSP

Chinese Prosodic Structure Prediction

Language: Python | MIT | 10 1 2

Audioset_multi_label_classification

Language: Python | 3 10

is2021_feature_extractor_v2

Instead of the posterior probability of recognized tokens, we use GOP scores as the tokens' confidence scores.

Language: Python | MIT | 2 10

DeepLearning-500-questions

GPL-3.0 | 200

Attention-Confidence

Attention mechanism for the estimation of confidence scores

Language: Python | MIT | 200

kaldi-hybrid-decoder

In Automatic Speech Recognition (ASR), the decoder is either static (based on a Weighted Finite State Transducer) or dynamic (based on a History Conditioned Word Prefix-Tree/Graph). This project provides a unified approach in Kaldi's framework, extending its decoder to more application scenarios.

100