holdurhorses

holdurhorses's starred repositories

MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

Language: Python | License: NOASSERTION | Stargazers: 34977 | Issues: 309 | Issues: 876

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 33606 | Issues: 282 | Issues: 1097

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0 | Stargazers: 26345 | Issues: 725 | Issues: 0

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8597 | Issues: 133 | Issues: 1080

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8317 | Issues: 182 | Issues: 2351

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | Stargazers: 4011 | Issues: 50 | Issues: 228

TensorFlowTTS

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Language: Python | License: Apache-2.0 | Stargazers: 3806 | Issues: 78 | Issues: 684

STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Language: C++ | License: MPL-2.0 | Stargazers: 2240 | Issues: 62 | Issues: 183

athena

An open-source implementation of a sequence-to-sequence-based speech processing engine

Language: C++ | License: Apache-2.0 | Stargazers: 949 | Issues: 37 | Issues: 137

wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 677 | Issues: 18 | Issues: 109

chinese_text_normalization

Chinese text normalization for speech processing

Language: Python | License: MIT | Stargazers: 620 | Issues: 15 | Issues: 13

CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

Language: C++ | License: MIT | Stargazers: 555 | Issues: 19 | Issues: 68

ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

wekws

Production First and Production Ready End-to-End Keyword Spotting Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 440 | Issues: 17 | Issues: 72

g2pC

g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese

Language: Python | License: Apache-2.0 | Stargazers: 237 | Issues: 9 | Issues: 9

awesome-keyword-spotting

This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).

License: MIT | Stargazers: 237 | Issues: 11 | Issues: 0

prosody

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text

Language: Python | License: MIT | Stargazers: 229 | Issues: 12 | Issues: 4

Listen-Attend-Spell

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.

voice-activity-detection

PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021

Language: Python | License: MIT | Stargazers: 148 | Issues: 5 | Issues: 5

torch-mfcc

A librosa STFT/Fbank/MFCC feature extraction written in PyTorch using 1D convolutions.

Language: Python | License: MIT | Stargazers: 72 | Issues: 2 | Issues: 2

Prosody_Prediction

Predict prosody labels for Chinese sentences.

E2E_ASR_Confidence_Estimation

Implementation of the paper "Confidence estimation for attention based sequence to sequence models for speech recognition"

Chinese_PSP

Chinese Prosodic Structure Prediction

Language: Python | License: MIT | Stargazers: 10 | Issues: 1 | Issues: 2

is2021_feature_extractor_v2

Instead of the posterior probability of recognized tokens, we use GOP scores as the tokens' confidence scores

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0

DeepLearning-500-questions

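500 Questions on Deep Learning: explains frequently asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas in question-and-answer form, to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any errors. To be continued... For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan, 2018.06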

License: GPL-3.0 | Stargazers: 2 | Issues: 0 | Issues: 0

Attention-Confidence

Attention mechanism for the estimation of confidence scores

Language: Python | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

kaldi-hybrid-decoder

In Automatic Speech Recognition (ASR), a decoder is either static (based on a Weighted Finite State Transducer) or dynamic (based on a History Conditioned Word Prefix-Tree/Graph). This project provides a unified approach within Kaldi's framework, extending its decoder to more application scenarios.

Stargazers: 1 | Issues: 0 | Issues: 0