shengzhang0222

shengzhang0222's starred repositories

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python | License: MIT | Stargazers: 16966 | Issues: 0

KAN-TTS

KAN-TTS is a speech-synthesis training framework. Please try the demos posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Language: Python | License: MIT | Stargazers: 474 | Issues: 0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 3990 | Issues: 0

wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 616 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stargazers: 5653 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20365 | Issues: 0

wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 361 | Issues: 0

Language: Python | Stargazers: 130 | Issues: 0

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch

Language: Python | License: MIT | Stargazers: 1246 | Issues: 0

contentvec

speech self-supervised representations

Language: Python | License: MIT | Stargazers: 439 | Issues: 0

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python | License: AGPL-3.0 | Stargazers: 25029 | Issues: 0

voice-activity-detection

PyTorch implementation of Self-Attentive VAD, ICASSP 2021

Language: Python | License: MIT | Stargazers: 145 | Issues: 0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 5317 | Issues: 0

WeTextProcessing

Text Normalization & Inverse Text Normalization

Language: Python | License: Apache-2.0 | Stargazers: 437 | Issues: 0

ECAPA-TDNN

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)

Language: Python | License: MIT | Stargazers: 566 | Issues: 0

Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive benchmarking platform for Automatic Speech Recognition.

Language: Python | Stargazers: 420 | Issues: 0

One-Shot-Voice-Cloning

One-shot voice cloning based on Unet-TTS

Language: Jupyter Notebook | Stargazers: 235 | Issues: 0

ChineseTtsTflite

Offline Android Chinese TTS engine, built on TensorflowTTS and used for testing TfLite models.

Language: Java | License: Apache-2.0 | Stargazers: 287 | Issues: 0

phkit

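Phoneme toolkit. A handy phoneme-processing toolbox with modules for Chinese phonemes, English phonemes, text-to-pinyin conversion, and text normalization.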

Language: Python | License: MIT | Stargazers: 75 | Issues: 0

chinese_text_normalization

Chinese text normalization for speech processing

Language: Python | License: MIT | Stargazers: 612 | Issues: 0

ParaGen

ParaGen is a PyTorch deep learning framework for parallel sequence generation.

Language: Python | License: NOASSERTION | Stargazers: 186 | Issues: 0

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 75 | Issues: 0

Speech-Transformer-tf2.0

Transformer for ASR systems (via TensorFlow 2.0)

Language: Python | Stargazers: 113 | Issues: 0

score-ensembles-based-SVM

Combine many organs from a plant to predict its species

Language: Jupyter Notebook | Stargazers: 21 | Issues: 0

Speaker_Verification_Tencent

Deep Discriminative Embeddings for Duration Robust Speaker Verification

Language: Python | License: MIT | Stargazers: 19 | Issues: 0

antispoofing-features

Code for the paper "Bag of features for voice anti-spoofing"

Language: Python | License: MIT | Stargazers: 13 | Issues: 0

AM-MobileNet1D

The Additive Margin MobileNet1D is a new lightweight deep learning model for speaker recognition, based on the MobileNetV2 architecture and the Additive Margin Softmax (AM-Softmax) loss function.

Stargazers: 1 | Issues: 0

delta

DELTA is a deep learning based natural language and speech processing platform.

Language: Python | License: Apache-2.0 | Stargazers: 1591 | Issues: 0

speaker-recognition-papers

Share some recent speaker recognition papers and their implementations.

Language: Python | Stargazers: 90 | Issues: 0

VoiceprintRecognition-Tensorflow

Voiceprint recognition implemented with TensorFlow

Language: Python | License: Apache-2.0 | Stargazers: 286 | Issues: 0