Queen_Wcy's repositories
espnet_tts_frontend
Text frontend for ESPnet TTS recipes
ai-deployment
Focuses on bringing AI models into production and on model deployment
Computer-VisionandAudio-Lab
Fall 2018 Harbin Institute of Technology vision and audio lab experiments
Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
CycleGAN-VC2
Voice conversion with CycleGAN (voice cloning / voice conversion)
diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
langid.py
Stand-alone language identification system
LeetCodeAnimation
Presents the reasoning behind solutions to LeetCode problems in animated form.
line_profiler
(OLD REPO) Line-by-line profiling for Python - Current repo ->
MS-Tacotron2
Tacotron2-based multi-speaker text-to-speech
papers-with-annotations
Research papers with annotations, illustrations and explanations
PPSpeech
PPSpeech: Phrase based Parallel End-to-End TTS System
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
Shenlan-ASR-Course
Coursework for the Shenlan Academy speech course "Speech Recognition: From Beginner to Mastery"
speech-synthesis-paper
List of speech synthesis papers.
SqueezeFlow
Code Repository for "SqueezeFlow: Adaptive Text-to-Speech in Low Computational Resource Scenarios"
tacotron2
Forked from NVIDIA/tacotron2 and merged with Rayhane-mamah/Tacotron-2
Tacotron2_batch_inference
PyTorch Tacotron2 implementation that supports batch inference
TensorFlowTTS
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, Korean, and Chinese, and is easy to adapt to other languages)
Voice-synthesis
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
WavAugment
A library for speech data augmentation in time-domain
wavegrad
A fast, high-quality neural vocoder.
zhvoice
Chinese voice corpus with clearer, more natural speech, comprising 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.