xuexidi's repositories

FastSpeech2-1

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

License: MIT | Stargazers: 0 | Issues: 0

glow-tts

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

License: MIT | Stargazers: 0 | Issues: 0

ERISHA

ERISHA is a multilingual, multispeaker expressive speech synthesis framework. It can transfer expressivity to the voice of a speaker for whom no expressive speech corpus is available.

License: NOASSERTION | Stargazers: 0 | Issues: 0

CMIN_moment_retrieval

Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos

Stargazers: 0 | Issues: 0

Resemblyzer

A Python package to analyze and compare voices with deep learning

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Hadamard-Matrix-for-hashing

CVPR2020: Central Similarity Quantization/Hashing for Efficient Image and Video Retrieval

License: MIT | Stargazers: 0 | Issues: 0

athena

An open-source implementation of a sequence-to-sequence based speech processing engine

License: Apache-2.0 | Stargazers: 0 | Issues: 0

deepvoice3_pytorch

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

License: NOASSERTION | Stargazers: 0 | Issues: 0

Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

License: MIT | Stargazers: 0 | Issues: 0

text_enhancement

Text generation based on LaserTagger, for text paraphrasing and data augmentation

Stargazers: 0 | Issues: 0

zhrtvc

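Chinese real-time voice cloning (VC) and Chinese text-to-speech (TTS). An easy-to-use Chinese voice cloning and speech synthesis system comprising a speech encoder, a speech synthesizer, a vocoder, and a visualization module.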

Stargazers: 0 | Issues: 0

zhvoice

Chinese voice corpus with clearer, more natural speech, comprising 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.

Stargazers: 0 | Issues: 0

tacotronv2_wavernn_chinese

Chinese speech synthesis implemented with Tacotron V2 + WaveRNN (TensorFlow + PyTorch)

Stargazers: 0 | Issues: 0

mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

Tacotron_VAE

Multi-Speaker Tacotron2 with VAE

Stargazers: 0 | Issues: 0

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

License: NOASSERTION | Stargazers: 0 | Issues: 0

autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

License: MIT | Stargazers: 0 | Issues: 0

style-token_tacotron2

Style tokens with Tacotron 2

License: MIT | Stargazers: 0 | Issues: 0

TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

License: NOASSERTION | Stargazers: 0 | Issues: 0

SpeechSplit

Unsupervised Speech Decomposition Via Triple Information Bottleneck

License: MIT | Stargazers: 1 | Issues: 0

melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch

License: MIT | Stargazers: 0 | Issues: 0

melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

License: MIT | Stargazers: 0 | Issues: 0

melgan-1

MelGAN implementation with Multi-Band and Full-Band support...

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

Awesome-EmbodiedAI

A curated list of awesome Embodied AI works, still under construction. It currently contains lists of simulators, tasks, and datasets.

License: MIT | Stargazers: 0 | Issues: 0

DurIAN

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

video-keyframe-detector

A simple Python tool that extracts key frames from a video file using peak estimation on frame differences.

License: GPL-3.0 | Stargazers: 0 | Issues: 0

ac-ppo

Actor-Critic and OpenAI clipped PPO in the Gym CartPole-v0 and Pendulum-v0 environments

Stargazers: 0 | Issues: 0

DurIAN-1

Implementation of "DurIAN: Duration Informed Attention Network For Multimodal Synthesis".

Stargazers: 0 | Issues: 0

WaveRNN

WaveRNN Vocoder + TTS

License: MIT | Stargazers: 0 | Issues: 0