AlexYangli's repositories

EasySpider

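A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual browser-automation testing, data-collection, and web-crawling tool that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.

License: NOASSERTION | Stargazers: 0 | Issues: 0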

StoryDiffusion

Create Magic Story!

License: Apache-2.0 | Stargazers: 0 | Issues: 0

ESLTTS

ESLTTS dataset

License: MIT | Stargazers: 0 | Issues: 0

champ

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

License: Apache-2.0 | Stargazers: 0 | Issues: 0

metahuman-stream

Real-time streaming digital human based on NeRF

License: MIT | Stargazers: 0 | Issues: 0

weekly

Tech Enthusiast Weekly (科技爱好者周刊), published every Friday

Stargazers: 0 | Issues: 0

Singing-Voice-Conversion

Project of Singing Voice Conversion.

Stargazers: 0 | Issues: 0

Bert-VITS2

VITS2 backbone with BERT

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

Arabic-Tashkeela-Model

A diacritization model for the Arabic language, built and trained on Tashkeela, the Arabic diacritization corpus on Kaggle.

Stargazers: 0 | Issues: 0

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

License: Apache-2.0 | Stargazers: 0 | Issues: 0

voicefixer

General Speech Restoration

License: MIT | Stargazers: 0 | Issues: 0

WikipediaHomographData

Labeled data for homograph disambiguation

License: Apache-2.0 | Stargazers: 0 | Issues: 0

english-conversation-corpus

English conversation corpus for conversational TTS.

License: GPL-3.0 | Stargazers: 0 | Issues: 0

g2p_id

g2p ID: Indonesian Grapheme-to-Phoneme Converter

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

License: MIT | Stargazers: 0 | Issues: 0

MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

License: NOASSERTION | Stargazers: 0 | Issues: 0

NeMo

NeMo: a toolkit for conversational AI

License: Apache-2.0 | Stargazers: 0 | Issues: 0

sockeye

Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch

License: Apache-2.0 | Stargazers: 0 | Issues: 0

speechbrain

A PyTorch-based Speech Toolkit

License: Apache-2.0 | Stargazers: 0 | Issues: 0

espeak-ng

eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.

License: GPL-3.0 | Stargazers: 0 | Issues: 0

low-resource-languages

Resources for conservation, development, and documentation of low resource (human) languages.

License: CC-BY-SA-4.0 | Stargazers: 0 | Issues: 0

a-week-in-wild-ai

360 view on ai/ml/dl applications

License: MIT | Stargazers: 0 | Issues: 0

performant

A toolset for easy formant extraction and visualization from wav files and TTS models

License: MIT | Stargazers: 0 | Issues: 0

NeuralSVB

Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Stargazers: 0 | Issues: 0

ParaLip

Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022; Official code

Stargazers: 1 | Issues: 0

Latent-GLAT

Implementation of latent-GLAT (ACL-2022)

Stargazers: 0 | Issues: 0

rVADfast

This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.

License: MIT | Stargazers: 0 | Issues: 0