SUPER-ALEX

AlexYangli's repositories

EasySpider

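A visual no-code web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawling tasks graphically, without writing code. Alias: ServiceWrapper, an intelligent service-wrapping system for Web applications.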

NOASSERTION

StoryDiffusion

Create Magic Story!

Apache-2.0

ESLTTS

ESLTTS dataset

MIT

champ

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Apache-2.0

metahuman-stream

Real-time streaming digital human based on NeRF

MIT

weekly

Tech Enthusiast Weekly, published every Friday

Singing-Voice-Conversion

Project of Singing Voice Conversion.

Bert-VITS2

VITS2 backbone with BERT

AGPL-3.0

Arabic-Tashkeela-Model

A diacritization model for the Arabic language, trained on Tashkeela, the Arabic diacritization corpus on Kaggle.

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Apache-2.0

voicefixer

General Speech Restoration

MIT

WikipediaHomographData

Labeled data for homograph disambiguation

Apache-2.0

english-conversation-corpus

English conversation corpus for conversational TTS.

GPL-3.0

g2p_id

g2p ID: Indonesian Grapheme-to-Phoneme Converter

Apache-2.0

Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

MIT

MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

NOASSERTION

NeMo

NeMo: a toolkit for conversational AI

Apache-2.0

sockeye

Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch

Apache-2.0

Papers

speechbrain

A PyTorch-based Speech Toolkit

Apache-2.0

espeak-ng

eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.

GPL-3.0

low-resource-languages

Resources for the conservation, development, and documentation of low-resource (human) languages.

CC-BY-SA-4.0

a-week-in-wild-ai

A 360-degree view of AI/ML/DL applications

MIT

performant

A toolset for easy formant extraction and visualization from wav files and TTS models

MIT

NeuralSVB

Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

ParaLip

Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022; Official code

Latent-GLAT

Implementation of latent-GLAT (ACL-2022)

rVADfast

A Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as described in the paper "rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method".

MIT

NAT

Language: Python

MIT

SMART-Single_Emotional_TTS

GPL-3.0