Fu Guanyu (EricFuma)

Company: AliPay

Location: Hangzhou, China

Fu Guanyu's starred repositories

Promptify

Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3212 | Issues: 0

KAN-TTS

KAN-TTS is a speech-synthesis training framework. Try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Language: Python | License: MIT | Stargazers: 486 | Issues: 0

NeuFA

Neural network-based forced alignment with bidirectional attention mechanism

Language: Python | Stargazers: 70 | Issues: 0

SiFiGAN

Official implementation of the source-filter HiFiGAN vocoder

Language: Python | License: MIT | Stargazers: 234 | Issues: 0

jukebox

Code for the paper "Jukebox: A Generative Model for Music"

Language: Python | License: NOASSERTION | Stargazers: 7782 | Issues: 0

grpc

The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)

Language: C++ | License: Apache-2.0 | Stargazers: 41725 | Issues: 0

audiolm-pytorch

Implementation of AudioLM, a SOTA language modeling approach to audio generation out of Google Research, in PyTorch

Language: Python | License: MIT | Stargazers: 2399 | Issues: 0

mmengine

OpenMMLab Foundational Library for Training Deep Learning Models

Language: Python | License: Apache-2.0 | Stargazers: 1159 | Issues: 0

VITS-BigVGAN-SpanPSP-Chinese

A PyTorch-based VITS-BigVGAN Chinese TTS model with an added prosody prediction model.

Language: Python | Stargazers: 193 | Issues: 0

DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Language: Python | License: MIT | Stargazers: 313 | Issues: 0

MB-iSTFT-VITS

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Language: Python | License: Apache-2.0 | Stargazers: 417 | Issues: 0

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | License: MIT | Stargazers: 3452 | Issues: 0

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | Stargazers: 14189 | Issues: 0

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language: Python | License: MIT | Stargazers: 1828 | Issues: 0

TTS-frontend

TTS-frontend with BERT and CRF/LSTM (for Tacotron)

Language: Python | Stargazers: 49 | Issues: 0

PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction paused, please be patient]

Language: Python | License: Apache-2.0 | Stargazers: 12692 | Issues: 0
Language: Python | Stargazers: 111 | Issues: 0

ltp

Language Technology Platform

Language: Python | Stargazers: 4941 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 19709 | Issues: 0

WeTextProcessing

Text Normalization & Inverse Text Normalization

Language: Python | License: Apache-2.0 | Stargazers: 456 | Issues: 0

gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Language: Python | License: Apache-2.0 | Stargazers: 32608 | Issues: 0

paper2gui

Convert AI papers to GUI. Make it easy and convenient for everyone to use cutting-edge artificial intelligence technology.

Language: Jupyter Notebook | License: MIT | Stargazers: 10200 | Issues: 0

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch

Language: Jupyter Notebook | License: MIT | Stargazers: 1547 | Issues: 0

chinese_text_normalization

Chinese text normalization for speech processing

Language: Python | License: MIT | Stargazers: 621 | Issues: 0

tldraw

SDK for creating whiteboards and canvas experiences on the web.

Language: TypeScript | License: NOASSERTION | Stargazers: 35413 | Issues: 0

speech_dataset

The dataset of Speech Recognition

License: Apache-2.0 | Stargazers: 383 | Issues: 0

chinese_speech_pretrain

Chinese speech pretrained models

Language: Shell | Stargazers: 1014 | Issues: 0
Language: Python | License: MIT | Stargazers: 1371 | Issues: 0

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | Stargazers: 10990 | Issues: 0

forced-alignment-tools

A collection of links and notes on forced alignment tools

Language: Python | License: NOASSERTION | Stargazers: 868 | Issues: 0