hyzhan

Location: Guangzhou

hyzhan's repositories

auraloss

Collection of audio-focused loss functions in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

CosyVoice

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

g2pM

A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

gcn

Implementation of Graph Convolutional Networks in TensorFlow

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

GPT-SoVITS

One minute of voice data can be enough to train a good TTS model (few-shot voice cloning).

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

GPT2-Chinese

Chinese version of GPT2 training code, using BERT tokenizer.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

grafx

GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch

Stargazers: 0 | Issues: 0 | Issues: 0

hyzhan.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: HTML | Stargazers: 0 | Issues: 0 | Issues: 0

Interspeech2021

Interspeech2021

Language: HTML | Stargazers: 0 | Issues: 2 | Issues: 0

lightconv_pt

lightconv_layer fairseq

Stargazers: 0 | Issues: 0 | Issues: 0

Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: HTML | Stargazers: 0 | Issues: 0 | Issues: 0

NVIDIA_SGEMM_PRACTICE

Step-by-step optimization of CUDA SGEMM

Stargazers: 0 | Issues: 0 | Issues: 0

phonological-features

Materials accompanying the paper "Phonological features for 0-shot multilingual speech synthesis"

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

PyTorch-BigGraph

Software used for generating embeddings from large-scale graph-structured data.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

spleeter

Deezer source separation library including pretrained models.

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

StyleDubber

[ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

TTS_TFLite

This repository is a collection of TTS Models in TFLite

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

vits_chinese

Best TTS based on BERT and VITS, with some features from Microsoft's NaturalSpeech

Stargazers: 0 | Issues: 0 | Issues: 0

voice-filter

An unofficial PyTorch implementation of Google's VoiceFilter

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

w2v2-how-to

How to use our public wav2vec2 dimensional emotion model

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

waveglow

A Flow-based Generative Network for Speech Synthesis

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

WaveRNN-Pytorch

Fatchord's Alternative WaveRNN (Faster training)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0