symao (Maoshuiyang)

Maoshuiyang

Company: The Chinese University of Hong Kong

Location: Hong Kong

symao's repositories

tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

License: MIT | Stargazers: 0 | Issues: 0

Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Stargazers: 0 | Issues: 0

DL-Demos

Demos for deep learning

Stargazers: 0 | Issues: 0

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

License: MIT | Stargazers: 0 | Issues: 0

make-a-video-pytorch

Implementation of Make-A-Video, the new SOTA text-to-video generator from Meta AI, in PyTorch

License: MIT | Stargazers: 0 | Issues: 0

vits-piper

A fast, local neural text to speech system

License: MIT | Stargazers: 0 | Issues: 0

SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

License: MIT | Stargazers: 0 | Issues: 0

SadTalker-Video-Lip-Sync

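This project implements Wav2lip-style video lip synthesis based on SadTalker. It takes a video file as input and generates speech-driven lip shapes, then applies a configurable enhancement to the face region to sharpen the synthesized lip (face) area and improve the clarity of the generated lips. It also uses DAIN, a deep learning frame-interpolation algorithm, to fill in frames of the generated video, smoothing the motion between frames so that the synthesized lips look more fluid, realistic, and natural.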

Stargazers: 0 | Issues: 0

CharsiuG2P

Multilingual G2P in 100 languages

License: MIT | Stargazers: 0 | Issues: 0

gruut

A tokenizer, text cleaner, and phonemizer for many human languages.

License: MIT | Stargazers: 0 | Issues: 0

DL-Art-School

TorToiSe fine-tuning with DLAS

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

naturalspeech

A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022)

Stargazers: 0 | Issues: 0

Diffsound

The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"

Stargazers: 0 | Issues: 0

vits-cantonese

Cantonese Text to Speech with VITS implementation

License: MIT | Stargazers: 0 | Issues: 0

TranSpeech

PyTorch Implementation of TranSpeech (ICLR'23): Textless NAR Speech-to-Speech Translation with Bilateral Perturbation

License: MIT | Stargazers: 0 | Issues: 0

phonemizer

Simple text to phones converter for multiple languages

License: GPL-3.0 | Stargazers: 0 | Issues: 0

Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

License: MIT | Stargazers: 0 | Issues: 0

lyra

A Very Low-Bitrate Codec for Speech Compression

License: Apache-2.0 | Stargazers: 0 | Issues: 0

WMSeg-upgrade

Implementation of "Improving Chinese Word Segmentation with Wordhood Memory Networks" (ACL 2020).

License: MIT | Stargazers: 0 | Issues: 0

cmake-demo

Source code for 《CMake入门实战》 (a hands-on introduction to CMake)

Stargazers: 0 | Issues: 0

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

License: MIT | Stargazers: 0 | Issues: 0

chinese_speech_pretrain

Chinese speech pretrained models

Stargazers: 0 | Issues: 0

mmdetection-to-tensorrt

Convert MMDetection models to TensorRT, with support for FP16, INT8, batch input, dynamic shapes, etc.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

LibtorchTutorials

A code repository for a PyTorch C++ (LibTorch) tutorial.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Pytorch-Memory-Utils

PyTorch memory tracking code

Stargazers: 0 | Issues: 0

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

License: MIT | Stargazers: 0 | Issues: 0

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

License: Apache-2.0 | Stargazers: 0 | Issues: 0

regnet

Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned Sound (VAS) dataset.

Stargazers: 0 | Issues: 0