Liujingxiu23's repositories
AD-NeRF
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".
AudioCLIP
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
AudioDVP
AudioDVP: Photorealistic Audio-driven Video Portraits
chinese-audio2face
Chinese speech to facial expression
Daft-Exprt
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
gender-predicator
Predicting the gender of a given Chinese name (93~99% test-set accuracy).
HDTF
The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"
iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
KaraSinger
Submitted to ICASSP 2022
Learn2Sing2.0
Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher
libmusicxml
A C/C++ library to support the MusicXML format.
Lip2Speech
A pipeline that reads lips and generates speech for the spoken content, i.e., lip-to-speech synthesis.
lyrebird-wav2clip
Official implementation of the paper "Wav2CLIP: Learning Robust Audio Representations from CLIP"
Meta-TTS
Official repository of https://arxiv.org/abs/2111.04040v1
Muskits
An open-source music processing toolkit
neural-waveshaping-synthesis
Efficient neural audio synthesis in the waveform domain
noisereduce
Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)
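The spectral-gating idea behind noisereduce can be sketched in plain NumPy. This is an independent illustration of the technique, not the library's actual implementation; the function name `spectral_gate` and its parameters are hypothetical:

```python
import numpy as np

def spectral_gate(signal, noise_clip, n_fft=512, hop=128, n_std=1.5):
    """Attenuate STFT bins whose magnitude falls below a per-frequency
    noise threshold estimated from a noise-only clip."""
    window = np.hanning(n_fft)

    def stft(x):
        frames = [x[i:i + n_fft] * window
                  for i in range(0, len(x) - n_fft, hop)]
        return np.fft.rfft(np.array(frames), axis=1)

    # Per-frequency gate: mean + n_std * std of the noise magnitude.
    noise_mag = np.abs(stft(noise_clip))
    threshold = noise_mag.mean(axis=0) + n_std * noise_mag.std(axis=0)

    spec = stft(signal)
    gated = spec * (np.abs(spec) >= threshold)  # zero noise-dominated bins

    # Overlap-add inverse STFT with window-power normalization.
    out = np.zeros(len(signal))
    norm = np.zeros(len(signal))
    for k, frame in enumerate(np.fft.irfft(gated, n=n_fft, axis=1)):
        i = k * hop
        out[i:i + n_fft] += frame * window
        norm[i:i + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)
```

A real spectral gater (noisereduce included) typically smooths the mask over time and frequency to avoid musical-noise artifacts; the hard binary mask here is the bare minimum.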
One-Shot-Voice-Cloning
One-shot voice cloning based on Unet-TTS
s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
SegFeat
Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020)
StyleSpeech
Official implementation of Meta-StyleSpeech and StyleSpeech
symbolic-music-diffusion
Symbolic Music Generation with Diffusion Models
Text2Video
Code for "Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary"
TNN
TNN: a uniform deep-learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and draws on the extensibility and high performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome: work with us to make TNN a better framework.
torchcrepe
PyTorch implementation of the CREPE pitch tracker
VOCANO
VOCANO: A note transcription framework for singing voice in polyphonic music