macroustc


macroustc's repositories

visual-chatgpt

Official repo for the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

License: MIT | Stargazers: 0 | Issues: 0

ChatPaper

Use ChatGPT to summarize the arXiv papers.

License: NOASSERTION | Stargazers: 0 | Issues: 0

SadTalker

(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

License: MIT | Stargazers: 0 | Issues: 0

awesome-chatgpt-prompts-zh

A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you say.

License: MIT | Stargazers: 0 | Issues: 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder archi

License: NOASSERTION | Stargazers: 0 | Issues: 0

so-vits-svc

SoftVC VITS Singing Voice Conversion

License: MIT | Stargazers: 0 | Issues: 0

AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

License: NOASSERTION | Stargazers: 0 | Issues: 0

vits_chinese

Best TTS based on BERT and VITS, with some NaturalSpeech features from Microsoft

Stargazers: 0 | Issues: 0

audiolm-pytorch

Implementation of AudioLM, a SOTA language modeling approach to audio generation out of Google Research, in PyTorch

License: MIT | Stargazers: 0 | Issues: 0

naturalspeech

A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022)

Stargazers: 0 | Issues: 0

audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

License: MIT | Stargazers: 0 | Issues: 0

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

License: NOASSERTION | Stargazers: 0 | Issues: 0

voxceleb_trainer

In defence of metric learning for speaker recognition

License: MIT | Stargazers: 0 | Issues: 0

nnsvs

Neural network-based singing voice synthesis library for research

License: MIT | Stargazers: 0 | Issues: 0

MB-iSTFT-VITS

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

License: Apache-2.0 | Stargazers: 0 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

License: MIT | Stargazers: 0 | Issues: 0

noisereduce

Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)

License: MIT | Stargazers: 0 | Issues: 0

wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

License: Apache-2.0 | Stargazers: 0 | Issues: 0

LIA

[ICLR 22] Latent Image Animator: Learning to Animate Images via Latent Space Navigation

License: NOASSERTION | Stargazers: 0 | Issues: 0

FastASR

An efficient C++ implementation of inference for the conformer model used by PaddleSpeech; it runs smoothly even on ARM platforms such as the Raspberry Pi 4B.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Fay

A virtual digital human for voice interaction and automated livestream sales

License: GPL-3.0 | Stargazers: 1 | Issues: 0

chinese_speech_pretrain

Chinese speech pretrained models

Stargazers: 0 | Issues: 0

LIHQ

Long-Inference, High Quality Synthetic Speaker

Stargazers: 0 | Issues: 0

DeepFaceLive

Real-time face swap for PC streaming or video calls

License: GPL-3.0 | Stargazers: 0 | Issues: 0

iSTFTNet-pytorch

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

License: Apache-2.0 | Stargazers: 0 | Issues: 0

StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

License: MIT | Stargazers: 0 | Issues: 0

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

License: Apache-2.0 | Stargazers: 0 | Issues: 0