zhongshijun's repositories

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

AudioSep

Official implementation of "Separate Anything You Describe"

Stargazers: 0 | Issues: 0

auto-assess-rhythm-imitation

Code for automatic assessment of rhythmic pattern imitations

License: GPL-3.0 | Stargazers: 0 | Issues: 0

CodeTalker

[CVPR 2023] CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

License: MIT | Stargazers: 0 | Issues: 0

CoMoSVC

CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone

Stargazers: 0 | Issues: 0

ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.

License: MIT | Stargazers: 0 | Issues: 0

conformer

PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

License: Apache-2.0 | Stargazers: 0 | Issues: 0

crepe

CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)

License: MIT | Stargazers: 0 | Issues: 0

CRUSE

Towards Efficient Models for Real-Time Deep Noise Suppression

Stargazers: 0 | Issues: 0

DALL-E

PyTorch package for the discrete VAE used for DALL·E.

License: NOASSERTION | Stargazers: 0 | Issues: 0

deepvqe

An unofficial implementation of DeepVQE proposed by Microsoft Corp.

Stargazers: 0 | Issues: 0

DiffPitcher

Diffusion-based singing voice pitch correction

Stargazers: 0 | Issues: 0

Diffusion-Models-Papers-Survey-Taxonomy

Diffusion model papers, survey, and taxonomy

Stargazers: 0 | Issues: 0

e2e_dnn_ad_control_for_lin_aec

End-To-End Deep Learning-based Adaptation Control for Linear Acoustic Echo Cancellation

License: NOASSERTION | Stargazers: 0 | Issues: 0

easyeffects

Limiter, compressor, convolver, equalizer and auto volume and many other plugins for PipeWire applications

License: GPL-3.0 | Stargazers: 0 | Issues: 0

EAT_code

Official code for ICCV 2023 paper: "Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation".

Stargazers: 0 | Issues: 0

gtcrn

An official implementation of GTCRN, an ultra-lite speech enhancement model.

Stargazers: 0 | Issues: 0

hello-world

My first repository.

Stargazers: 0 | Issues: 0
License: BSD-3-Clause | Stargazers: 0 | Issues: 0

ml-spatial-librispeech

A large synthetic dataset of spatial audio with multiple labels

License: NOASSERTION | Stargazers: 0 | Issues: 0

motion-diffusion-model

The official PyTorch implementation of the paper "Human Motion Diffusion Model"

License: MIT | Stargazers: 0 | Issues: 0

Motion-X

Official implementation of the paper "Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset"

License: NOASSERTION | Stargazers: 0 | Issues: 0
Stargazers: 0 | Issues: 0

NeuralSVB

Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

License: GPL-3.0 | Stargazers: 0 | Issues: 0

RUI_SE

The official repo of "A Refining Underlying Information Framework for Speech Enhancement"

Stargazers: 0 | Issues: 0

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

License: MIT | Stargazers: 0 | Issues: 0

so-vits-svc

SoftVC VITS Singing Voice Conversion

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

License: MIT | Stargazers: 0 | Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.

Stargazers: 0 | Issues: 0

webrtcperf

WebRTC performance and quality evaluation tool.

License: AGPL-3.0 | Stargazers: 0 | Issues: 0