yzyouzhang

University of Rochester

NY, US

https://yzyouzhang.com

Organizations

AirLabUR

You Zhang's starred repositories

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python · License: MPL-2.0 · 32272 273 1068

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0 · 25157 7060

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python · License: MIT · 19253 297 1340

gdrive

Google Drive CLI Client

Language: Go · License: MIT · 8995 223 594

demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Language: Python · License: MIT · 7974 150 532

latexify_py

A library to generate LaTeX expression from Python code.

Language: Python · License: Apache-2.0 · 7112 55 82

arxiv-latex-cleaner

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Language: Python · License: Apache-2.0 · 5058 31 52

improved-diffusion

Release for Improved Denoising Diffusion Probabilistic Models

Language: Python · License: MIT · 3053 124 127

audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Language: Python · License: MIT · 1885 40 43

Awesome-Implicit-NeRF-Robotics

A comprehensive list of Implicit Representations and NeRF papers relating to Robotics/RL domain, including papers, codes, and related websites

AD-NeRF

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

Language: Python · License: MIT · 1009 16 138

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language: Python · License: MIT · 872 23 32

AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

Language: Python · License: NOASSERTION · 510 34 27

Speech-Resources

Labs, companies, resources, internships, and more in the speech field; recommendations and self-nominations are welcome

DFRF

[ECCV2022] The implementation for "Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis".

Language: Python · License: MIT · 335 10 37

pt-dec

PyTorch implementation of DEC (Deep Embedding Clustering)

Language: Python · License: MIT · 288 6 12

if-sad-send-cat

🐱 A program that sends cats to my phone when I'm sad at the computer.

Language: HTML · 192 2 1

FRA-RIR

Language: Python · License: Apache-2.0 · 167 8 7

BIRD

Big Impulse Response Dataset

Language: Python · License: GPL-3.0 · 136 9 2

SeqDeepFake

[ECCV 2022] PyTorch code for SeqDeepFake: Detecting and Recovering Sequential DeepFake Manipulation

Language: Python · 123 4 9

Skipping-The-Frame-Level

A simple yet effective Audio-to-Midi Automatic Piano Transcription system

Language: Python · License: MIT · 77 7 17

s2v_rc

Speech2Vec Reality Check

Language: Python · 74 2 1

sfs-python

SFS Toolbox for Python

Language: Python · License: MIT · 64 14 45

simple-asgan

Training code and trained checkpoints for ASGAN.

Language: Python · 61 3 3

itsp

Introduction to Speech Processing

Language: Jupyter Notebook · License: CC-BY-SA-4.0 · 52 4 4

couta

a time series anomaly detection method based on the calibrated one-class classifier

Language: Python · License: Apache-2.0 · 50 2 5

libmpeghe

MPEG-H 3D Audio Low Complexity Profile Encoder. Decoder: https://github.com/ittiam-systems/libmpegh

Language: C · License: BSD-3-Clause-Clear · 41 4 8

heterogeneous_separation

Code and data recipes for the paper: Heterogeneous Target Speech Separation

Language: Python · License: MIT · 3800

LookForTheChange

Code for Look for the Change paper published at CVPR 2022

Language: Python · License: MIT · 35 4 6

Signal-Generator

The signal generator is a mex-function for MATLAB that can be used to generate the response of a moving sound source and receiver in a reverberant environment.

Language: C++ · License: GPL-3.0 · 33 2 1