You Zhang (yzyouzhang)

Company: University of Rochester

Location: NY, US

Home Page: https://yzyouzhang.com

Twitter: @yzyouzhang


Organizations
AirLabUR

You Zhang's starred repositories

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 32251 | Issues: 273 | Issues: 1068

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0 | Stargazers: 25146 | Issues: 706 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 19249 | Issues: 297 | Issues: 1339

Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

Language: HTML | License: MIT | Stargazers: 10512 | Issues: 266 | Issues: 45

gdrive

Google Drive CLI Client

arxiv-latex-cleaner

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Language: Python | License: Apache-2.0 | Stargazers: 5054 | Issues: 31 | Issues: 52

improved-diffusion

Release for Improved Denoising Diffusion Probabilistic Models

Language: Python | License: MIT | Stargazers: 3052 | Issues: 124 | Issues: 127

audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Language: Python | License: MIT | Stargazers: 1883 | Issues: 40 | Issues: 43

ai-audio-startups

Community list of startups working with AI in audio and music technology

Awesome-Implicit-NeRF-Robotics

A comprehensive list of Implicit Representations and NeRF papers relating to Robotics/RL domain, including papers, codes, and related websites

AD-NeRF

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

Language: Python | License: MIT | Stargazers: 1009 | Issues: 16 | Issues: 138

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language: Python | License: MIT | Stargazers: 871 | Issues: 23 | Issues: 32

IguanaTex

A PowerPoint add-in allowing you to insert LaTeX equations into PowerPoint presentations on Windows and Mac

Language: VBA | License: NOASSERTION | Stargazers: 808 | Issues: 14 | Issues: 67

AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

Language: Python | License: NOASSERTION | Stargazers: 509 | Issues: 34 | Issues: 27

Speech-Resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

DFRF

[ECCV2022] The implementation for "Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis".

Language: Python | License: MIT | Stargazers: 335 | Issues: 10 | Issues: 37

sound-spaces

A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.

Language: Python | License: CC-BY-4.0 | Stargazers: 328 | Issues: 16 | Issues: 138

BeatNet

BeatNet is a state-of-the-art real-time and offline system for joint music beat, downbeat, tempo, and meter tracking, using a CRNN and particle filtering (implementation of the ISMIR 2021 paper).

Language: Python | License: CC-BY-4.0 | Stargazers: 308 | Issues: 9 | Issues: 26

GRaNDPapA

Generator of Rad Names from Decent Paper Acronyms

Awesome-Speech-Pretraining

Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.

Language: Python | License: Apache-2.0 | Stargazers: 167 | Issues: 8 | Issues: 7

pointGAN

Point set generative adversarial nets

Language: Python | License: MIT | Stargazers: 145 | Issues: 9 | Issues: 8

BIRD

Big Impulse Response Dataset

Language: Python | License: GPL-3.0 | Stargazers: 136 | Issues: 9 | Issues: 2

SeqDeepFake

[ECCV 2022] PyTorch code for SeqDeepFake: Detecting and Recovering Sequential DeepFake Manipulation

CVPR-2021-Paper-Statistics

Statistics and visualization of the acceptance rate and main keywords of papers accepted to CVPR 2021, the main computer vision conference

Language: Jupyter Notebook | Stargazers: 80 | Issues: 4 | Issues: 1

Skipping-The-Frame-Level

A simple yet effective Audio-to-Midi Automatic Piano Transcription system

Language: Python | License: MIT | Stargazers: 77 | Issues: 7 | Issues: 17

itsp

Introduction to Speech Processing

Language: Jupyter Notebook | License: CC-BY-SA-4.0 | Stargazers: 52 | Issues: 4 | Issues: 4

libmpeghe

MPEG-H 3D Audio Low Complexity Profile Encoder. Decoder: https://github.com/ittiam-systems/libmpegh

Language: C | License: BSD-3-Clause-Clear | Stargazers: 41 | Issues: 4 | Issues: 8

heterogeneous_separation

Code and data recipes for the paper: Heterogeneous Target Speech Separation

Language: Python | License: MIT | Stargazers: 38 | Issues: 0 | Issues: 0

LookForTheChange

Code for the "Look for the Change" paper published at CVPR 2022

Language: Python | License: MIT | Stargazers: 35 | Issues: 4 | Issues: 6