You Zhang (yzyouzhang)

Company: University of Rochester

Location: NY, US

Home Page: https://yzyouzhang.com

Twitter: @yzyouzhang


Organizations
AirLabUR

You Zhang's starred repositories

best_AI_papers_2021

A curated list of the latest breakthroughs in AI (in 2021) by release date with a clear video explanation, link to a more in-depth article, and code.

License: MIT | Stargazers: 2915 | Issues: 84 | Issues: 0

svox2

Plenoxels: Radiance Fields without Neural Networks

Language: Python | License: BSD-2-Clause | Stargazers: 2796 | Issues: 48 | Issues: 127

decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Language: Python | License: MIT | Stargazers: 2277 | Issues: 30 | Issues: 63

pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Language: Python | License: MIT | Stargazers: 1385 | Issues: 44 | Issues: 220

PaddleViT

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

Language: Python | License: Apache-2.0 | Stargazers: 1208 | Issues: 10 | Issues: 109

viewer

ML models and internal tensors 3D visualizer

pytorch-AdaIN

Unofficial PyTorch implementation of 'Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization' [Huang+, ICCV2017]

Language: Python | License: MIT | Stargazers: 1050 | Issues: 8 | Issues: 30

stickyland

Break the linear presentation of Jupyter Notebooks with sticky cells!

Language: TypeScript | License: BSD-3-Clause | Stargazers: 505 | Issues: 9 | Issues: 14

pysox

Python wrapper around sox.

Language: Python | License: BSD-3-Clause | Stargazers: 492 | Issues: 12 | Issues: 100

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Language: Python | License: NOASSERTION | Stargazers: 411 | Issues: 21 | Issues: 44

comparxiv

Compare two versions of an arXiv preprint with a single command.

Language: Python | License: MIT | Stargazers: 349 | Issues: 6 | Issues: 11

3dti_AudioToolkit

3D Tune-In Toolkit is a custom open-source C++ library developed within the EU-funded project 3D Tune-In. The Toolkit provides a high level of realism and immersiveness within binaural 3D audio simulations, while allowing for the emulation of hearing aid devices and of different typologies of hearing loss.

Language: C++ | License: GPL-3.0 | Stargazers: 210 | Issues: 29 | Issues: 23

Zero_Shot_Audio_Source_Separation

The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022

Language: Python | License: MIT | Stargazers: 177 | Issues: 7 | Issues: 19

LearnToPayAttention

PyTorch implementation of the ICLR 2018 paper Learn To Pay Attention (with some modifications)

Language: Python | License: GPL-3.0 | Stargazers: 163 | Issues: 3 | Issues: 9

FAST-RIR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Language: Python | License: AGPL-3.0 | Stargazers: 145 | Issues: 6 | Issues: 3

pytorch-kaldi-neural-speaker-embeddings

Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.

Language: Perl | License: BSD-3-Clause | Stargazers: 134 | Issues: 8 | Issues: 6
Language: JavaScript | License: MIT | Stargazers: 99 | Issues: 6 | Issues: 0

pytorch-pcen

PyTorch reimplementation of per-channel energy normalization for audio.

Language: Python | License: MIT | Stargazers: 92 | Issues: 3 | Issues: 4
Language: Python | License: MIT | Stargazers: 75 | Issues: 6 | Issues: 1

SpeechFormer

Official implementation of SpeechFormer, written in Python (PyTorch).

SASVC2022_Baseline

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

xR-EgoPose

New egocentric synthetic dataset for egocentric 3D human pose estimation

Language: Python | License: NOASSERTION | Stargazers: 57 | Issues: 12 | Issues: 16

DCA-PLDA

Discriminative Condition-Aware PLDA

Language: Python | License: NOASSERTION | Stargazers: 41 | Issues: 5 | Issues: 8

LASAFT-Net-v2

A PyTorch implementation: "LASAFT-Net-v2: Listen, Attend and Separate by Attentively aggregating Frequency Transformation"

Language: Python | License: MIT | Stargazers: 33 | Issues: 4 | Issues: 3

revisiting-spatial-temporal-layouts

Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021).

SASV_PR

Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"

Language: Python | License: MIT | Stargazers: 14 | Issues: 2 | Issues: 0

probabilistic_embeddings

This repository [contains / will contain] Python code associated with our Odyssey paper [put arXiv link here].

Language: Python | License: MIT | Stargazers: 8 | Issues: 5 | Issues: 0

X-MRS

Food image / recipe (text) cross-modal representation learning, retrieval and (image) synthesis. Code from ACM-Multimedia 2021 "Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning"

Language: Python | License: NOASSERTION | Stargazers: 7 | Issues: 2 | Issues: 2

PL-EESR

Code for the paper "PL-EESR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction"

Language: Python | Stargazers: 4 | Issues: 0 | Issues: 0