yzyouzhang

You Zhang's repositories

AIR-ASVspoof

Official implementation of the SPL paper "One-class Learning Towards Synthetic Voice Spoofing Detection"

Language: Python | License: MIT | 92 3 31

ASVspoof2021_AIR

Official implementation of our ASVspoof 2021 paper, "UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021"

Language: Python | License: MIT | 47 4 13

Audio_Research_in_US

For students who would like to apply for RA, PhD, or postdoc positions in audio research.

2000

hrtf_field

Official implementation of the ICASSP 2023 paper "HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields"

Language: Python | License: MIT | 20 20

SASV_PR

Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"

Language: Python | License: MIT | 13 20

Empirical-Channel-CM

Official implementation of our Interspeech 2021 paper "An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems"

Language: Python | License: MIT | 11 3 3

CS61Bsp18-proj2-byog

Project BYoG for UCB course CS61B Data Structures Spring 2018

Language: Java | 700

HBAS_chapter_voice3

Official implementation of the handbook chapter "Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation"

Language: Python | License: MIT | 400

HRTF_field_norm

Official implementation of our WASPAA 2023 paper "Mitigating Cross-Database Differences for Learning Unified HRTF Representation"

Language: Python | License: BSD-3-Clause | 100

awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

000

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0 | 000

INFO159-LHW4-Chatbot

A PyTorch chatbot for INFO159 Natural Language Processing

Language: Python | 020

PhaseAntispoofing_INTERSPEECH

Official repository of the Interspeech 2023 paper "Phase perturbation improves channel robustness for speech spoofing countermeasures"

Language: Python | License: MIT | 000

DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

License: MIT | 000

espnet

End-to-End Speech Processing Toolkit

License: Apache-2.0 | 000

FastDiff

PyTorch Implementation of FastDiff (IJCAI'22)

Language: Python | 000

flowtron

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer

Language: Jupyter Notebook | License: Apache-2.0 | 000

mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data

Language: Jupyter Notebook | License: BSD-3-Clause | 000

Online-Recurrent-Extreme-Learning-Machine

Online-Recurrent-Extreme-Learning-Machine (OR-ELM) for time-series prediction, implemented in Python

Language: Python | 010

samo

SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing

Language: Python | License: MIT | 000

serve

Serve, optimize and scale PyTorch models in production

License: Apache-2.0 | 000

SingFake

Official Repository for "SingFake: Singing Voice Deepfake Detection"

Language: JavaScript | License: MIT | 000

SpeechEmotionAVLearning

000

SpeechTasks

This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.

000

words_spoken_daily

Language: Python | License: BSD-3-Clause | 000

yzyouzhang.github.io

Language: JavaScript | License: MIT | 000