Pan Zexu (zexupan)

Company: National University of Singapore

Location: Singapore

Pan Zexu's starred repositories

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 32235 | Issues: 273 | Issues: 1067

TTS

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Language: Jupyter Notebook | License: MPL-2.0 | Stargazers: 9079 | Issues: 186 | Issues: 559

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8162 | Issues: 179 | Issues: 2335

Conference-Acceptance-Rate

Acceptance rates for the major AI conferences

Language: Jupyter Notebook | License: MIT | Stargazers: 4016 | Issues: 126 | Issues: 28

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Language: Python | License: MIT | Stargazers: 1715 | Issues: 27 | Issues: 211

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

Language: Jupyter Notebook | License: MIT | Stargazers: 1529 | Issues: 45 | Issues: 252

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 1077 | Issues: 18 | Issues: 131

Awesome-CLIP

Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).

FastSpeech

The Implementation of FastSpeech based on pytorch.

Language: Python | License: MIT | Stargazers: 849 | Issues: 34 | Issues: 96

awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

Contrastive-Predictive-Coding-PyTorch

Contrastive Predictive Coding for Automatic Speaker Verification

Language: Python | License: MIT | Stargazers: 474 | Issues: 5 | Issues: 21

nara_wpe

Different implementations of "Weighted Prediction Error" for speech dereverberation

Language: Python | License: MIT | Stargazers: 469 | Issues: 18 | Issues: 37

pystoi

Python implementation of the Short Term Objective Intelligibility measure

Language: MATLAB | License: MIT | Stargazers: 316 | Issues: 12 | Issues: 19

PaSST

Efficient Training of Audio Transformers with Patchout

Language: Python | License: Apache-2.0 | Stargazers: 288 | Issues: 4 | Issues: 46

TalkNet-ASD

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Language: Python | License: MIT | Stargazers: 282 | Issues: 7 | Issues: 66

Waveformer

A deep neural network architecture for low-latency audio processing

Language: Python | License: MIT | Stargazers: 277 | Issues: 6 | Issues: 4

dscore

Diarization scoring tools.

Language: Python | License: BSD-2-Clause | Stargazers: 205 | Issues: 8 | Issues: 4

speaker_extraction

target speaker extraction and verification for multi-talker speech

Language: Python | License: GPL-3.0 | Stargazers: 153 | Issues: 8 | Issues: 5

youtube-gesture-dataset

This repository contains scripts to build Youtube Gesture Dataset.

Language: Python | License: BSD-3-Clause | Stargazers: 112 | Issues: 4 | Issues: 9

cocktail-fork-separation

Baseline multi-resolution cross network model trained using the Divide and Remaster Dataset

Language: Python | License: MIT | Stargazers: 71 | Issues: 4 | Issues: 2

FlatTrajectoryDistillation_FTD

The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023)

Language: Python | Stargazers: 17 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 4 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 3 | Issues: 1 | Issues: 0

EE4208ComputerVision

Face Detection

Language: Python | Stargazers: 2 | Issues: 0 | Issues: 0