Jing-Xuan Zhang (jxzhanggg)

Company: Shaanxi Normal University

Location: Xi'an

Jing-Xuan Zhang's starred repositories

ctc_segmentation

Segment a given audio into utterances using a trained end-to-end ASR model.

Language: Python | License: Apache-2.0 | Stargazers: 73 | Issues: 0

AV-RelScore

Audio-visual corruption modeling from our CVPR 2023 paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring"

Language: Python | Stargazers: 28 | Issues: 0

Visual_Speech_Recognition_for_Multiple_Languages

Visual Speech Recognition for Multiple Languages

Language: Python | License: NOASSERTION | Stargazers: 311 | Issues: 0

kenlm

KenLM: Faster and Smaller Language Model Queries

Language: C++ | License: NOASSERTION | Stargazers: 2459 | Issues: 0

Semi-supervised-learning

A Unified Semi-Supervised Learning Codebase (NeurIPS'22)

Language: Python | License: MIT | Stargazers: 1281 | Issues: 0

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | Stargazers: 10721 | Issues: 0

av_hubert

A self-supervised learning framework for audio-visual speech

Language: Python | License: NOASSERTION | Stargazers: 819 | Issues: 0

hydra

Hydra is a framework for elegantly configuring complex applications

Language: Python | License: MIT | Stargazers: 8469 | Issues: 0

nonparaSeq2seqVC_code

Implementation code of non-parallel sequence-to-sequence VC

Language: Python | License: MIT | Stargazers: 247 | Issues: 0

beaqlejs

BeaqleJS provides a framework for creating browser-based listening tests and is built entirely on open web standards such as HTML5 and JavaScript.

Language: JavaScript | License: GPL-3.0 | Stargazers: 86 | Issues: 0

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | License: NOASSERTION | Stargazers: 51705 | Issues: 0

Lip2Wav

This repository contains the code for our CVPR 2020 paper "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"

Language: Python | License: MIT | Stargazers: 692 | Issues: 0

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch

Language: Jupyter Notebook | License: MIT | Stargazers: 1530 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8175 | Issues: 0

ultrasuite-tools

Tools to process the UltraSuite data

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 11 | Issues: 0

LipNet-PyTorch

A state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599)

Language: Python | Stargazers: 206 | Issues: 0

cluster-scripts

A collection of useful scripts, templates, and examples for clusters using SLURM (https://slurm.schedmd.com/)

Language: Shell | Stargazers: 96 | Issues: 0

955.WLB

955: a list of companies that do not require overtime. Working 955 (9 a.m. to 5 p.m., 5 days a week), work-life balance

Stargazers: 34484 | Issues: 0

996.ICU

Repo for counting stars and contributing. Press F to pay respect to glorious developers.

License: NOASSERTION | Stargazers: 269558 | Issues: 0