Zexin Cai (caizexin)

Company: Duke University

Home Page: https://caizexin.github.io

Zexin Cai's starred repositories

DeepFaceLab

DeepFaceLab is the leading software for creating deepfakes.

Language: Python | License: GPL-3.0 | Stargazers: 46635 | Issues: 1134 | Issues: 1340

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 36000 | Issues: 331 | Issues: 441

al-folio

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: HTML | License: MIT | Stargazers: 11124 | Issues: 23 | Issues: 571

TTS

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Language: Jupyter Notebook | License: MPL-2.0 | Stargazers: 9379 | Issues: 186 | Issues: 565

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8883 | Issues: 135 | Issues: 1100

VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Language: Python | License: MIT | Stargazers: 7658 | Issues: 82 | Issues: 153

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Jupyter Notebook | License: MIT | Stargazers: 7384 | Issues: 78 | Issues: 189

xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Language: Python | License: NOASSERTION | Stargazers: 1029 | Issues: 35 | Issues: 62

pykaldi

A Python wrapper for Kaldi

Language: Python | License: Apache-2.0 | Stargazers: 999 | Issues: 42 | Issues: 277

Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Language: Python | License: MIT | Stargazers: 829 | Issues: 31 | Issues: 79

Summary2023

A roundup of selected open-source projects from 2023, organized by category

soft-vc

Soft speech units for voice conversion

Language: Jupyter Notebook | License: MIT | Stargazers: 410 | Issues: 12 | Issues: 14

SSL_Anti-spoofing

This repository includes the code to reproduce our paper "Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation".

Language: Python | License: MIT | Stargazers: 105 | Issues: 5 | Issues: 5

Synthetic-Voice-Detection-Vocoder-Artifacts

This repository provides the dataset and detection code from the paper "AI-Synthesized Voice Detection Using Neural Vocoder Artifacts", accepted at the CVPR Workshop on Media Forensics 2023.

Language: Python | License: MIT | Stargazers: 94 | Issues: 8 | Issues: 14

tf_multispeakerTTS_fc

The TensorFlow version of multi-speaker TTS training with feedback constraint

Language: Python | License: MIT | Stargazers: 40 | Issues: 3 | Issues: 5

hpo_nmt

Datasets for Hyperparameter Optimization of Neural Machine Translation

Language: Python | License: MIT | Stargazers: 9 | Issues: 3 | Issues: 1
Language: Python | License: MIT | Stargazers: 7 | Issues: 1 | Issues: 0

gbopt

Graph-based optimization.

Language: Python | License: MIT | Stargazers: 2 | Issues: 2 | Issues: 0

slr_handshape

Handshape-aware sign language recognition.

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0