Beast code in Giters

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Jupyter Notebook | License: MIT | 7384 78 189

xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Language: Python | License: NOASSERTION | 1029 35 62

pykaldi

A Python wrapper for Kaldi

Language: Python | License: Apache-2.0 | 999 42 277

Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Language: Python | License: MIT | 829 31 79

Summary2023

A roundup of selected open-source projects from 2023, organized by category.

418 3 1

soft-vc

Soft speech units for voice conversion

Language: Jupyter Notebook | License: MIT | 410 12 14

SSL_Anti-spoofing

This repository includes the code to reproduce our paper "Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation".

Language: Python | License: MIT | 105 5 5

Synthetic-Voice-Detection-Vocoder-Artifacts

This repository hosts the dataset and detection code from our paper "AI-Synthesized Voice Detection Using Neural Vocoder Artifacts", accepted at the CVPR Workshop on Media Forensics 2023.

Language: Python | License: MIT | 94 8 14

tf_multispeakerTTS_fc

The TensorFlow version of multi-speaker TTS training with a feedback constraint.

Language: Python | License: MIT | 40 3 5

hpo_nmt

Datasets for Hyperparameter Optimization of Neural Machine Translation

Language: Python | License: MIT | 9 3 1

speechbrain_PartialFake

Language: Python | License: MIT | 7 10

gbopt

Graph-based optimization.

Language: Python | License: MIT | 2 20

slr_handshape

Handshape-aware sign language recognition.

Language: Python | License: MIT | 2 10