Beast code in Giters

taalua's repositories

SSL_Anti-spoofing

This repository includes the code to reproduce our paper "Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation".

MIT

voicesmith

[WIP] VoiceSmith makes training text to speech models easy.

Apache-2.0

Catch-A-Waveform

Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

NOASSERTION

FastSpeech2

Multi-Speaker Pytorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech :fist:

jax-variational-diffwave

Jax/Flax implementation of Variational-DiffWave.

MIT

taalua

Config files for my GitHub profile.

g2p

g2p: English Grapheme To Phoneme Conversion

Apache-2.0

Neural-HMM

Neural HMMs are all you need (for high-quality attention-free TTS)

MIT

normalizing-flows

PyTorch implementation of normalizing flow models

MIT

clpcnet

Pitch-shifting, time-stretching, and vocoding of speech with Controllable LPCNet (CLPCNet)

NOASSERTION

few-shot-transformer-tts

Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.

MIT

mir-svc

Unsupervised WaveNet-based Singing Voice Conversion Using Pitch Augmentation and Two-phase Approach

NOASSERTION

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

MIT

flowEQ

β-VAE for intelligent control of a five band parametric EQ

BSD-3-Clause

FG-transformer-TTS

Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.

MIT

WaveGrad

Implementation of Google Brain's WaveGrad vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Language: Jupyter Notebook; License: BSD-3-Clause

editts

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

NOASSERTION

MTL-Speaker-Embeddings

Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" presented at Interspeech 2021

MIT

inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

MIT

x-vector-pytorch

tt-vae-gan

Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.

voice_conversion

MIT

stereoEEG2speech

Code for a seq2seq architecture with Bahdanau attention designed to map stereotactic EEG data from human brains to spectrograms, using PyTorch Lightning.

ssqueezepy

Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python

MIT

flow_synthesizer

Universal audio synthesizer control learning with normalizing flows

MIT

wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models with PyTorch backend.

MIT

msaf

Music Structure Analysis Framework

MIT

MaskCycleGAN-VC

Implementation of Kaneko et al.'s MaskCycleGAN-VC model for non-parallel voice conversion.

MIT

mixture-of-experts

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

GPL-3.0

AudioStyleNet

This repository contains the code for my master's thesis on Emotion-Aware Facial Animation.
