João Felipe Santos's repositories
ataritools
Tools to convert text files from ASCII to ATASCII
jfsantos.github.io
My research blog
alias-free-torch
Simple torch.nn.Module implementation of Alias-Free-GAN style filter and resample
altium-projects
Altium PCBs for guitar effects pedals
audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
cargan
Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"
DaisyExamples
Examples for the Daisy Platform
DaisySP
A Powerful, Open Source DSP Library in C++
ddim
Denoising Diffusion Implicit Models
DeepAFx
Third-party audio effects plugins as differentiable layers within deep neural networks.
denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, both stationary and non-stationary, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
diffsptk
A differentiable version of SPTK
HiFiplusplus-pytorch
HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement
ltspice-guitar-pedals
A collection of LTSpice simulation files for popular guitar effects. Pull requests welcome.
lyrebird-wav2clip
Official implementation of the paper "Wav2CLIP: Learning Robust Audio Representations from CLIP"
NeMo
Neural Modules: a toolkit for conversational AI
NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
open_flamingo
An open-source framework for training large multimodal models
ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
phaseaug
Submitted to ICASSP 2023
sample-generator
Tools to train a generative model on arbitrary audio samples
state-spaces
Sequence Modeling with Structured State Spaces
terrarium-stand
A template repository for creating effects with the terrarium from PedalPCB.
univnet
Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
uxnds
NDS port of the uxn virtual machine
visqol
Perceptual Quality Estimator for speech and audio