Victor Costa Beraldo's starred repositories
speech-emotion-recogntion-ser2022
Speech Emotion Recognition SE&R 2022
voice-unlocker
Detection and identification of users through speech, in the context of extracting audio signal features in the time, frequency, and time-frequency domains.
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
NeuralPlda
Implementation of Neural PLDA (NPLDA) model (A discriminative backend for Speaker Verification)
build-your-own-x
Master programming by recreating your favorite technologies from scratch.
torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
SpeakerEmbeddingLossComparison
Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP 2020
You-Only-Speak-Once
Deep Learning - one shot learning for speaker recognition using Filter Banks
PyTorch_Speaker_Verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
AM-SincNet
The Additive Margin SincNet (AM-SincNet) is an approach to speaker recognition based on the SincNet neural network architecture and the additive margin softmax (AM-Softmax) loss function. It keeps the SincNet architecture but replaces the output layer with an improved AM-Softmax layer.
Resemblyzer
A Python package to analyze and compare voices with deep learning
banknoteBrazil
Classification of Brazilian paper money.
repo2docker
Turn repositories into Jupyter-enabled Docker images
jupyterlab-vim
Vim notebook cell bindings for JupyterLab
awesome-python-scientific-audio
Curated list of Python software and packages related to scientific research in audio
VGG-Speaker-Recognition
Utterance-level Aggregation For Speaker Recognition In The Wild
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
machine-learning
Machine learning tutorials (mainly in Python 3)