mrpep

Leonardo Pepino's starred repositories

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | Apache-2.0 | 30621 312 885

ffmpeg-python

Python bindings for FFmpeg - with complex filtering support

Language: Python | Apache-2.0 | 9641 114 697

demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Language: Python | MIT | 7883 150 530

pedalboard

🎛 🔊 A Python library for audio.

Language: C++ | GPL-3.0 | 4970 58 172

x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers

Language: Python | MIT | 4345 52 199

OpenPrompt

An Open-Source Framework for Prompt-Learning.

Language: Python | Apache-2.0 | 4231 42 254

textdistance

📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

Language: Python | MIT | 3326 650

promptsource

Toolkit for creating, sharing and using natural language prompts.

Language: Python | Apache-2.0 | 2577 31 162

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Language: Python | MIT | 1825 32 160

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Language: Python | MIT | 1685 27 211

omnizart

Omniscient Mozart, being able to transcribe everything in the music, including vocal, drum, chord, beat, instruments, and more.

Language: Python | MIT | 1593 25 75

ESC-50

ESC-50: Dataset for Environmental Sound Classification

Language: Python | NOASSERTION | 1302 31 11

RAVE

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Language: Python | NOASSERTION | 1244 41 165

performer-pytorch

An implementation of Performer, a linear attention-based transformer, in Pytorch

Language: Python | MIT | 1067 17 84

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook | BSD-3-Clause | 1056 18 131

cleanthesis

Clean Thesis is a clean, simple, and elegant LaTeX style (or template) for thesis documents.

Language: TeX | 895 15 112

auraloss

Collection of audio-focused loss functions in PyTorch

Language: Python | Apache-2.0 | 685 18 35

convit

Code for the Convolutional Vision Transformer (ConViT)

Language: Python | Apache-2.0 | 456 17 19

GST-Tacotron

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

Language: Python | MIT | 353 14 17

lyrebird-wav2clip

Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP

Language: Python | MIT | 318 11 13

soundata

Python library for downloading, loading & working with sound datasets

Language: Python | BSD-3-Clause | 289 10 75

hifigan-denoiser

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Language: Python | Apache-2.0 | 197 10 10

cargan

Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"

Language: Python | MIT | 182 22 14

Catch-A-Waveform

Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

Language: Python | NOASSERTION | 181 5 7

ltspice-guitar-pedals

A collection of LTSpice simulation files for popular guitar effects. :guitar: :electron: :musical_note: :chart_with_upwards_trend: Pull requests welcome :smiley:

Language: AGS Script | 105 90

PyTorch-Raspberry-Pi-64-OS

PyTorch installation wheels for Raspberry Pi 64 OS

100 4 7

simple-speaker-embedding

A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes.

Language: Jupyter Notebook | NOASSERTION | 79 2 6

spiceAmp

Non-realtime high realistic software guitar processor. Works with *.wav files as input and output. It uses ngspice for electric circuit simulation and FFT convolver with Impulse Response *.wav file for cabinet simulation.

Language: C++ | GPL-3.0 | 30 50

faseAlign

Command line tool for forced-alignment of Spanish speech data

Language: Python | MIT | 12 5 6

CLEAR-dataset-generation

Generation code for the CLEAR dataset (Compositional Language and Elementary Acoustic Reasoning)

Language: Python | NOASSERTION | 300