mrpep

Leonardo Pepino's starred repositories

VisionMamba

Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It is 2.8x faster than DeiT and saves 86.8% GPU memory when performing batch inference to extract features on high-resolution images.

Language: Python | License: MIT | 29200

qedr

Quantitative evaluation of disentangled representations

Language: Jupyter Notebook | License: MIT | 5900

conformal-predictions-from-scratch

Various conformal prediction methods implemented from scratch in pure NumPy for educational purposes.

Language: Jupyter Notebook | 17300

frechet-audio-distance

A lightweight library for Frechet Audio Distance calculation.

Language: Python | License: MIT | 21200

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | 1130400

transformer-contributions

Measuring the Mixing of Contextual Information in the Transformer

Language: Jupyter Notebook | License: Apache-2.0 | 2300

Fermat-distance

We propose a density-based estimator for weighted geodesic distances, suitable for data that lie on a manifold of lower dimension than the ambient space and are sampled from a possibly nonuniform distribution.

Language: Python | 1400

vulkan

The ultimate Python binding for the Vulkan API

Language: C++ | License: Apache-2.0 | 48900

wir2wav

A simple tool for converting .wir impulse response files into standard PCM .wav files

Language: Python | License: MIT | 3800

plla-tisvs

Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation

Language: Python | License: MIT | 2100

Lyrics-to-Audio-Alignment

Aligns text (lyrics) with monophonic singing voice (audio). The algorithm first uses structural segmentation to split the audio into structural sections, then uses hidden Markov models to align the lyrics within each section. The final alignment for each song is the concatenation of the lyric time stamps from all of its sections.

Language: Python | 8500

msaf

Music Structure Analysis Framework

Language: Python | License: MIT | 47700

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | 1168800

IPET

PyTorch implementation of "Integrated Parameter-Efficient Tuning for General-Purpose Audio Models"

Language: Python | License: MIT | 1000

m2d

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Language: Jupyter Notebook | License: NOASSERTION | 5300

TorchPQ

Approximate nearest neighbor search with product quantization on GPU, in PyTorch and CUDA

Language: Cuda | License: MIT | 20400

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | 3825300

libri-light

Dataset for lightly supervised training using the LibriVox audiobook recordings (https://librivox.org/).

Language: Python | License: MIT | 45900

layerwise-analysis

Layer-wise analysis of self-supervised pre-trained speech representations

Language: Python | 8200

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | License: MIT | 328500

xmanager

A platform for managing machine learning experiments

Language: Python | License: Apache-2.0 | 80300

ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

Language: Python | License: BSD-3-Clause | 35100

rockstar

The Rockstar programming language specification

Language: JavaScript | License: MIT | 686800

pytorch-dann

A PyTorch implementation for Unsupervised Domain Adaptation by Backpropagation

Language: Jupyter Notebook | License: MIT | 14300

listening-test

An open-source platform for browser-based subjective quality tests of speech and audio.

Language: TypeScript | License: MIT | 3000

OFA

Official repository of OFA (ICML 2022). Paper: "OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework"

Language: Python | License: Apache-2.0 | 236500

google-research

Google Research

Language: Jupyter Notebook | License: Apache-2.0 | 3328800

FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Language: Python | License: MIT | 51500

NISQA

NISQA: Non-Intrusive Speech Quality and TTS Naturalness Assessment

Language: Python | License: MIT | 61100

rotary-embedding-torch

Implementation of rotary embeddings, from the RoFormer paper, in PyTorch

Language: Python | License: MIT | 45300