amanteur

Amantur Amatov's starred repositories

whisper-medusa

Whisper with Medusa heads

Language: Python · MIT · 45500

RustPython

A Python Interpreter written in Rust

Language: Rust · MIT · 1843500

mira

MiRA (Music Replication Assessment) tool is a model-independent open evaluation method based on four diverse audio music similarity metrics to assess exact data replication of the training set.

Language: Python · AGPL-3.0 · 2000

stemgen

Examples for ICASSP2024 paper "StemGen: A music generation model that listens"

MIT · 3300

SepReformer

Official repository of SepReformer for speech separation

Language: Python · 5100

LibriSpace

Language: Python · NOASSERTION · 12500

awesome-music

Awesome Music Projects

179500

mamba.py

A simple and efficient Mamba implementation in pure PyTorch and MLX.

Language: Python · MIT · 84100

project-NN-Pytorch-scripts

see README

Language: Python · BSD-3-Clause · 31700

Dasheng

Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"

Language: Python · Apache-2.0 · 2200

CoMoSVC

CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone

Language: Python · MIT · 11700

DJCM

Language: Python · Apache-2.0 · 1700

polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Language: Rust · NOASSERTION · 2843200

bandit-v2

Reimplementation of Bandit for "Remastering Divide and Remaster: A Cinematic Audio Source Separation Dataset with Multilingual Support"

Language: Python · Apache-2.0 · 1400

CoverHunter

Official PyTorch implementation of CoverHunter

Language: Python · 2300

whisper-finetune

Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.

Language: Python · MIT · 20500

query-bandit

Banquet: A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems

Language: Jupyter Notebook · MIT · 2000

hearinganythinganywhere

Hearing Anything Anywhere Code Release

Language: Jupyter Notebook · 1900

encodecmae

Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'

Language: Python · 7800

SemantiCodec-inference

Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.

Language: Python · MIT · 9600

streamlit-audio-recorder

Record Audio from the User's Microphone in Apps that are Deployed to the Web. (via Browser Media-API, REACT-based, Streamlit Custom Component)

Language: TypeScript · MIT · 40100

ssamba

The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model

Language: Python · BSD-3-Clause · 8500

Audio-Mamba-AuM

Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"

Language: Python · 7200

soundata

Python library for downloading, loading & working with sound datasets

Language: Python · BSD-3-Clause · 30700

speechbrain

A PyTorch-based Speech Toolkit

Language: Python · Apache-2.0 · 835100

instruct-MusicGen

The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning".

Language: Python · Apache-2.0 · 5400

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Apache-2.0 · 152500

MusicGPT

Generate music based on natural language prompts using LLMs running locally

Language: Rust · MIT · 53300

images-that-sound

Official repo for Images that sound: a special spectrogram that can be seen as images and played as sound generated by diffusions

Language: Python · MIT · 20200

ThunderKittens

Tile primitives for speedy kernels

Language: Cuda · MIT · 143100