fakufaku

Robin Scheibler's starred repositories

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | 32823 348 295

spotify-downloader

Download your Spotify playlists and songs along with album art and metadata (from YouTube if a match is found).

Language: Python | License: MIT | 15329 187 1430

mlx

MLX: An array framework for Apple silicon

Language: C++ | License: MIT | 15092 137 435

pytube

A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.

Language: Python | License: Unlicense | 10491 193 1277

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language: Python | License: MIT | 6400 53 196

FriendsDontLetFriends

Friends don't let friends make certain types of data visualization: what are they and why are they bad?

Language: R | License: MIT | 6119 102 7

marimo

A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.

Language: Python | License: Apache-2.0 | 4984 25 372

Resemblyzer

A Python package to analyze and compare voices with deep learning

Language: Python | License: Apache-2.0 | 2631 72 79

beartype

Unbearably fast near-real-time hybrid runtime-static type-checking in pure Python.

Language: Python | License: MIT | 2473 15 308

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Language: Python | License: MIT | 1654 27 211

ai-audio-startups

Community list of startups working with AI in audio and music technology

License: Apache-2.0 | 1470 65 5

fpdf2

Simple PDF generation for Python

Language: Python | License: LGPL-3.0 | 973 22 429

kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

Language: Jupyter Notebook | License: NOASSERTION | 778 30 103

melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)

Language: Python | License: BSD-3-Clause | 626 30 59

LanguageAgentTreeSearch

Official repository for ICML'24 paper "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models"

Language: Python | License: MIT | 506 9 18

survey

A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf

355 13 1

tagainijisho

A free Japanese dictionary and learning assistant

Language: C++ | License: GPL-3.0 | 350 30 216

ZeroSpeech

VQ-VAE for Acoustic Unit Discovery and Voice Conversion

Language: Python | 311 9 18

tacotron_pytorch

PyTorch implementation of Tacotron speech synthesis model.

Language: Jupyter Notebook | License: NOASSERTION | 305 16 21

PyTorch-Wavelet-Toolbox

Differentiable fast wavelet transforms in PyTorch with GPU support.

Language: Python | License: EUPL-1.2 | 252 7 22

whisper-finetuning

[WIP] Scripts for fine-tuning Whisper

Language: Python | License: MIT | 195 7 19

neural-audio-fp

Language: Python | License: MIT | 169 7 38

SpeechMOS

Easy-to-Use Speech MOS predictors

Language: Python | License: MIT | 167 7 11

music_mixing_style_transfer

Language: Python | License: MIT | 149 4 3

Python_Simulations

Various Python Simulations

Language: Jupyter Notebook | 99 30

ml-spatial-librispeech

A large synthetic dataset of spatial audio with multiple labels

License: NOASSERTION | 75 180

HCL

Language: Python | 35 3 3

DPMTSE

A Diffusion Probabilistic Model for Target Sound Extraction

Language: Python | 2500

WER-CER

Calculator Tool of Word Error Rate and Character Error Rate

Language: Python | License: MIT | 9 20

self-remixing

Official implementation of Self-Remixing

Language: Python | License: MIT | 900