zyser's repositories

eben

Repo for source code of EBEN: Extreme Bandwidth Extension Network

Language: Python | License: MIT | Stargazers: 1 | Issues: 0

golf

A DDSP-based neural vocoder.

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

annotated_deep_learning_paper_implementations

๐Ÿง‘โ€๐Ÿซ 59 Implementations/tutorials of deep learning papers with side-by-side notes ๐Ÿ“; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), ๐ŸŽฎ reinforcement learning (ppo, dqn), capsnet, distillation, ... ๐Ÿง 

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

basic-pitch

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

BigVGAN-NVIDIA

Official implementation of BigVGAN in PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

diffsptk

A differentiable version of SPTK

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

GeneFace

Official Pytorch Implementation of GeneFace (ICLR 2023)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

LAVISH

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Language: Python | Stargazers: 0 | Issues: 0

LlamaVoice

LlamaVoice is a llama-based large voice generation model, providing inference and training ability.

Language: Python | Stargazers: 0 | Issues: 0

log-wmse-audio-quality

logWMSE, an audio quality metric with support for digital silence target. Useful for evaluating audio source separation systems, even when there are many audio tracks or stems.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

MagVITS

VITS with phoneme-level prosody modeling based on MaskGIT

Language: Python | Stargazers: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Matcha-TTS

๐Ÿต Matcha-TTS: A fast TTS architecture with conditional flow matching

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

nano-llama31

nanoGPT style version of Llama 3.1

Language: Python | Stargazers: 0 | Issues: 0
Language: TeX | License: Apache-2.0 | Stargazers: 0 | Issues: 0

normalizing-flows

PyTorch implementation of normalizing flow models

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

PeriodWave

The official Implementation of PeriodWave and PeriodWave-Turbo

License: MIT | Stargazers: 0 | Issues: 0

pipecat_framework-for-voice-and-multimodal-conversational-AI

Open Source framework for voice and multimodal conversational AI

License: BSD-2-Clause | Stargazers: 0 | Issues: 0

podcast-summarizer

OpenAI Whisper + davinci for podcast summarization

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

polymath_music_separation

Convert any music library into a music production sample-library with ML

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

praat

Praat: Doing Phonetics By Computer

Language: C | Stargazers: 0 | Issues: 0
License: MIT | Stargazers: 0 | Issues: 0

taylor-series-linear-attention

Explorations into the recently proposed Taylor Series Linear Attention

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

TriAAN-VC

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

Stargazers: 0 | Issues: 0

xlstm_fork

Pytorch implementation of the xLSTM model by Beck et al. (2024)

Language: Python | Stargazers: 0 | Issues: 0

yt-dlp

A youtube-dl fork with additional features and fixes

Language: Python | License: Unlicense | Stargazers: 0 | Issues: 0