0417itsuki's repositories
VALL-E-X-Trainer-by-CustomData
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
JEN-1-pytorch
Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.04729)
PromptTTS2
[WIP] Unofficial Implementation of Microsoft's PromptTTS2
JEN-1-COMPOSER-pytorch
Unofficial implementation of JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation (https://arxiv.org/abs/2310.19180)
SpeechTokenizer_trainer
Trainer for SpeechTokenizer (https://arxiv.org/abs/2308.16692)
music_dataset_generator
This repo produces the style prompts needed for generating music with accompaniment, and transcribes lyrics.
Control-JBDiff
[WIP] ControlNet for Jukebox-diffusion
all-in-one
All-In-One Music Structure Analyzer
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
llark
Code for the paper "LLark: A Multimodal Foundation Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.
naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
soundstorm-speechtokenizer
Implementation of SoundStorm built upon SpeechTokenizer.
tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
vampnet
music generation with masked transformers!