A list of demo websites for automatic music generation research
- HiddenSinger (diffusion; hwang23arxiv): https://jisang93.github.io/hiddensinger-demo/
- Make-A-Voice (Transformer; huang23arxiv): https://make-a-voice.github.io/
- RMSSinger (diffusion; he23aclf): https://rmssinger.github.io/
- NaturalSpeech 2 (diffusion; shen23arxiv): https://speechresearch.github.io/naturalspeech2/
- NANSY++ (Transformer; choi23iclr): https://bald-lifeboat-9af.notion.site/Demo-Page-For-NANSY-67d92406f62b4630906282117c7f0c39
- UniSyn (; lei23aaai): https://leiyi420.github.io/UniSyn/
- VISinger 2 (zhang22arxiv): https://zhangyongmao.github.io/VISinger2/
- XiaoiceSing 2 (Transformer+GAN; wang22arxiv): https://wavelandspeech.github.io/xiaoice2/
- WeSinger 2 (Transformer+GAN; zhang22arxiv): https://zzw922cn.github.io/wesinger2/
- U-Singer (Transformer; kim22arxiv): https://u-singer.github.io/
- Singing-Tacotron (Transformer; wang22arxiv): https://hairuo55.github.io/SingingTacotron/
- KaraSinger (GRU/Transformer; liao22icassp): https://jerrygood0703.github.io/KaraSinger/
- VISinger (flow; zhang2): https://zhangyongmao.github.io/VISinger/
- MLP Singer (mixer blocks; tae21arxiv): https://github.com/neosapience/mlp-singer
- LiteSing (wavenet; zhuang21icassp): https://auzxb.github.io/LiteSing/
- DiffSinger (diffusion; liu22aaai)[no duration modeling]: https://diffsinger.github.io/
- HiFiSinger (Transformer; chen20arxiv): https://speechresearch.github.io/hifisinger/
- DeepSinger (Transformer; ren20kdd): https://speechresearch.github.io/deepsinger/
- xiaoice-multi-singer: https://jiewu-demo.github.io/INTERSPEECH2020/
- XiaoiceSing: https://xiaoicesing.github.io/
- ByteSing: https://bytesings.github.io/
- Mellotron: https://nv-adlr.github.io/Mellotron
- Lee's model (lee19arxiv): http://ksinging.mystrikingly.com/
- http://home.ustc.edu.cn/~yiyh/interspeech2019/
- CLIPSynth (diffusion; dong23cvprw): https://salu133445.github.io/clipsynth/
- CLIPSonic (diffusion; dong23waspaa): https://salu133445.github.io/clipsonic/
- MusicGen (Transformer; copet23arxiv): https://ai.honu.io/papers/musicgen/
- MuseCoco (Transformer; lu23arxiv): https://ai-muzic.github.io/musecoco/ (for symbolic music)
- MeLoDy (Transformer+diffusion; lam23arxiv): https://efficient-melody.github.io/
- SoundStorm (Transformer; borsos23arxiv): https://google-research.github.io/seanet/soundstorm/examples/ (for general sounds)
- MusicLM (Transformer; agostinelli23arxiv): https://google-research.github.io/seanet/musiclm/examples/
- VALL-E (Transformer; wang23arxiv): https://www.microsoft.com/en-us/research/project/vall-e/ (for speech)
- multi-source-diffusion-models (diffusion; 23arxiv): https://gladia-research-group.github.io/multi-source-diffusion-models/
- Noise2Music (diffusion; huang23arxiv): https://noise2music.github.io/
- ERNIE-Music (diffusion; zhu23arxiv): N/A
- Riffusion (diffusion): https://www.riffusion.com/
- Make-An-Audio (diffusion; huang23arxiv): https://text-to-audio.github.io/ (for general sounds)
- AudioLDM (diffusion; liu23arxiv): https://audioldm.github.io/ (for general sounds)
- AudioLM (Transformer; borsos22arxiv): https://google-research.github.io/seanet/audiolm/examples/ (for general sounds)
- VampNet (Transformer; garcia23ismir): https://hugo-does-things.notion.site/VampNet-Music-Generation-via-Masked-Acoustic-Token-Modeling-e37aabd0d5f1493aa42c5711d0764b33
- Fast Jukebox (Jukebox + knowledge distillation; pezzat-morales23mdpi): https://soundcloud.com/michel-pezzat-615988723
- DAG (diffusion; pascual23icassp): https://diffusionaudiosynthesis.github.io/
- Musika! (GAN; pasini22ismir): https://huggingface.co/spaces/marcop/musika
- Jukebox (VQVAE+Transformer; dhariwal20arxiv): https://openai.com/blog/jukebox/
- UNAGAN (GAN; liu20arxiv): https://github.com/ciaua/unagan
- dadabots (sampleRNN; carr18mume): http://dadabots.com/music.php
- SingSong (VQVAE+Transformer; donahue23arxiv): https://storage.googleapis.com/sing-song/index.html
- JukeDrummer (VQVAE+Transformer; wu22ismir): https://legoodmanner.github.io/jukedrummer-demo/
- SoftVC VITS (): https://github.com/svc-develop-team/so-vits-svc
- Assem-VC (; kim21nipsw): https://mindslab-ai.github.io/assem-vc/singer/
- iZotope-SVC (conv-encoder/decoder; nercessian20ismir): https://sites.google.com/izotope.com/ismir2020-audio-demo
- VAW-GAN (GAN; lu20arxiv): https://kunzhou9646.github.io/singvaw-gan/
- polyak20interspeech (GAN; polyak20interspeech): https://singing-conversion.github.io/
- SINGAN (GAN; sisman19apsipa): N/A
- [MSVC-GAN] (GAN): https://hujinsen.github.io/
- https://mtg.github.io/singing-synthesis-demos/voice-cloning/
- https://enk100.github.io/Unsupervised_Singing_Voice_Conversion/
- Yong&Nam (DSP; yong18icassp): https://seyong92.github.io/singing-expression-transfer/
- cybegan (CNN+GAN; wu18faim): http://mirlab.org/users/haley.wu/cybegan/
- AlignSTS (encoder/adaptor/aligner/diff-decoder; li23facl): https://alignsts.github.io/
- speech2sing2 (GAN; wu20interspeech): https://ericwudayi.github.io/Speech2Singing-DEMO/
- speech2sing (encoder/decoder; parekh20icassp): https://jayneelparekh.github.io/icassp20/
- deep-autotuner (CGRU; wagner19icassp): http://homes.sice.indiana.edu/scwager/deepautotuner.html
- VQ-VAE (VQ-VAE; cifka21icassp): https://adasp.telecom-paris.fr/rc/demos_companion-pages/cifka-ss-vq-vae/
- MelGAN-VC (GAN; pasini19arxiv): https://www.youtube.com/watch?v=3BN577LK62Y&feature=youtu.be
- RaGAN (GAN; lu19aaai): https://github.com/ChienYuLu/Play-As-You-Like-Timbre-Enhanced-Multi-modal-Music-Style-Transfer
- TimbreTron (GAN; huang19iclr): https://www.cs.toronto.edu/~huang/TimbreTron/samples_page.html
- string2woodwind (DSP; wagner17icassp): http://homes.sice.indiana.edu/scwager/css.html
- VITS (Transformer+flow+GAN; kim21icml): https://github.com/jaywalnut310/vits
- GOLF (DDSP; yu23ismir): https://yoyololicon.github.io/golf-demo/
- BigVGAN (GAN; lee23iclr): https://bigvgan-demo.github.io/
- SawSing (DDSP; wu22ismir): https://ddspvocoder.github.io/ismir-demo/
- Multi-Singer (wavenet; huang21mm): https://multi-singer.github.io/
- SingGAN (GAN; chen21arxiv): https://singgan.github.io/
- DiffWave (diffusion; kong21iclr): https://diffwave-demo.github.io/
- MelGAN (GAN; kumar19neurips): https://melgan-neurips.github.io/
- Improved RVQGAN (VQ; kumar23arxiv): https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5
- HiFi-Codec (VQ; yang23arxiv): https://github.com/yangdongchao/AcademiCodec
- EnCodec (VQ; défossez22arxiv): https://github.com/facebookresearch/encodec
- SoundStream (VQ; zeghidour21arxiv): https://google-research.github.io/seanet/soundstream/examples/