smkim7-kr / latent-is-all-you-need

Image, text, sound... they are just latent vectors, aren't they?

latent-is-all-you-need

Inspired by @hwalsuklee's GitHub repository and video, this repository investigates and reimplements VAE and CVAE with different latent space sizes.
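
As a rough illustration of what "different latent space sizes" means in practice, here is a minimal sketch of a fully connected VAE whose bottleneck width is a constructor argument. The layer sizes, the names `VAE` and `vae_loss`, and the choice of a binary cross-entropy reconstruction term are assumptions made for this example; they are not taken from the notebooks in this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VAE(nn.Module):
    """Minimal fully connected VAE for flattened 28x28 images (hypothetical sizes)."""

    def __init__(self, latent_dim=2, hidden_dim=400, input_dim=784):
        super().__init__()
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(hidden_dim, latent_dim)   # log-variance of q(z|x)
        self.dec1 = nn.Linear(latent_dim, hidden_dim)
        self.dec2 = nn.Linear(hidden_dim, input_dim)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps the sampling step differentiable
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def decode(self, z):
        return torch.sigmoid(self.dec2(F.relu(self.dec1(z))))

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar


def vae_loss(recon, x, mu, logvar):
    """Negative ELBO: reconstruction term plus KL(q(z|x) || N(0, I))."""
    bce = F.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld


if __name__ == "__main__":
    x = torch.rand(8, 784)  # stand-in batch with pixel values in [0, 1]
    for latent_dim in (2, 5, 10, 20):
        model = VAE(latent_dim=latent_dim)
        recon, mu, logvar = model(x)
        print(latent_dim, tuple(mu.shape), vae_loss(recon, x, mu, logvar).item())
```

Only `latent_dim` changes between runs, so differences in reconstruction quality can be attributed to the size of the bottleneck.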

Dataset

  • MNIST
  • CIFAR10 (#TODO)

Framework

  • PyTorch + wandb

Results

Model | Input | latent_dim=2 | latent_dim=5 | latent_dim=10 | latent_dim=20
VAE | input | 2d | 5d | 10d | 20d
CVAE | input | 2d | 5d | 10d | 20d

Model | Manifold / Conditional Generation | Distribution
VAE | vaewalk | manifold
CVAE | cond_style | manifold

Video | VAE | CVAE
Manifold | manifold | manifold

The results below show CVAE outputs for a fixed latent vector z under different conditions (digit labels 0-9 for MNIST).

vaewalk_0 vaewalk_1
vaewalk_2 vaewalk_3
vaewalk_4 vaewalk_5
vaewalk_6 vaewalk_7
vaewalk_8 vaewalk_9
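
Generating with a fixed z and a varying condition can be sketched as follows. This is an assumption-laden example, not the repository's code: `CVAEDecoder` and `generate_all_digits` are hypothetical names, the label is assumed to be injected as a one-hot vector concatenated to z, and the decoder here is untrained, so it only demonstrates the mechanics.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CVAEDecoder(nn.Module):
    """Decoder that conditions on a class label via one-hot concatenation (assumed design)."""

    def __init__(self, latent_dim=2, num_classes=10, hidden_dim=400, output_dim=784):
        super().__init__()
        self.num_classes = num_classes
        self.fc1 = nn.Linear(latent_dim + num_classes, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, z, labels):
        # labels: integer class ids with shape (batch,)
        y = F.one_hot(labels, num_classes=self.num_classes).float()
        h = F.relu(self.fc1(torch.cat([z, y], dim=1)))
        return torch.sigmoid(self.fc2(h))


@torch.no_grad()
def generate_all_digits(decoder, z):
    """Decode one latent vector z of shape (latent_dim,) once per class label.

    Returns a tensor of shape (num_classes, 28, 28); assumes output_dim == 784.
    """
    labels = torch.arange(decoder.num_classes)
    z_batch = z.unsqueeze(0).expand(decoder.num_classes, -1)  # same z in every row
    return decoder(z_batch, labels).view(-1, 28, 28)


if __name__ == "__main__":
    decoder = CVAEDecoder(latent_dim=2)
    z = torch.randn(2)                      # one fixed latent vector
    images = generate_all_digits(decoder, z)
    print(images.shape)                     # one 28x28 image per digit label
```

With a trained decoder, each row would show a different digit, and z would control the attributes shared across them, such as stroke style.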

Languages

Language:Jupyter Notebook 100.0%