gdikov / vae-playground

Experiments with Variational Autoencoders

Private latent spaces are not used

gdikov opened this issue

@FHainzl @Elli1993 The private latent spaces don't exhibit any structure, even after I fixed the dataset as discussed. In some cases even the shared space looks untrained.

Possible reasons are:

  1. datasets are inadequate
  2. plots are wrong
  3. models are wrong
  4. optimisation hyper-parameters are inadequate
    ...
  5. the concept is wrong

Please do take a look at the possible points of failure and share your thoughts/ideas.

I just trained on MNIST variations and the model is able to overfit a small subset. There are clearly visible clusters in both the private and shared latent spaces; however, the digits are all mixed up, which is a bit suspicious. Also, the reconstructions are rather plausible images that do not match the target images. There could be an issue with the plots and/or with shuffling during iteration, which shouldn't happen.
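To rule out an ordering mismatch between inputs and reconstructions, here is a minimal sketch that draws targets and reconstructions side by side in a fixed order. `model.reconstruct` and the `(x, y)` arrays are hypothetical placeholders for whatever the code actually exposes:

```python
import matplotlib.pyplot as plt

# Hypothetical check for an input/reconstruction ordering mismatch.
# `x`, `y` are a small, *unshuffled* batch of images and labels, and
# `model.reconstruct` stands in for the actual reconstruction call.
x_rec = model.reconstruct(x)  # must preserve the order of `x`

n = min(8, len(x))
fig, axes = plt.subplots(2, n, figsize=(2 * n, 4))
for i in range(n):
    axes[0, i].imshow(x[i].reshape(28, 28), cmap='gray')      # target
    axes[1, i].imshow(x_rec[i].reshape(28, 28), cmap='gray')  # reconstruction
    axes[0, i].set_title(str(y[i]))
    axes[0, i].axis('off')
    axes[1, i].axis('off')
plt.show()
```

If the reconstruction of sample i clearly belongs to a different target, the model is fine and the pairing/shuffling is broken.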

@FHainzl @Elli1993 please have a look at doc/results/conjoint_vae to see what training on plain MNIST and on MNIST with background images looks like. The latent spaces do not place digits with different labels in different regions, which of course hinders reconstruction. Nevertheless, the images look plausible after only ~160 epochs of training.

So the issue now is that the latent spaces are used, but not in the desired way. Please play around, train on small subsets to overfit the data, and explore what might be causing these effects.
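For the overfitting runs, something along these lines should do. The fixed, seeded subset is the important part; `ConjointVAE` and its `fit` signature are only placeholders for the actual model class:

```python
import numpy as np

# Carve out a small, fixed subset so the model can memorise it.
# Seeding makes the subset reproducible across runs.
rng = np.random.RandomState(0)
idx = rng.choice(len(x_train), size=64, replace=False)
x_small, y_small = x_train[idx], y_train[idx]

# `ConjointVAE` is a hypothetical stand-in for the repo's model class.
model = ConjointVAE(latent_dim_private=2, latent_dim_shared=2)
model.fit(x_small, epochs=1000, batch_size=64, shuffle=False)
```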

Hello,
I just checked, and after my overfitting run (1000 epochs) it looks even worse: the reconstruction is always zero, for every sample. The data samples themselves look good, though. I'll check whether I can find the error somewhere in there...

@Elli1993 Ok, thanks. Please plot the loss curves and check whether anything strange happens. Also check what input is flowing into the loss layer and whether its output is correct. Generating good samples probably means that the latent factors capture the data variability well enough, and since data generation is okay, the KL divergence loss is probably low. However, not being able to reproduce the digit should be penalised by the reconstruction loss, so please check that one too. Another very plausible scenario is that the data is being (re)constructed according to criteria other than digit shape. Maybe the background dominates the reconstruction loss, so that the shape of the central digit contributes too little and a good background is preferred. Could you check this hypothesis too?
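One rough way to test the background hypothesis, assuming you have the plain-MNIST counterparts of the background-variation images: derive an approximate digit mask and split the per-pixel reconstruction error by region. All array names here are assumptions:

```python
import numpy as np

def split_reconstruction_error(x_target, x_rec, x_plain, thresh=0.2):
    """Split per-pixel squared error into digit vs. background regions.

    `x_plain` is the plain (black background) MNIST counterpart of
    `x_target`, used only to derive an approximate digit mask.
    """
    mask = (x_plain > thresh).astype(np.float32)  # 1 on digit pixels
    err = (x_target - x_rec) ** 2
    digit_err = (err * mask).sum() / mask.sum()
    bg_err = (err * (1.0 - mask)).sum() / (1.0 - mask).sum()
    return digit_err, bg_err

# If `bg_err` accounts for most of the total while `digit_err` barely
# improves during training, the loss is mostly rewarding the background.
```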
Lastly, you can train on standard (black background) MNIST for both autoencoders and check whether the results are as expected: shared latent space fully utilised, good reconstructions, and so on.

If these are too many things to do, maybe @FHainzl can help you. In the meantime I am implementing the conjoint AVB and a customised MNIST variations dataset, so that we have labels for the background texture and can plot the latent space according to them.

Have a nice bug hunt!

@Elli1993 @FHainzl I think I fixed it. Please rebase on the master branch and start a new training run. I will train on MNIST with a single dataset per encoder (horizontal vs. vertical), but you can try the two-datasets-per-encoder case, for the sake of diversity. In my experiment we expect to see a non-clustered isotropic Gaussian for all classes (colour-coded by digit label) in the private spaces, and 10 regions given by the 10 digit labels in the shared space. In yours there should be two clusters in each private space, which should look homogeneous when coloured by digit label and separated when coloured by pattern tag. The shared space should again exhibit 10 different clusters for the 10 digit labels. Reconstruction and data generation should be close to perfect if trained long enough.
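For the latent-space checks, a plotting sketch along these lines should make the expected structure easy to eyeball. `model.encode` returning `(z_private, z_shared)` and the `digit_labels` / `pattern_tags` arrays are assumptions about the interface:

```python
import matplotlib.pyplot as plt

# Hypothetical interface: encoder returns 2-D private and shared codes.
z_priv, z_shared = model.encode(x_test)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
axes[0].scatter(z_priv[:, 0], z_priv[:, 1], c=digit_labels, cmap='tab10', s=4)
axes[0].set_title('private | digit labels (expect: homogeneous)')
axes[1].scatter(z_priv[:, 0], z_priv[:, 1], c=pattern_tags, cmap='coolwarm', s=4)
axes[1].set_title('private | pattern tags (expect: separated)')
axes[2].scatter(z_shared[:, 0], z_shared[:, 1], c=digit_labels, cmap='tab10', s=4)
axes[2].set_title('shared | digit labels (expect: 10 clusters)')
plt.show()
```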

I will also train AVB on the same data as in my first experiment to verify its correctness; then you can take over and train it with the 4 datasets (2 per autoencoder).

Please check out the project kanban board for further tasks to do and feel free to add some if you think that I have missed anything.