karpathy / deep-vector-quantization

VQVAEs, GumbelSoftmaxes and friends


Missing 1x1 convolutions at the beginning of the decoder

CDitzel opened this issue

I believe at least one 1x1 convolution is missing. On p. 3 of the paper, the authors stress the crucial importance of these, but the only projection I could find here is the one prior to the bottleneck:

self.proj = nn.Conv2d(num_hiddens, n_embed, 1)  # 1x1 conv projecting encoder features into the embedding space
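For concreteness, here is a minimal sketch of what the issue is asking for: a mirror-image 1x1 convolution at the start of the decoder that maps the quantized embedding channels (n_embed) back to the hidden width (num_hiddens) before upsampling. The class name, layer stack, and output channel count are hypothetical and only illustrate where such a projection would sit; they are not the repo's actual decoder.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Hypothetical decoder sketch: a leading 1x1 conv projects quantized
    latents back to the hidden width before the upsampling stack."""
    def __init__(self, n_embed, num_hiddens):
        super().__init__()
        # the 1x1 conv the paper places at the beginning of the decoder
        self.proj_in = nn.Conv2d(n_embed, num_hiddens, 1)
        self.net = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(num_hiddens, num_hiddens // 2, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(num_hiddens // 2, 3, 4, stride=2, padding=1),  # 3-channel image output (assumed)
        )

    def forward(self, z_q):
        # z_q: (B, n_embed, H, W) quantized latents from the codebook
        return self.net(self.proj_in(z_q))
```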

As a side question: what is the reason that many autoencoder architectures do away completely with normalization layers in both the encoder and the decoder? I tried to research this question but couldn't find a proper answer. Also, do the size and complexity of the encoder and decoder directly relate to reconstruction quality? I have seen huge encoder/decoder structures that did not perform significantly better than the modest form you have in this repo, or Phil's simple architecture for that matter.