Guim3 / IcGAN

Invertible conditional GANs for image editing

I am trying to re-implement mnist generation

chengdazhi opened this issue · comments

Hi Guim, thank you for this enlightening model. I am trying to reproduce the MNIST generation figure in your paper, which fixes the latent code Z and modifies the label Y to generate different hand-written digits.

I can see that you haven't provided instructions in the README, nor could I find any pre-trained models. So right now I am using reconstructWithVariations.lua with slight modifications to support MNIST. Also, I am using the generator from epoch 25 and the encoders for both Z and Y from epoch 15, trained exactly as instructed in the README.

So could you please tell me if you used the code from other files? And if not, which pre-trained models did you use? Thanks a lot!

commented

Hi Dominic, thanks for using my code.
I don't have pre-trained MNIST models for the latest version of the model, but I can give you the instructions to train them yourself.
Just to make sure we are on the same page, have you tried to follow these instructions? By following them, you should get:

  1. A conditional generator of MNIST samples. That is, you can generate MNIST samples from scratch.
  2. An encoder Z to map a real MNIST image into a latent vector z.
  3. An encoder Y to map a real MNIST image into a conditional vector y.
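To make these pieces concrete, here is a minimal sketch of how the three networks compose at test time; the checkpoint paths, the input shape, and the {Z, Y} table convention for the generator are illustrative assumptions, not the repo's exact API:

-- Sketch only: paths, shapes, and the {Z, Y} input convention are assumptions.
require 'torch'
require 'nn'

local generator = torch.load('mnist_net_G.t7')   -- step 1: conditional generator
local encZ = torch.load('mnist_net_encZ.t7')     -- step 2: encoder Z
local encY = torch.load('mnist_net_encY.t7')     -- step 3: encoder Y

local x = torch.Tensor(1, 1, 32, 32):uniform(-1, 1) -- stand-in for one MNIST image
local z = encZ:forward(x)                    -- image -> latent code z
local y = encY:forward(x)                    -- image -> conditional vector y
local xRec = generator:forward{z, y}:clone() -- reconstruction of x (clone: forward reuses its output buffer)
local yNew = torch.zeros(1, 10):typeAs(y)    -- one-hot vector for a different digit
yNew[1][8] = 1
local xEdit = generator:forward{z, yNew}     -- same z, different label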

Once you have completed these 3 steps, you need to go to cfg/generateConfig.lua and, under "-- Common parameters for options 1 to 3", set decNet to the conditional generator path, encZnet to the encoder Z path, and encYnet to the encoder Y path, as shown in the sketch below. Then you should be able to generate images with variations with reconstructWithVariations.lua.
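For reference, the edited block in cfg/generateConfig.lua would look something like this, where the .t7 names are placeholders for whatever checkpoints your own training produced:

commonParameters = {
    decNet = 'checkpoints/mnist_net_G.t7',      -- placeholder: conditional generator path
    encZnet = 'checkpoints/mnist_net_encZ.t7',  -- placeholder: encoder Z path
    encYnet = 'checkpoints/mnist_net_encY.t7',  -- placeholder: encoder Y path
    -- leave the remaining parameters (loadSize, gpu, nz, ...) unchanged
}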

Let me know if you have any problems following these steps and I'll help you.

Thank you for your reply, Guim. That's exactly what I did, so gladly we are on the same page. However, the image produced by reconstructWithVariations.lua is very blurry, quite far from the one in your paper. Please note that I used the generator trained for 25 epochs and the encoders for both Z and Y trained for 15 epochs, which are the default settings.

Image reconstructed with variations:
[image: reconWIthVary]

commented

There's definitely something wrong with the MNIST encoder training.
I've been digging through my files and found an old MNIST encoder that gives good generation results.
It isn't the baseline setup with separate encoders (an encoder Z and an encoder Y), but a single encoder that outputs both Z and Y. It gives very similar results to the baseline encoders.
You can download it here.
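For context, a joint encoder of this kind can be built as a shared convolutional trunk with two output heads. The following nn sketch only illustrates the idea; the layer sizes are assumptions, not the downloaded model's actual architecture:

require 'nn'

-- Sketch of a joint encoder: shared convolutional trunk, two linear heads.
local nz, ny = 100, 10   -- latent length and number of MNIST classes
local encoder = nn.Sequential()
encoder:add(nn.SpatialConvolution(1, 32, 5, 5, 2, 2, 2, 2))  -- 1x32x32 -> 32x16x16
encoder:add(nn.ReLU(true))
encoder:add(nn.SpatialConvolution(32, 64, 5, 5, 2, 2, 2, 2)) -- 32x16x16 -> 64x8x8
encoder:add(nn.ReLU(true))
encoder:add(nn.Reshape(64 * 8 * 8))
local heads = nn.ConcatTable()
heads:add(nn.Linear(64 * 8 * 8, nz))  -- head 1: latent code Z
heads:add(nn.Linear(64 * 8 * 8, ny))  -- head 2: conditional vector Y
encoder:add(heads)
-- encoder:forward(x) returns a table {Z, Y}, which is why the modified
-- snippet below reads tmp[1] and tmp[2]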

Now, to use reconstructWithVariations.lua with this model, you need to change this part of the code:

-- Load nets
local generator = torch.load(opt.decNet)
local encZ = torch.load(opt.encZnet)
--local encY = torch.load(opt.encYnet)

-- Load to GPU
if opt.gpu > 0 then
    cudnn.convert(generator, cudnn)
    cudnn.convert(encZ, cudnn)
    --cudnn.convert(encY, cudnn)
    generator:cuda(); encZ:cuda(); --encY:cuda()
else
    generator:float(); encZ:float(); --encY:float()
end

generator:evaluate()
encZ:evaluate()
--encY:evaluate()

local inputX = torch.Tensor(opt.nImages, opt.loadSize[1], opt.loadSize[2], opt.loadSize[3])

-- Encode Z and Y from a given set of images
inputX = obtainImageSet(inputX, opt.loadPath, opt.loadOption, imgExtension)
if opt.gpu > 0 then inputX = inputX:cuda() end


-- the joint encoder returns a table: tmp[1] holds Z, tmp[2] holds Y
local tmp = encZ:forward(inputX)
--local Y = encY:forward(inputX)
local Z = tmp[1]
local Y = tmp[2]

And in generateConfig.lua update these parameters:

commonParameters = {
    decNet = 'c_mnist_-1_1_25_net_G.t7',          -- path to the generator network
    encZnet = 'encoder_c_mnist_-1_1_4epochs.t7',  -- path to the downloaded joint encoder
    encYnet = '',             -- path to encoder Y network (unused with the joint encoder)
    loadSize = {1, 32, 32},   -- image dimensions CxHxW used as input (output) by the encoders (generator)
    gpu = 1,                  -- gpu mode. 0 = CPU, 1 = GPU
    nz = 100,                 -- Z latent vector length
}
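With both changes in place, reproducing the paper's MNIST figure comes down to fixing Z and sweeping Y over the 10 digit labels. Here is a sketch of that last step, continuing from the modified snippet above; it assumes the generator accepts a {Z, Y} table and that Z needs no reshaping (depending on the architecture, Z may first have to be resized to batchSize x nz x 1 x 1):

-- Sketch: keep one image's Z fixed and sweep Y over all 10 one-hot labels.
local ny = 10
local fixedZ = Z[{{1}}]:repeatTensor(ny, 1)  -- repeat the first image's Z for each label
local sweepY = torch.eye(ny):typeAs(Z)       -- row i = one-hot label for class i
local variations = generator:forward{fixedZ, sweepY}
-- e.g. image.toDisplayTensor{input = variations, nrow = ny} (from the image
-- package) tiles the ten outputs into a single figure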

Let me know if this works for you. Also, I'd take a look at the encoder training to see why it is not working for MNIST. In any case, I'd encourage you to replicate the CelebA results instead of MNIST, since that training definitely works and the results are more interesting. The drawback is that CelebA requires much more computation time.

Thank you so much for your most patient reply! I will try your model right away. Thanks for your enlightening paper, too.

Hi Guim, just asking: have you fixed the bug in trainEncoder.lua for MNIST? I think IcGAN will be quite helpful for my work, and I really want to use MNIST to get some results quickly. If not, maybe we can fix the bug together.

Also, last time I made an incorrect modification to reconstructWithVariations.lua. The correct adaptation looks like this:

-- local inputX = torch.Tensor(opt.nImages, opt.loadSize[1], opt.loadSize[2], opt.loadSize[3])
-- the line above is wrong for MNIST; rewrite it as follows
local channels
if opt.dataset == 'mnist' then channels = 1 else channels = 3 end
local inputX = torch.Tensor(opt.nImages, channels, opt.loadSize[2], opt.loadSize[3])

Yet, with the correct implementation, the results still reveal a trivial solution: the latent information is completely ignored and only the label Y is taken into account by the generator. Please refer to the following:

In case the image can't be loaded: http://7xpg2f.com1.z0.glb.clouddn.com/encoder_disentangle.png
[image: generated]

This is the data generated by generateEncoderDataset.lua. Note that digits with the same label all look identical, indicating that the latent Z has been ignored.

In case the image can't be loaded: http://7xpg2f.com1.z0.glb.clouddn.com/gen_sample.jpeg
[image: gen_samples]

commented

Thanks for taking the effort to write these comments.
Unfortunately, I don't have much free time available, so I can't immediately help you find the bug. Maybe I'll be able to do so in a couple of weeks, just so you know.
In the meantime, I'd suggest you try to adapt the CelebA pipeline to your task, since something is clearly wrong with the MNIST one.
Sorry about that, and thanks for using my code!