NVlabs / FUNIT

Translate images to unseen domains at test time with few example images.

Home Page: https://nvlabs.github.io/FUNIT/


do we need to separate content and style image during training?

emergencyd opened this issue · comments

It seems that during training, the same dataloader is used to sample both the content and the style images. Say we have class A and class B: there is a chance that the content image and the style image both come from class A. Is that reasonable?

Also, how should "num_classes" in the discriminator be defined? Is it equal to the number of classes in the whole training set? What if I use different classes in the content dataloader and the style dataloader?

Re: there is a chance that we have a content image from A and a style image from A, too. Is that reasonable?

Yes, this is intended. Say the two images are two husky dogs with two different coat colors. This simply becomes a same-domain translation task. It is still a valid translation task, just a bit easier. That said, I believe there are cases where we would want to separate the content dataset and the style dataset.
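To see how often such same-domain pairs occur, here is a minimal sketch (with a made-up toy dataset, not the actual FUNIT dataloader) of drawing content and style examples independently from one shared pool, as a single dataloader would:

```python
import random

# Hypothetical toy dataset: (image_id, class_label) pairs over two classes.
dataset = [("img%02d" % i, "A" if i < 5 else "B") for i in range(10)]

def sample_pair(data):
    """Draw a content and a style example independently, as one shared
    dataloader would. The two may well come from the same class."""
    content = random.choice(data)
    style = random.choice(data)
    return content, style

random.seed(0)
pairs = [sample_pair(dataset) for _ in range(1000)]
same_class = sum(c[1] == s[1] for c, s in pairs)
print(f"{same_class}/1000 pairs are same-domain")
```

With two balanced classes, roughly half of the sampled pairs end up as same-domain translation tasks.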

Re: Also, how to define the "num_classes"

This should be the number of classes in the style datasets. This parameter is used by the discriminator.
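One way to picture why the discriminator needs this parameter: in FUNIT the discriminator outputs one real/fake score map per class, and only the channel matching the input image's class enters the GAN loss. A rough numpy sketch of that channel selection (the shapes and names here are illustrative assumptions, not the repo's actual tensors):

```python
import numpy as np

# Assumed sketch: the discriminator's final layer yields one real/fake
# score map per style class, shape (batch, num_classes, H, W).
num_classes = 3
batch, h, w = 2, 4, 4

rng = np.random.default_rng(0)
scores = rng.standard_normal((batch, num_classes, h, w))  # all class channels
labels = np.array([0, 2])  # class index of each input image

# Only the channel matching each image's class contributes to the loss.
per_image = scores[np.arange(batch), labels]  # shape (batch, h, w)
print(per_image.shape)
```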

We have a new implementation of this repo that we are actively maintaining. You are welcome to take a look:
https://github.com/NVlabs/imaginaire

_, xa_gan_feat = self.dis(xa, la)

But it seems that the content image is also an input to the discriminator. So I suppose num_classes should cover the union of the content and style classes. Otherwise, a content class and a style class would have to share the same classifier channel. Say the content class is A and the style class is B: if we set num_classes=1, the discriminator has only one output channel for both.
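Under that reading, the count would be taken over the union of the two class sets, e.g.:

```python
# Hypothetical class lists for separate content and style dataloaders.
content_classes = ["A"]
style_classes = ["B"]

# If content images are also fed to the discriminator (as in the quoted
# line `self.dis(xa, la)`), its output needs a channel for every class
# that can reach it, i.e. the union of both sets.
num_classes = len(set(content_classes) | set(style_classes))
print(num_classes)  # 2
```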
@mingyuliutw