Degenerate issue when conditioned by an image

Question

BoltenWang-Meta opened this issue 8 months ago

Thanks for the impressive work and code. After checking the inference code for the case conditioned on an image, it looks like the whole generation degenerates into a MAGE conditioned on an SSL representation. Is my understanding correct? If yes, does that mean the RDM is not used during inference?

Tianhong Li · Answer 1 · Wed Jan 31 2024 21:13:13 GMT+0800 (China Standard Time)

Thanks for your interest! For Figures 6 and 7 in the paper (which correspond to the case conditioned on ground-truth images), the RDM is NOT needed because the SSL representation is provided by the ground-truth image. However, we note that this case has a strong limitation in practice, as we typically don't have a ground-truth image when we want to generate one. Therefore, we need the RDM to generate the SSL representations in common generation scenarios where no ground-truth image is available.

JunchengWang · Answer 2 · Wed Jan 31 2024 21:36:01 GMT+0800 (China Standard Time)

Got it! I agree that generating images conditioned on GT seems to be of little practical use. I was just wondering where the RDM goes when reading your code. Plus, thanks for such a quick reply, what a dedicated author you are. hhhh

Tianhong Li · Answer 3 · Wed Jan 31 2024 21:41:29 GMT+0800 (China Standard Time)

We integrate the RDM sampling process into the pixel generator. For example, you can check it here in the MAGE generator: https://github.com/LTH14/rcg/blob/main/pixel_generator/mage/models_mage.py#L485-L508. I just happened to see this issue pop up; hope my response answers your questions.
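
For readers skimming the thread, the two inference paths discussed above can be sketched roughly as follows. This is a toy illustration only, not the code in the RCG repository: `ssl_encode`, `rdm_sample`, `pixel_generate`, `generate`, and `REP_DIM` are all hypothetical stand-ins, and the "models" are plain Python stubs. The only point is the branch: with a ground-truth image the representation comes from the SSL encoder, and without one it is sampled by the RDM inside the generation call.

```python
import random

REP_DIM = 8  # toy size (assumption); the real representation dimension is not stated in this thread


def ssl_encode(image):
    """Hypothetical stand-in for a pretrained SSL encoder: image -> representation."""
    # Toy "encoding": average the pixels in REP_DIM interleaved slices.
    flat = [p for row in image for p in row]
    chunks = [flat[i::REP_DIM] for i in range(REP_DIM)]
    return [sum(c) / len(c) if c else 0.0 for c in chunks]


def rdm_sample(rng):
    """Hypothetical stand-in for RDM sampling: noise -> representation."""
    # A real RDM would run an iterative denoising loop; here we only draw Gaussian noise.
    return [rng.gauss(0.0, 1.0) for _ in range(REP_DIM)]


def pixel_generate(rep):
    """Hypothetical stand-in for the representation-conditioned pixel generator (e.g. MAGE)."""
    return {"conditioned_on": rep}


def generate(image=None, seed=0):
    """Pick the representation source, then run the pixel generator on it."""
    rng = random.Random(seed)
    if image is not None:
        # Ground-truth image available: the RDM is not needed.
        rep, source = ssl_encode(image), "ssl_encoder"
    else:
        # No ground-truth image: the RDM supplies the representation.
        rep, source = rdm_sample(rng), "rdm"
    return pixel_generate(rep), source
```

Calling `generate(image=some_image)` takes the encoder path, while `generate()` with no image takes the RDM path; in both cases the pixel generator only ever sees a representation vector.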