astra-vision / CoMoGAN

CoMoGAN: continuous model-guided image-to-image translation. CVPR 2021 oral.

equation (7) in the paper

NguyenTriTrinh opened this issue · comments

Hi, I think your work is really interesting! I have a question about equation (7) in the paper: h^Y and h^Y_M are each the sum of three kinds of features, but in the code they are the sum of four kinds of features. Did I misunderstand something?

https://github.com/cv-rits/CoMoGAN/blob/dd3824715152f6464a95c99dd6f936744992b122/networks/backbones/comomunit.py#L145

No, you're right. Actually, thanks a lot for the issue, I think this could be specified better so I'll include a note in the readme.

We experimented with several architectures for the DRB and noticed that adding a residual connection after the FIN layers improved training stability. Hence, we can formalize the h^\phi feature as the composition of the outputs of two residual branches, one of which contains FIN layers. This allows both for continuous encoding of features and for better training of the network. We omitted this detail for the sake of simplicity. To be clearer, we could formalize the output features as

# h^Y_M = h^E_M + h^\phi + h^x
physical_output_features = physical_features + (continuous_features + common_features) + input_features
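To make the grouping explicit, here is a minimal sketch of that four-term sum, using plain numbers in place of feature tensors. The function name and arguments are hypothetical, chosen for illustration; they are not the repository's actual API.

```python
def drb_output(physical_features, continuous_features, common_features, input_features):
    # Hypothetical sketch of the sum above, not the repo's actual function.
    # h^phi is the composition of the two residual branches: the FIN-modulated
    # continuous features plus the shared (common) features.
    h_phi = continuous_features + common_features
    # Four-term sum matching the code: h^Y_M = h^E_M + h^phi + h^x
    return physical_features + h_phi + input_features
```

So equation (7)'s three-term sum still holds once the two residual branches are folded into the single h^\phi term.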

If it's clear I'm closing the issue, otherwise I'm keeping it open to discuss.

I get it, thanks!