orpatashnik / StyleCLIP

Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)


Mapper in Latent Mapper, does PixelNorm normalize the right dim?

Big-Brother-Pikachu opened this issue

Thanks for sharing this wonderful work! I am a little confused about the normalization layer:

layers = [PixelNorm()]

In the StyleGAN2 mapping network, the input is always (Batch, latent_dim), so the normalization below is correct there:
return input * torch.rsqrt(torch.mean(input ** 2, dim=1, keepdim=True) + 1e-8)

But in the latent mapper, I think the input is always (Batch, n_latent, latent_dim) (maybe I am wrong here), so the normalization layer does not seem to normalize over the right dimension.
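For illustration, here is a minimal check of that claim (a sketch; pixel_norm below just restates the PixelNorm quoted above, and the shapes 8/18/512 are only examples):

import torch

# PixelNorm as quoted above (StyleGAN2-style), normalizing over dim=1.
def pixel_norm(x):
    return x * torch.rsqrt(torch.mean(x ** 2, dim=1, keepdim=True) + 1e-8)

x2d = torch.randn(8, 512)      # (Batch, latent_dim): dim=1 is latent_dim
x3d = torch.randn(8, 18, 512)  # (Batch, n_latent, latent_dim): dim=1 is n_latent

# 2D case: each 512-dim latent ends up with unit mean square, as intended.
print(pixel_norm(x2d).pow(2).mean(dim=1))   # all ~1.0
# 3D case: the statistic is taken across the 18 latents per channel, so the
# individual 512-dim latent vectors are not normalized.
print(pixel_norm(x3d).pow(2).mean(dim=1))   # all ~1.0, but taken over n_latent
print(pixel_norm(x3d).pow(2).mean(dim=-1))  # scattered around 1.0, not normalized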
I think for the stylespace mapper, this reshape makes it correct:
x_c_res = curr_mapper(x_c.view(x_c.shape[0], -1)).view(x_c.shape)

But for the other mappers, unintended behavior may occur, since PixelNorm normalizes the wrong dimension:
x_coarse = x[:, :4, :]
x_medium = x[:, 4:8, :]
x_fine = x[:, 8:, :]
if not self.opts.no_coarse_mapper:
x_coarse = self.course_mapping(x_coarse)
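If this is indeed unintended, one possible workaround (just a sketch, not code from the repository) would be to fold n_latent into the batch dimension before each sub-mapper, so that dim=1 of what PixelNorm sees is latent_dim again:

# Hypothetical workaround, not the repository's code: fold n_latent into the
# batch dimension so each row seen by PixelNorm is a single latent vector.
b, n, d = x_coarse.shape
x_coarse = self.course_mapping(x_coarse.reshape(b * n, d)).reshape(b, n, d)

An alternative would be changing PixelNorm to average over dim=-1 instead, which is equivalent to dim=1 for the 2D input.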

Am I right, or do I misunderstand something? Looking forward to your reply, thanks.