SizheAn / PanoHead

Code Repository for CVPR 2023 Paper "PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°"


Am I generating colors correctly?

kylemcdonald opened this issue · comments

I would like some advice on generating colors for an extracted voxel representation.

I was able to extract very poor vertex colors by editing these lines in the G.sample_mixed loop in gen_videos_proj_withseg.py:

```python
# query sigma and the 32-dim feature vector for this batch of grid points
sample_result = G.sample_mixed(...)
sigmas[:, head:head+max_batch] = sample_result['sigma']
# reshape the (1, N, 32) features into a (1, 32, N, 1) "image" and decode to RGB
# with torgb, using only the first of the 14 w vectors
color_batch = G.torgb(sample_result['rgb'].transpose(1, 2)[..., None], ws[0, 0, 0, :1])
colors[:, head:head+max_batch] = np.transpose(color_batch[..., 0], (2, 1, 0))
```

If I look for the nearest color on the isosurface mesh, it gives me this:

[Screenshot: isosurface mesh with nearest-neighbor vertex colors]
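The lookup itself is roughly the following (a minimal sketch; samples, colors, and mesh are my own names for the xyz grid coordinates fed to G.sample_mixed, the decoded RGB values from above, and the marching-cubes isosurface, respectively):

```python
import numpy as np
from scipy.spatial import cKDTree

# samples: (N, 3) xyz grid coordinates that were fed to G.sample_mixed
# colors:  (N, 3) decoded RGB values for those same points
# mesh:    marching-cubes isosurface, with vertices in the same frame as samples
tree = cKDTree(samples)
_, nearest = tree.query(mesh.vertices)  # index of the closest grid point per vertex
vertex_colors = colors[nearest]         # (V, 3) nearest-neighbor color per vertex
```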

But when I look at the render I see this:

[Screenshot: rendered video frame of the same head]

I realize the render goes through a final superresolution pass, which is why it looks so sharp, but I feel like I might be missing something else.

My understanding of the process is something like:

  1. G.sample_mixed takes samples (xyz coordinates on a 3D grid), transformed_ray_directions_expanded (which is just (0, 0, -1)), and ws (the (14, 512) latent vectors output by the mapping network, which combines the latent code and the camera pose), and outputs sigma, rgb, and a copy of the xyz coordinates.
  2. The rgb output is not actually RGB but a 32-dimensional feature vector per sample, so it has to be decoded to RGB with the G.torgb network. This is the part I find tricky: the network seems designed to process 2D images, but here we only have a bundle of N≈10M feature vectors, so I pass them in as a 10M×1 image and hope that is OK. Also, torgb expects only a single w out of the 14; I just picked the first one (ws[0,0,0,:1]), but I'm not sure that is correct. Would it be better to run torgb for each w and then average the results, take the median, or something else? (See the sketch after this list.)
  3. Finally, I convert the resulting colors back to voxel space and use the mesh vertex locations to look up the closest color.
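Concretely, here is what I mean by decoding with each w and averaging — a minimal sketch, not something I know to be correct (and if torgb is a StyleGAN2-style ToRGB layer, i.e. a 1×1 modulated convolution, each "pixel" is processed independently, so the 10M×1 shape itself should be harmless):

```python
import torch

# feats: sample_result['rgb'] reshaped into the (1, 32, N, 1) "image" from above.
# ws[0, 0, 0] is assumed to hold the (14, 512) stack of w vectors, matching the
# ws[0, 0, 0, :1] indexing in my snippet.
feats = sample_result['rgb'].transpose(1, 2)[..., None]
with torch.no_grad():
    decoded = [G.torgb(feats, ws[0, 0, 0, i:i + 1]) for i in range(ws.shape[3])]
rgb = torch.stack(decoded).mean(dim=0)  # (1, 3, N, 1), averaged over all 14 ws
```

In practice I would still push the 10M samples through this in chunks to keep memory manageable; averaging is just one option, and whether it is the right one is exactly my question 2 below.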

My questions are:

  1. Is it OK to give torgb a 10M×1 image, or does this hurt the feature-to-color conversion?
  2. Is it OK to use only the first of the ws, or should I be combining multiple ones somehow? Does each of the 14 w latents represent a different camera pose, or something else?

Thanks @SizheAn!

Hi, how did you generate the colors for this 3D model? Why does my 3D model have no color?

Mix colors according to the 'normal' vector. Two images are enough to create it: a full back view and a front view.

Hi, how did you get the RGB texture?

> Mix colors according to the 'normal' vector. Two images are enough to create it: a full back view and a front view.

@MustafaHilmiYAVUZHAN thanks for your input. Do you have any reference code for this? When you say normal vector, do you mean the w vector? Should I run multiple w vectors through torgb and then take an average or something?
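In case it helps the discussion, my best guess at what normal-based mixing would look like is this sketch (sample_front and sample_back are hypothetical helpers that project a vertex into a rendered front or back image and return its RGB color):

```python
import numpy as np

# blend a front-view and a back-view color per vertex, weighted by how
# front-facing the vertex normal is (+z assumed to point toward the front camera)
normals = mesh.vertex_normals        # (V, 3) unit vertex normals
w = (normals[:, 2:3] + 1.0) / 2.0    # 1 = fully front-facing, 0 = fully back-facing
colors = w * sample_front(mesh.vertices) + (1.0 - w) * sample_back(mesh.vertices)
```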

Hi! Very interesting attempt! Could you please share the code for "looking up the nearest color on the isosurface mesh"? Thanks a lot!

Following this code, I got the mesh, but the colors don't look right.
Note that the colors are in the range [-3.1344, 3.2253], so I clamp them to [-1, 1] and then map them to [0, 1].
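For concreteness, that normalization is just:

```python
import numpy as np

# clamp the raw torgb outputs to [-1, 1] (the usual synthesis output range),
# then map them to [0, 1] for use as vertex colors
colors = np.clip(colors, -1.0, 1.0)
colors = (colors + 1.0) / 2.0
```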
[Screenshot: resulting mesh with incorrect colors]