skhu101 / SHERF

Code for our ICCV'2023 paper "SHERF: Generalizable Human NeRF from a Single Image"

Positional Encoding for RGB

greatbaozi001 opened this issue

Hi. I have read the paper and one question remains for me.
As the paper mentions, a 2D encoder is adopted to extract a feature map f ∈ R^(64x256x256), positional encoding is applied to the RGB values, and the resulting code is appended to the 2D feature map to form f ∈ R^(96x256x256). How can positional encoding map the 3 RGB channels to 96 - 64 = 32 channels?

Hi, thanks for your interest in our work. We use positional encoding with 5 frequencies to map RGB ∈ R^(3x256x256) to R^(33x256x256). We then append the first 32 channels of the encoded RGB, in R^(32x256x256), to the feature map f ∈ R^(64x256x256), which finally forms f ∈ R^(96x256x256).
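The channel arithmetic in the reply above can be checked with a short NumPy sketch. It assumes a NeRF-style encoding that keeps the raw input and adds a sine and a cosine term per frequency, which gives 3 + 3 x 2 x 5 = 33 channels. The function name `positional_encoding`, the frequency schedule 2^k, and the channel ordering are illustrative assumptions rather than the repository's actual code, and the ordering determines which channel the truncation to 32 drops.

```python
import numpy as np

def positional_encoding(x, num_freqs=5):
    """Hypothetical NeRF-style encoding along the channel axis (axis 0).

    Assumed layout: [x, sin(2^0 x), cos(2^0 x), ..., sin(2^(L-1) x), cos(2^(L-1) x)],
    so C input channels become C + 2 * L * C output channels.
    """
    feats = [x]  # keep the raw input channels
    for k in range(num_freqs):
        freq = 2.0 ** k  # assumed frequency schedule
        feats.append(np.sin(freq * x))
        feats.append(np.cos(freq * x))
    return np.concatenate(feats, axis=0)

H = W = 256
rng = np.random.default_rng(0)
rgb = rng.random((3, H, W)).astype(np.float32)                 # stand-in for the input image
feat_2d = rng.standard_normal((64, H, W)).astype(np.float32)   # stand-in for the 2D encoder output

rgb_enc = positional_encoding(rgb, num_freqs=5)          # 3 + 3*2*5 = 33 channels
rgb_enc32 = rgb_enc[:32]                                 # keep only the first 32 channels
fused = np.concatenate([feat_2d, rgb_enc32], axis=0)     # 64 + 32 = 96 channels

print(rgb_enc.shape, rgb_enc32.shape, fused.shape)
```

Under this assumed ordering, the single dropped channel is the cosine term of the highest frequency for the last color channel.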

Thanks, the answer is clear!

Hi, sorry for reopening the issue. Is there a design reason for picking the first 32 dimensions of the encoded RGB? Thanks!

Hi, we mainly want to keep the dimensions of the 1D global, 2D pixel-aligned, and 3D point-level features the same, so that the later feature processing and fusion stages are easier.