apple / ml-neuman

Official repository of NeuMan: Neural Human Radiance Field from a Single Video (ECCV 2022)

How to render from pre-defined test views?

xiyichen opened this issue

I have trained a model from a moving camera, but now I want to render test views from a second, static camera that was recording simultaneously. Since the frames from this second camera were not used in COLMAP, what are the necessary steps to preprocess them for testing?

Also, in the given preprocessing steps (gen_run.py), masks are extracted twice (once before the sparse scene reconstruction and once after). I was wondering why it is necessary to pass the masks into COLMAP's feature_extractor, since this is not required by some other dynamic NeRF methods.
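For context on the mask hand-off mentioned above: COLMAP's feature_extractor accepts a mask directory through its --ImageReader.mask_path option, where each mask is named <image_name>.png and black (zero) pixels are excluded from feature extraction. A minimal sketch, with hypothetical paths:

```python
import subprocess

# Hypothetical dataset layout: images in scene/images, per-image masks in
# scene/masks named <image_name>.png. Black (0) pixels are ignored by
# COLMAP, so no features are extracted on the segmented human.
subprocess.run([
    "colmap", "feature_extractor",
    "--database_path", "scene/database.db",
    "--image_path", "scene/images",
    "--ImageReader.mask_path", "scene/masks",
], check=True)
```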

  1. Render from test views: First, register the second camera into the scene coordinate frame; you can use hloc (https://github.com/cvg/Hierarchical-Localization) or another visual localization pipeline, or obtain the reconstruction from COLMAP using the images from all cameras. Second, you also need the human pose for the test view; you can either estimate it from the test view directly, or reuse the pose at the same time stamp from the first camera. Third, change the camera and human pose accordingly, and you will be able to render and compute PSNR/SSIM (see the evaluation sketch after this list).
  2. Masks: It's just a trick to make the SfM pipeline more robust to dynamic objects; masking out the human keeps COLMAP from extracting and matching features on the moving subject, which would otherwise violate the static-scene assumption.
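For the evaluation step, a minimal sketch using scikit-image; the file names are placeholders, and the channel_axis argument requires scikit-image >= 0.19:

```python
import imageio.v2 as imageio
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder file names: one rendered test view and its ground truth.
pred = imageio.imread("render_000.png").astype("float32") / 255.0
gt = imageio.imread("gt_000.png").astype("float32") / 255.0

psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=1.0)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")
```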

Thanks a lot!

Just a follow-up question: how many of the preprocessing steps (segmentations, depth_maps, keypoints, mono_depth, smpl predictions, point clouds) are required for pre-defined test views? I am trying to use ground-truth camera parameters for the test cameras, but since they are not registered in the SfM reconstruction, they have no point clouds or depth maps generated from COLMAP. Would that be a problem for rendering from test views? What should I use for the near/far bounds and the SMPL-to-scene alignment for these test views?

If it's just for rendering purposes, the following parameters are required: 1. camera pose, 2. camera intrinsics, 3. aligned human pose.
As long as you can get the 6DoF camera poses of the test views, it should be fine.
For near/far, you can use the point cloud from the training set to generate the bounds; a sketch is below.
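A minimal sketch of deriving near/far for a test view from the training-set point cloud; the function name and the world-to-camera convention are assumptions for illustration, not NeuMan's exact API:

```python
import numpy as np

def near_far_from_point_cloud(pts_world, w2c, margin=0.1):
    """Estimate near/far bounds for a test view from the training point cloud.

    pts_world: (N, 3) SfM points from the training reconstruction.
    w2c:       (4, 4) world-to-camera extrinsic of the test view.
    margin:    relative padding so the bounds don't clip the scene.
    """
    pts_h = np.concatenate([pts_world, np.ones((len(pts_world), 1))], axis=1)
    depths = (pts_h @ w2c.T)[:, 2]   # z-depth of each point in the camera frame
    depths = depths[depths > 0]      # keep only points in front of the camera
    near = max(depths.min() * (1.0 - margin), 1e-2)
    far = depths.max() * (1.0 + margin)
    return near, far
```

The margin pads the bounds so that slight registration error in the test camera pose does not clip parts of the scene.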