Bad result in Neural3DV dataset

adkAurora opened this issue 10 months ago · comments

I am using the sear_steak scene from the Neural3DV dataset. I followed the dataset conversion script neural3dv_to_easyvolcap.py to generate the yml files and trained l3mhet with the config configs/exps/l3mhet/l3mhet_sear_steak.yaml. The result is very bad: the validation PSNR is only 6.36.

2023-12-19 10:44:27.029948 easyvolcap.runners.evaluators.volumetric_video_evaluator -> summarize:    volumetric_video_evaluator.py:79
{
    'psnr_mean': 6.362307601504856,
    'psnr_std': 0.16340760990474038,
    'ssim_mean': 0.018442549,
    'ssim_std': 0.0059690047,
    'lpips_mean': 0.676931189166175,
    'lpips_std': 0.010196555702566121
}
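
For a sense of how low this PSNR is, the number can be converted back into a per-pixel error. The sketch below assumes the standard PSNR definition on images normalized to [0, 1]; whether the evaluator uses exactly this convention is an assumption, and the helper names are illustrative only.

```python
import math


def psnr_from_mse(mse: float, max_val: float = 1.0) -> float:
    """PSNR in dB for a mean squared error, with pixel values in [0, max_val]."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_from_psnr(psnr: float, max_val: float = 1.0) -> float:
    """Invert the PSNR definition to recover the mean squared error."""
    return max_val ** 2 / (10.0 ** (psnr / 10.0))


if __name__ == "__main__":
    reported_psnr = 6.362307601504856  # psnr_mean from the log above
    mse = mse_from_psnr(reported_psnr)
    rmse = math.sqrt(mse)
    # Under the [0, 1] assumption, the RMSE is the typical per-pixel error
    # expressed as a fraction of the full intensity range.
    print(f"MSE  ~ {mse:.3f}")
    print(f"RMSE ~ {rmse:.3f}")
```

Under that assumption the MSE comes out at roughly 0.23, which means the typical per-pixel error is close to half of the full intensity range. That fits an essentially empty render rather than a blurry or slightly misaligned one.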

Zhen Xu · Answer 1 · Tue Dec 19 2023 17:24:22 GMT+0800 (China Standard Time)

Hi, could you please paste a sample output image here? There are several other things to check if the results do not look good.

Check whether the camera parameters look reasonable: we provide a script for visualizing the cameras, scripts/tools/visualize_cameras, which should output a ply file that gives a rough visualization of the camera parameters.
Check with an inference-based method like ENeRFi. You can directly run rendering with ENeRFi in the GUI; try tuning near, far and bounds to see if the result gets better.
Train a static model to see if it converges, by appending configs/specs/static.yaml to your experiment configuration (located in exps). Or just add dataloader_cfg.dataset_cfg.frame_sample=0,1,1 val_dataloader_cfg.dataset_cfg.frame_sample=0,1,1 to any command line that uses the dataset.
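
To make the frame_sample=0,1,1 override in the last check concrete, here is a minimal sketch of how such a triple could select frames. It assumes the three values mean begin, end, step with Python slice semantics, which is an assumption about the option and not taken from the EasyVolcap source; select_frames and the frame count of 300 are hypothetical.

```python
from typing import List, Tuple


def select_frames(n_frames: int, frame_sample: Tuple[int, int, int]) -> List[int]:
    """Return the frame indices picked by a (begin, end, step) triple.

    Assumes slice-like semantics: begin inclusive, end exclusive.
    """
    begin, end, step = frame_sample
    return list(range(n_frames))[begin:end:step]


if __name__ == "__main__":
    n_frames = 300  # hypothetical sequence length, for illustration only
    print(select_frames(n_frames, (0, 1, 1)))          # a single frame: static scene
    print(len(select_frames(n_frames, (0, 300, 1))))   # every frame
    print(len(select_frames(n_frames, (0, 300, 10))))  # every tenth frame
```

Under these assumed semantics, 0,1,1 keeps only the first frame, so the dynamic dataset degenerates into a static multi-view capture. If a static model converges, the cameras and bounds are probably fine and the problem lies elsewhere.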

adkAurora · Answer 2 · Tue Dec 19 2023 17:40:43 GMT+0800 (China Standard Time)

Hi ~ thanks for your reply

I generated the cameras.ply, and everything looks okay there.
The output is totally empty. Here is the error.png of frame one.

Zhen Xu · Answer 3 · Wed Dec 20 2023 01:06:16 GMT+0800 (China Standard Time)

Sorry for the late reply!
I double-checked the pre-processing script and found that we internally used a different conversion path (neural3dv -> nerfstudio -> easyvolcap), so the neural3dv_to_easyvolcap script was not thoroughly tested.
This issue should be fixed in my latest commit, and you should be able to train an l3mhet model on the dataset correctly after converting with neural3dv_to_easyvolcap.

I recommend checking the implementation by training on a single frame first:

# Train on the first frame
evc -c configs/exps/l3mhet/l3mhet_sear_steak.yaml,configs/specs/static.yaml exp_name=l3mhet_sear_steak_static runner_cfg.save_latest_ep=1 runner_cfg.eval_ep=1 runner_cfg.resume=False

# Render spiral path
evc -t test -c configs/exps/l3mhet/l3mhet_sear_steak.yaml,configs/specs/static.yaml,configs/specs/spiral.yaml exp_name=l3mhet_sear_steak_static val_dataloader_cfg.dataset_cfg.render_size=540,960

# Fuse depth maps for visualization
python scripts/tools/volume_fusion.py -- -c configs/exps/l3mhet/l3mhet_sear_steak.yaml,configs/specs/static.yaml exp_name=l3mhet_sear_steak_static val_dataloader_cfg.dataset_cfg.ratio=0.05

Another recommended way to check the camera parameters is to render an enerfi model on the dataset:

# Construct the experiments manually and render on GUI
evc -t gui -c configs/base.yaml,configs/models/enerfi.yaml,configs/datasets/neural3dv/sear_steak.yaml,configs/specs/vf0.yaml exp_name=enerfi_dtu model_cfg.sampler_cfg.n_planes=32,8 model_cfg.sampler_cfg.n_samples=4,1 viewer_cfg.window_size=540,960

Could you please check whether the issue has also been fixed on your end?

adkAurora · Answer 4 · Wed Dec 20 2023 13:31:27 GMT+0800 (China Standard Time)

Thank you for your attention and effort～
I have tried the new code, but there are some new problems.

Training on a single frame (l3mhet_sear_steak_static) gives a reasonable result with a mean PSNR of about 35, but I have a new question about the validation render result below: what are the strange vertical lines inside the green box?
Training on all frames consistently fails during the initial dataset loading stage. The process of loading the entire dataset twice is not only extremely time-consuming but is also prone to termination. Do you have any suggestions on how to solve this issue? I have attempted the process three times, and it was terminated each time. I remember that in the previous version of the code, the training images were loaded just once at the start, while the validation images were loaded later on.


EasyVolcap# evc -c configs/exps/l3mhet/l3mhet_sear_steak.yaml
2023-12-20 12:55:11.475942 easyvolcap.scripts.main -> preflight: Starting experiment: l3mhet_sear_steak, command: train    main.py:80
2023-12-20 13:09:42.… easyvolca… -> load_resi… Loading imgs bytes for neural3dv/sear_steak/images TRAIN 100% ━━━━━━━━━━ 6,300/6,3… 0:14:30 < 0:00:00 8.316 it/s p…
2023-12-20 13:16:36.386… easyvolcap.da… -> load_bytes: Caching imgs for neural3dv/sear_steak TRAIN  22% ━━╸━━━━━━━━━━ 1,414/6,300 0:06:53 < 7:30:38 0.181 it/s v…
Killed

Zhen Xu · Answer 5 · Wed Dec 20 2023 13:54:24 GMT+0800 (China Standard Time)

Hi @adkAurora, thanks for the follow-up!

The vertical line looks like the edge of the bounding box we manually defined. The NeRF-based models in EasyVolcap only sample points inside the bounding box, and the sampling behavior outside of the bbox is undefined. This most likely means that the bounding box we set is too small. You can try tuning the bounds inside configs/datasets/neural3dv/neural3dv.yaml.
Most of the time, if a process is killed without warning, it is consuming too much memory (RAM). Try setting the swap size a little larger. EasyVolcap caches the input images as jpeg bytes in main memory. For 20000 1K images, this should require around 20GB in my experience.
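
A rough way to check the memory point before launching a long run is to add up the JPEG sizes on disk and compare the total against the available RAM plus swap. The sketch below is only an approximation: it assumes the in-memory cache is about the same size as the files on disk, which may not hold if images are resized or re-encoded while caching. Both function names are hypothetical, and the rule-of-thumb constant simply restates the "20000 1K images, around 20GB" estimate above.

```python
from pathlib import Path
from typing import Tuple


def estimate_jpeg_cache_gb(image_dir: str, pattern: str = "**/*.jpg") -> Tuple[int, float]:
    """Count the JPEGs under image_dir and sum their on-disk sizes in GB.

    On-disk size is used as a proxy for the in-memory byte cache (assumption).
    """
    files = [p for p in Path(image_dir).glob(pattern) if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes / 1e9


def rule_of_thumb_gb(n_images: int, gb_per_image: float = 20.0 / 20000) -> float:
    """Scale the '20000 1K images need about 20GB' estimate to n_images.

    Only meaningful for images of roughly 1K resolution; larger images need more.
    """
    return n_images * gb_per_image


if __name__ == "__main__":
    # 6,300 is the image count shown in the training log above.
    print(f"rule of thumb for 6300 1K images: ~{rule_of_thumb_gb(6300):.1f} GB")
```

Calling estimate_jpeg_cache_gb on the dataset's images directory (wherever the data root lives on your machine) gives a measured figure to compare against the rule of thumb. If the estimate is close to or above the free memory reported by the system, a larger swap or fewer cached frames is the first thing to try.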

adkAurora · Answer 6 · Thu Dec 21 2023 11:36:15 GMT+0800 (China Standard Time)

Problem solved, thanks~