alievk / npbg

Neural Point-Based Graphics

How to train from scratch or fine-tune with multiple scene datasets?

xdiyer opened this issue · comments

I want to train the model on my own dataset from scratch, and I also want to fine-tune the model on 2 different scene datasets, as you did in the paper for scene editing. Could you tell me how to modify train.py, or release the code for training from scratch? Looking forward to your reply. Thank you.

Hi @xdiyer, for training from scratch you can just follow the guidelines in the Fitting descriptors section of the readme. You will only need to modify the configs for your scene (train_example.yaml, paths_example.yaml, scene.yaml). Similarly, to train the network on 2 scenes, you can specify both of them in the paths config under the top-level datasets section. An example:

datasets:
    "my_scene_1":
        scene_path: scene_1.yaml
        target_path: images_undistorted_1
        target_name_func: "lambda i: f'{i}.png'"
    "my_scene_2":
        scene_path: scene_2.yaml
        target_path: images_undistorted_2
        target_name_func: "lambda i: f'{i}.png'"
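For reference, target_name_func is stored as a string and presumably evaluated into a Python callable that maps a frame index to the target image filename. A minimal sketch of that interpretation (illustrative only, not the repo's actual loader code):

# Hypothetical illustration of how the target_name_func string from the
# config could be turned into a callable (assumption, not npbg's loader code).
name_func_str = "lambda i: f'{i}.png'"
target_name_func = eval(name_func_str)   # callable mapping frame index -> filename

print(target_name_func(0))    # '0.png'
print(target_name_func(15))   # '15.png'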

You can also look at other issues where training on new scenes is described (e.g., there is a nice example here).

Thank you @seva100. When fine-tuning the network on 2 scenes, is it necessary to make the image resolution of the two scenes the same? The two videos come from different cameras.

We can train a single scene at 1920x1080 resolution on a single 1080i card with batch size 4. But when training two scenes at the same resolution, the batch size can only be set to 1. What is the reason, and how can we enlarge the batch size?

I think the image resolution for the two scenes can be different.

What is the error you're getting when setting the larger batch size for two scenes?

A CUDA out of memory error.
With a single scene, we can train with dataloader_workers=4 and a max batch size of 1, or dataloader_workers=1 and a max batch size of 4.
Did the multi-scene models you trained, such as the 41 people in the paper, have any memory problems?

The CUDA out of memory error makes sense here, since with more scenes you have a larger memory footprint (you need to store the descriptors of several scenes). So decreasing the batch size and using dataloader_workers=1 should help if the amount of memory available on your card is insufficient.

Perhaps you can also try reducing --crop_size, which will reduce the amount of GPU memory used.
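As a rough illustration of why the footprint grows with the number of scenes, here is a back-of-the-envelope sketch of the memory taken by the per-point descriptors alone (the 8-channel descriptor size follows the paper; the point counts below are made-up placeholders):

# Back-of-the-envelope estimate of GPU memory used by neural descriptors.
# Point counts are hypothetical; optimizer state and activations
# (which batch size and --crop_size control) come on top of this.
def descriptor_memory_mb(num_points, descriptor_dim=8, bytes_per_value=4):
    return num_points * descriptor_dim * bytes_per_value / 1024 ** 2

scene_points = {"my_scene_1": 5_000_000, "my_scene_2": 7_000_000}
total_mb = sum(descriptor_memory_mb(n) for n in scene_points.values())
print(f"descriptors alone: ~{total_mb:.0f} MB, before gradients and Adam state")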

@alievk can you please also comment in case I'm missing something?

--crop_size works, thanks.
High-resolution images (1080p or higher) must be used when building the point cloud (low-resolution images lead to a very low-quality point cloud). Considering GPU memory and training speed, I would like to reduce the image resolution when training the model, but I find that if the image resolution used before and after is inconsistent, the model fails. Is there any suggestion about this?

I'm not sure I understood in which way you'd like to reduce the resolution. Do you want to reduce the resolution of the ground-truth images during network training? In that case, the result will also be more blurry, though I think --crop_size would help you here as well. Did I get it right?

Yes, reducing the resolution of the original ground-truth images during training not only blurs the result but also makes the model completely unusable. Training with a smaller crop size works now, but it takes too long. I will study this issue in detail later. Thank you again for your suggestions.

You can try increasing the batch size while training with a lower --crop_size, or use several GPUs if you have any. The code should support distributed training.
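For the multi-GPU route, a minimal generic PyTorch DistributedDataParallel skeleton is sketched below; it only shows the standard pattern and is not the actual wiring inside npbg's train.py (the model here is a stand-in):

# Generic PyTorch DDP skeleton (illustrative; npbg's train.py may set this up
# differently). Launch with: torchrun --nproc_per_node=<num_gpus> ddp_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Conv2d(8, 3, kernel_size=1).cuda()  # stand-in for the rendering net
    model = DDP(model, device_ids=[local_rank])
    # ... build a DataLoader with a DistributedSampler and run the usual training loop ...
    dist.destroy_process_group()

if __name__ == "__main__":
    main()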

@seva100, I have trained a model with two scenes, both of which can be rendered using the network. To reproduce the scene editing, I manually aligned the two point clouds in MeshLab and saved them as a new point cloud. How do I calculate the coordinate transformation matrix of the point cloud so that the point descriptors stay aligned, and how do I adjust the two sets of camera parameters to render the two scenes at the same time (as demonstrated in the scene editing)? I don't know much about point clouds and 3D, so I hope you can give some detailed guidance or sample code. Thanks again.

You can use the descriptors trained for the second point cloud together with its new vertex positions that you've manually aligned. The camera positions can then be aligned with the first point cloud as well. In this case, you don't need to retrain anything; you can just use the model trained on the two scenes.
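To make this concrete, here is a small sketch (assumed helper code, not part of npbg) of how a 4x4 rigid transform T, e.g. the alignment matrix MeshLab reports after manual alignment, can be applied to the second scene's points, and how the corresponding view matrices can be composed with the inverse of T so the cameras still see the transformed points consistently:

# Sketch of re-expressing scene 2 in the coordinate frame of scene 1.
# T is the 4x4 homogeneous transform used for the manual alignment
# (MeshLab can show/export this matrix after alignment); array names and
# shapes here are assumptions for illustration.
import numpy as np

def transform_points(points, T):
    # points: (N, 3) vertex positions of scene 2; returns them in scene-1 coordinates.
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ T.T)[:, :3]

def transform_view_matrix(view_matrix, T):
    # view_matrix: (4, 4) world-to-camera matrix of a scene-2 camera.
    # If points move as x' = T x, composing with the inverse of T keeps the
    # camera-space coordinates unchanged: V' x' = V T^-1 T x = V x.
    return view_matrix @ np.linalg.inv(T)

# The per-point descriptors are tied to point indices rather than coordinates,
# so they can be reused unchanged for the transformed point cloud.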