dberga / nerfstudio

A collaboration friendly studio for NeRFs

Home Page: https://docs.nerf.studio


Using Multifocal Data (distinct cameras and resolutions) for Cultural Heritage

dberga opened this issue

Nerfstudio fails when the input images have different sizes, even if they were correctly processed by COLMAP or hloc.

When the COLMAP output is converted to transforms.json, only one camera is considered:
https://github.com/dberga/nerfstudio/blob/main/nerfstudio/process_data/colmap_utils.py#L461
As a result, any transforms.json that applies the first camera's intrinsics to images of different sizes will crash during training (matrix size error).

RuntimeError: Error(s) in loading state_dict for NerfactoModel:
        size mismatch for field.embedding_appearance.embedding.weight: copying a param with shape torch.Size([10, 32]) from checkpoint, the shape in current model is torch.Size([17, 32]).
        size mismatch for camera_optimizer.pose_adjustment: copying a param with shape torch.Size([10, 6]) from checkpoint, the shape in current model is torch.Size([17, 6]).

A possible solution is to add a padding mechanism: choose a single size for all images and pad the smaller ones with zeros. This can be dataset-agnostic.
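To make the padding idea concrete, here is a minimal sketch using NumPy. It is not nerfstudio code: `pad_to_common_size` is a hypothetical helper, and it assumes the images are already loaded as `H x W x C` arrays with the same number of channels. Padding is added only at the bottom and right, so existing pixel coordinates do not move. The function also returns a boolean mask per image marking the real (non-padded) pixels, on the assumption that a training pipeline could use it to ignore the zero-filled area.

```python
import numpy as np


def pad_to_common_size(images):
    """Zero-pad every image (bottom/right) to the largest height and width.

    Hypothetical helper for illustration only.
    Returns (padded_images, masks); masks are True on original pixels.
    """
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)

    padded, masks = [], []
    for img in images:
        h, w = img.shape[:2]
        # Canvas of zeros with the common size, keeping channels and dtype.
        canvas = np.zeros((max_h, max_w) + img.shape[2:], dtype=img.dtype)
        canvas[:h, :w] = img
        # Mask is True where the original image lives, False on padding.
        mask = np.zeros((max_h, max_w), dtype=bool)
        mask[:h, :w] = True
        padded.append(canvas)
        masks.append(mask)
    return padded, masks


# Example with two small dummy images of different sizes.
small = np.ones((3, 4, 3), dtype=np.uint8)
large = np.ones((5, 6, 3), dtype=np.uint8)
padded, masks = pad_to_common_size([small, large])
print([p.shape for p in padded])
```

With this layout each frame keeps its own focal length; only the stored width and height would need to change to the common size.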

Here are example transforms.json layouts from LandMark (https://github.com/InternLandMark/LandMark):

single focal example

{
    "camera_model": "SIMPLE_PINHOLE",
    "fl_x": 427,
    "fl_y": 427,
    "w": 547,
    "h": 365,
    "frames": [
        {
            "file_path": "./images/image_0.png",
            "transform_matrix": []
        }
    ]
}

multi focal example

{
    "camera_model": "SIMPLE_PINHOLE",
    "frames": [
        {
            "fl_x": 1116,
            "fl_y": 1116,
            "w": 1420,
            "h": 1065,
            "file_path": "./images/image_0.png",
            "transform_matrix": []
        }
    ]
}
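The two layouts above differ only in where the intrinsics live: at the top level (single focal) or inside each frame (multi focal). A loader could handle both by preferring per-frame values and falling back to the top-level ones. The sketch below shows that idea on plain dictionaries; `frame_intrinsics` and `has_mixed_sizes` are hypothetical names, not functions from nerfstudio or LandMark, and only the four keys shown in the examples are considered.

```python
import json

# Keys that may appear either at the top level or per frame (from the examples).
INTRINSIC_KEYS = ("fl_x", "fl_y", "w", "h")


def frame_intrinsics(meta):
    """Resolve intrinsics for every frame of a transforms.json-style dict.

    Per-frame values take precedence; top-level values are the fallback.
    Hypothetical helper for illustration only.
    """
    resolved = []
    for frame in meta["frames"]:
        entry = {"file_path": frame["file_path"]}
        for key in INTRINSIC_KEYS:
            if key in frame:
                entry[key] = frame[key]
            elif key in meta:
                entry[key] = meta[key]
            else:
                raise KeyError(f"{key!r} missing for {frame['file_path']}")
        resolved.append(entry)
    return resolved


def has_mixed_sizes(meta):
    """Return True if the frames do not all share one (w, h)."""
    sizes = {(e["w"], e["h"]) for e in frame_intrinsics(meta)}
    return len(sizes) > 1


# Usage with the multi focal layout, parsed from a JSON string.
multi = json.loads("""
{
    "camera_model": "SIMPLE_PINHOLE",
    "frames": [
        {"fl_x": 1116, "fl_y": 1116, "w": 1420, "h": 1065,
         "file_path": "./images/image_0.png", "transform_matrix": []}
    ]
}
""")
print(frame_intrinsics(multi))
```

A check like `has_mixed_sizes` could run before training to report mixed resolutions early instead of failing later with a shape error.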

Another solution (padding + mask path): nerfstudio-project#1465

A related issue on converting COLMAP output to transforms.json:
nerfstudio-project#2784