Using Multifocal Data (distinct cameras and resolutions) for Cultural Heritage

Question

dberga opened this issue 4 months ago

dberga commented 4 months ago

Issues posted in:

Nerfstudio
nerfstudio-project#3059
nerfstudio-project#3057

SDFstudio
autonomousvision/sdfstudio#307

Hloc
cvg/Hierarchical-Localization#383

dberga · Answer 1 · Wed Apr 17 2024 00:04:32 GMT+0800 (China Standard Time)

Nerfstudio fails on images of different sizes, regardless of whether they were correctly processed by COLMAP or hloc.

When the COLMAP output is converted, the resulting transforms.json describes only one camera:
https://github.com/dberga/nerfstudio/blob/main/nerfstudio/process_data/colmap_utils.py#L461
As a result, any transforms.json that stores only the first camera's intrinsics while the images have different sizes will crash during training with a size-mismatch error:

RuntimeError: Error(s) in loading state_dict for NerfactoModel:
        size mismatch for field.embedding_appearance.embedding.weight: copying a param with shape torch.Size([10, 32]) from checkpoint, the shape in current model is torch.Size([17, 32]).
        size mismatch for camera_optimizer.pose_adjustment: copying a param with shape torch.Size([10, 6]) from checkpoint, the shape in current model is torch.Size([17, 6]).
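
One way to avoid collapsing everything onto the first camera is to write the intrinsics per frame whenever more than one camera is present. The sketch below is only an illustration of that idea and is not the code in colmap_utils.py: `build_transforms`, the `cameras` mapping, and the `camera_id` field are hypothetical names, and the camera model is hard-coded to match the examples in this thread.

```python
def build_transforms(cameras, frames):
    """Build a transforms.json-style dict from per-camera intrinsics.

    cameras: {camera_id: {"fl_x", "fl_y", "w", "h"}}  (hypothetical layout)
    frames:  [{"file_path", "camera_id", "transform_matrix"}, ...]
    """
    out = {"camera_model": "SIMPLE_PINHOLE"}
    shared = len(cameras) == 1
    if shared:
        # A single camera: intrinsics can live at the top level.
        out.update(next(iter(cameras.values())))
    out["frames"] = []
    for frame in frames:
        entry = {
            "file_path": frame["file_path"],
            "transform_matrix": frame["transform_matrix"],
        }
        if not shared:
            # Several cameras: copy each frame's own intrinsics into the frame.
            entry = {**cameras[frame["camera_id"]], **entry}
        out["frames"].append(entry)
    return out
```

With a single camera this yields top-level `w`, `h`, `fl_x`, and `fl_y`; with several cameras those keys appear only inside each frame.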

A possible solution is to add a padding mechanism: choose one common size for all images and zero-pad the smaller ones up to it. This approach is dataset-agnostic.
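
A minimal sketch of such a padding step, assuming the images are already loaded as NumPy arrays of shape (H, W, C). The function name `pad_to_common_size` is hypothetical, and returning a validity mask alongside each image is an added assumption, so that padded pixels can be excluded later.

```python
import numpy as np


def pad_to_common_size(images):
    """Zero-pad images on the bottom and right to the largest height and width.

    Returns the padded images and boolean masks (True marks an original pixel).
    """
    target_h = max(img.shape[0] for img in images)
    target_w = max(img.shape[1] for img in images)
    padded, masks = [], []
    for img in images:
        h, w = img.shape[:2]
        # Zero canvas with the common size, keeping channels and dtype.
        canvas = np.zeros((target_h, target_w) + img.shape[2:], dtype=img.dtype)
        canvas[:h, :w] = img
        mask = np.zeros((target_h, target_w), dtype=bool)
        mask[:h, :w] = True
        padded.append(canvas)
        masks.append(mask)
    return padded, masks
```

Padding only on the bottom and right leaves the pixel coordinates of the original content unchanged, so the focal length and principal point in pixels stay valid; only the stored `w` and `h` need to be updated to the common size.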

dberga · Answer 2 · Wed Apr 17 2024 00:14:27 GMT+0800 (China Standard Time)

Here are example transforms.json layouts from LandMark (https://github.com/InternLandMark/LandMark):

Single-focal example (intrinsics shared by all frames):

{
    "camera_model": "SIMPLE_PINHOLE",
    "fl_x": 427,
    "fl_y": 427,
    "w": 547,
    "h": 365,
    "frames": [
        {
            "file_path": "./images/image_0.png",
            "transform_matrix": []
        }
    ]
}

Multi-focal example (intrinsics stored per frame):

{
    "camera_model": "SIMPLE_PINHOLE",
    "frames": [
        {
            "fl_x": 1116,
            "fl_y": 1116,
            "w": 1420,
            "h": 1065,
            "file_path": "./images/image_0.png",
            "transform_matrix": []
        }
    ]
}
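
To handle both layouts with one code path, a reader can look up each intrinsic on the frame first and fall back to the top level. The following is a sketch under that assumption, not the loader used by Nerfstudio or LandMark; `frame_intrinsics` and `INTRINSIC_KEYS` are hypothetical names.

```python
INTRINSIC_KEYS = ("fl_x", "fl_y", "w", "h")


def frame_intrinsics(transforms):
    """Return one intrinsics dict per frame, for either layout shown above."""
    resolved = []
    for frame in transforms["frames"]:
        intrinsics = {}
        for key in INTRINSIC_KEYS:
            if key in frame:
                # Multi-focal layout: the value is stored on the frame.
                intrinsics[key] = frame[key]
            elif key in transforms:
                # Single-focal layout: the value is shared at the top level.
                intrinsics[key] = transforms[key]
            else:
                raise KeyError(f"{key} missing for {frame.get('file_path')}")
        resolved.append(intrinsics)
    return resolved
```

Collecting `{(i["w"], i["h"]) for i in frame_intrinsics(data)}` then shows at a glance whether a dataset mixes image sizes.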

dberga · Answer 3 · Thu Apr 25 2024 22:27:05 GMT+0800 (China Standard Time)

Merged in nerfstudio-project@db93476

dberga · Answer 4 · Sat Apr 27 2024 04:10:42 GMT+0800 (China Standard Time)

Another solution (padding plus a mask path): nerfstudio-project#1465

dberga · Answer 5 · Thu May 23 2024 21:43:23 GMT+0800 (China Standard Time)

A related issue on converting COLMAP output to transforms.json:
nerfstudio-project#2784