Using Multifocal Data (distinct cameras and resolutions) for Cultural Heritage
dberga opened this issue
Issues posted in:
Nerfstudio
nerfstudio-project#3059
nerfstudio-project#3057
SDFstudio
autonomousvision/sdfstudio#307
Nerfstudio fails on images of different sizes, even when they were correctly processed by COLMAP or hloc.
When the COLMAP output is converted, the resulting transforms.json assumes a single camera:
https://github.com/dberga/nerfstudio/blob/main/nerfstudio/process_data/colmap_utils.py#L461
Thus, any transforms.json that applies the first camera's parameters to images of different sizes will crash during training with a size mismatch error:
RuntimeError: Error(s) in loading state_dict for NerfactoModel:
size mismatch for field.embedding_appearance.embedding.weight: copying a param with shape torch.Size([10, 32]) from checkpoint, the shape in current model is torch.Size([17, 32]).
size mismatch for camera_optimizer.pose_adjustment: copying a param with shape torch.Size([10, 6]) from checkpoint, the shape in current model is torch.Size([17, 6]).
A possible solution is to add a padding mechanism: bring all images to one common size by zero-padding the smaller ones. This can be dataset-agnostic.
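The padding idea can be sketched as follows. This is only an illustration, not code from nerfstudio: the helper name `pad_to_common_size` is hypothetical, images are assumed to be NumPy arrays of shape (H, W, C), and padding is added at the bottom and right so that pixel coordinates (and therefore the focal length and principal point) of the original content do not move. The returned masks mark the valid pixels so the padded area can be excluded from the loss.

```python
import numpy as np


def pad_to_common_size(images):
    """Zero-pad (H, W, C) images at the bottom/right to the largest H and W.

    Returns (padded_images, masks); each mask is True where the pixel
    comes from the original image and False in the padded region.
    """
    target_h = max(img.shape[0] for img in images)
    target_w = max(img.shape[1] for img in images)

    padded, masks = [], []
    for img in images:
        h, w = img.shape[:2]
        # Canvas of zeros with the common size and the image's channel layout.
        canvas = np.zeros((target_h, target_w) + img.shape[2:], dtype=img.dtype)
        canvas[:h, :w] = img
        mask = np.zeros((target_h, target_w), dtype=bool)
        mask[:h, :w] = True
        padded.append(canvas)
        masks.append(mask)
    return padded, masks


if __name__ == "__main__":
    # Two dummy images with the sizes used in the examples of this issue.
    small = np.ones((365, 547, 3), dtype=np.uint8)
    large = np.ones((1065, 1420, 3), dtype=np.uint8)
    padded, masks = pad_to_common_size([small, large])
    print([p.shape for p in padded])
```

After padding, the "w" and "h" entries written to transforms.json would need to be the common size, while the focal lengths stay per image.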
Here is an example of transforms.json from LandMark (https://github.com/InternLandMark/LandMark).
Single focal example:
{
  "camera_model": "SIMPLE_PINHOLE",
  "fl_x": 427,
  "fl_y": 427,
  "w": 547,
  "h": 365,
  "frames": [
    {
      "file_path": "./images/image_0.png",
      "transform_matrix": []
    }
  ]
}
Multi focal example:
{
  "camera_model": "SIMPLE_PINHOLE",
  "frames": [
    {
      "fl_x": 1116,
      "fl_y": 1116,
      "w": 1420,
      "h": 1065,
      "file_path": "./images/image_0.png",
      "transform_matrix": []
    }
  ]
}
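The difference between the two layouts is where the intrinsics live: at the top level (shared by all frames) or inside each frame. A reader that accepts both could resolve them per frame, as in the minimal sketch below. This is not nerfstudio's parser; the function `per_frame_intrinsics` and the key list are assumptions made for illustration, and "cx"/"cy" are included only in case a file provides them.

```python
# Intrinsic keys looked up per frame first, then at the top level.
# "cx" and "cy" do not appear in the examples above; they are assumed optional.
INTRINSIC_KEYS = ("fl_x", "fl_y", "cx", "cy", "w", "h")


def per_frame_intrinsics(meta):
    """Return one intrinsics dict per frame of a transforms.json dict.

    Per-frame values take precedence; missing keys fall back to the
    shared top-level values, and keys absent from both are skipped.
    """
    resolved = []
    for frame in meta["frames"]:
        intrinsics = {}
        for key in INTRINSIC_KEYS:
            if key in frame:
                intrinsics[key] = frame[key]
            elif key in meta:
                intrinsics[key] = meta[key]
        resolved.append(intrinsics)
    return resolved


if __name__ == "__main__":
    single_focal = {
        "camera_model": "SIMPLE_PINHOLE",
        "fl_x": 427, "fl_y": 427, "w": 547, "h": 365,
        "frames": [{"file_path": "./images/image_0.png", "transform_matrix": []}],
    }
    multi_focal = {
        "camera_model": "SIMPLE_PINHOLE",
        "frames": [{
            "fl_x": 1116, "fl_y": 1116, "w": 1420, "h": 1065,
            "file_path": "./images/image_0.png", "transform_matrix": [],
        }],
    }
    print(per_frame_intrinsics(single_focal))
    print(per_frame_intrinsics(multi_focal))
```

With this kind of fallback, a single focal file and a multi focal file produce the same per-frame structure, so downstream code only has to handle one case.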
Merged in nerfstudio-project@db93476
Another solution (padding + mask path): nerfstudio-project#1465
A related issue regarding the COLMAP-to-transforms conversion: nerfstudio-project#2784