LCAV / LenslessPiCam

Lensless imaging toolkit. Complete tutorial: https://go.epfl.ch/lenslesspicam

Home Page: https://lensless.readthedocs.io

How to calculate the homography transform?

HkDzl opened this issue

Hello, in Chapter 5 of the paper "Learned reconstructions for practical mask-based lensless imaging", it is mentioned that a homography transform is needed to co-align both cameras' coordinate systems when capturing the real dataset. Could you please provide more details on how to calculate this homography transform?

Hi @HkDzl, thanks for your question. There are algorithms/libraries for calculating the homography transform, for example `findHomography` from OpenCV. Here's a nice tutorial on how to use it: https://www.geeksforgeeks.org/image-registration-using-opencv-python/
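
Not their exact pipeline, but for reference, here is a minimal sketch of that kind of feature-based registration with OpenCV. The file names are placeholders, and ORB matching is just one way to obtain point correspondences; with displayed calibration points you could pass the detected point coordinates to `findHomography` directly:

```python
import cv2
import numpy as np

# Load the two views to align (placeholder file names). In this context they could be
# a reconstruction from the lensless camera and the corresponding lensed-camera image.
img_src = cv2.imread("lensless_recon.png", cv2.IMREAD_GRAYSCALE)
img_dst = cv2.imread("lensed.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and descriptors with ORB.
orb = cv2.ORB_create(nfeatures=5000)
kp_src, des_src = orb.detectAndCompute(img_src, None)
kp_dst, des_dst = orb.detectAndCompute(img_dst, None)

# Match descriptors and keep the best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_src, des_dst), key=lambda m: m.distance)
matches = matches[: int(0.9 * len(matches))]

# Build the corresponding point arrays.
pts_src = np.float32([kp_src[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
pts_dst = np.float32([kp_dst[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# Estimate the homography with RANSAC and warp the source into the target frame.
H, mask = cv2.findHomography(pts_src, pts_dst, cv2.RANSAC, 5.0)
aligned = cv2.warpPerspective(img_src, H, (img_dst.shape[1], img_dst.shape[0]))
cv2.imwrite("aligned.png", aligned)
```

RANSAC makes the estimate robust to mismatched features, which matters when the reconstruction is noisy.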

I'm not sure of the exact approach the authors of the paper use, but from what I understand, they align a reconstruction of points and the corresponding measurement from the lensed camera. The text from their paper says:

> To achieve pixel-wise alignment between the image pairs, we first optically align the two cameras, then further calibrate by displaying a series of points on the computer monitor that span the field-of-view. We reconstruct these point images and compute the homography transform needed to co-align both cameras' coordinate systems. This transform is applied to all subsequent images.

What I currently do is much simpler, because all I care about is aligning my reconstructed image with the original image in order to train a reconstruction algorithm. (There is probably a better approach, which is why I will leave this issue open, and also to document the approach taken here):

  1. Measure data with only a lensless camera, as shown in this paper (to be published as part of the proceedings of the 2023 Optica Imaging Congress) and with the procedure described here. This yields a dataset of lensless RGB measurements, like this one of 10K examples from the CelebA dataset, measured with the lensless camera described in the above paper. The reasons for this setup are: (1) it is simpler (a single camera to control, no beamsplitter), and (2) we avoid "imitating" any distortions of the lens, since we directly use the original images as the ground truth (in the paper you referred to, they also had to correct for these distortions: "calibrate the lens distortion using OpenCV’s undistort camera calibration procedure").
  2. Manually align the reconstruction with the original RGB dataset that was displayed on the screen, by shifting a simulated version of the original image and cropping the region of interest from both the reconstruction and the simulated version (see the sketch right after this list). The simulated version of the original image is obtained by simply rescaling it and padding it according to the physical dimensions of the setup. The simulator is created here and its parameters are set here.
  3. Finally, check that the alignment looks alright by trying out some reconstructions and overlaying them on top of the simulated original image, as done here.
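
To make steps 2 and 3 more concrete, below is a rough sketch of the rescale / pad / shift / crop / overlay operations involved. The scaling factor, shift, and crop values are made-up placeholders (the actual simulator and default parameters are in the repo, at the links above):

```python
import cv2
import numpy as np

def simulate_displayed_image(original, sensor_shape, scaling, vertical_shift, horizontal_shift):
    """Rescale the original image to its apparent size on the sensor, pad it to the
    sensor resolution, and shift it to line up with the reconstruction.
    All parameter values used here are illustrative placeholders."""
    new_h = int(sensor_shape[0] * scaling)
    new_w = int(original.shape[1] * new_h / original.shape[0])
    resized = cv2.resize(original, (new_w, new_h), interpolation=cv2.INTER_LINEAR)

    # Pad symmetrically (with black) to the full sensor resolution.
    pad_h = sensor_shape[0] - new_h
    pad_w = sensor_shape[1] - new_w
    padded = np.pad(
        resized,
        ((pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2), (0, 0)),
    )

    # Shift (values found by trial-and-error) so the simulated image overlaps the reconstruction.
    return np.roll(padded, shift=(vertical_shift, horizontal_shift), axis=(0, 1))

# -- Example usage with made-up values --------------------------------------
original = cv2.imread("celeba_0.png")        # image shown on the screen
recon = cv2.imread("lensless_recon_0.png")   # reconstruction from the measurement
sensor_shape = recon.shape[:2]

simulated = simulate_displayed_image(original, sensor_shape, scaling=0.7,
                                     vertical_shift=-50, horizontal_shift=10)

# Crop the same region of interest from both, then overlay to check the alignment.
top, bottom, left, right = 100, 400, 150, 500   # placeholder crop
recon_crop = recon[top:bottom, left:right]
simulated_crop = simulated[top:bottom, left:right]
overlay = cv2.addWeighted(recon_crop, 0.5, simulated_crop, 0.5, 0)
cv2.imwrite("overlay_lensed_recon_0.png", overlay)
```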

Doing this alignment requires some trial-and-error to find the right number of pixels to shift by and the region to crop out. But it only has to be done once, and for the CelebA dataset we've measured, those parameters are the defaults of our Dataset object for loading this dataset.
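
To illustrate what "baking in" those defaults means, here is a simplified, hypothetical dataset wrapper; the class name, file layout, and parameter values below are assumptions for illustration, not the actual Dataset implementation in LenslessPiCam:

```python
import glob

import numpy as np
import torch
from torch.utils.data import Dataset

class AlignedLenslessDataset(Dataset):
    """Hypothetical loader returning (measurement, aligned ground truth) pairs,
    applying the shift/crop values found once by trial-and-error."""

    def __init__(self, measurement_dir, simulated_dir,
                 vertical_shift=-85, horizontal_shift=-15,  # placeholder defaults
                 crop=(60, 420, 80, 540)):                  # placeholder (top, bottom, left, right)
        self.measurements = sorted(glob.glob(f"{measurement_dir}/*.npy"))
        self.simulated = sorted(glob.glob(f"{simulated_dir}/*.npy"))
        self.shift = (vertical_shift, horizontal_shift)
        self.crop = crop

    def __len__(self):
        return len(self.measurements)

    def __getitem__(self, idx):
        measurement = np.load(self.measurements[idx])   # raw lensless measurement (H, W, 3)
        target = np.load(self.simulated[idx])           # simulated original image (H, W, 3)

        # Apply the fixed shift and crop so the target lines up with a reconstruction.
        target = np.roll(target, shift=self.shift, axis=(0, 1))
        top, bottom, left, right = self.crop
        target = target[top:bottom, left:right]

        measurement = torch.from_numpy(measurement).permute(2, 0, 1).float()
        target = torch.from_numpy(target).permute(2, 0, 1).float()
        return measurement, target
```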

Below is an example of aligning the measurement with the ground truth image:

Raw data: (image: lensless_raw_0)

Reconstruction: (image: lensless_recon_0)

Original image simulated (and shifted), i.e. the ground truth for training: (image: lensed_0)

Reconstruction and ground truth cropped and overlaid: (image: overlay_lensed_recon_0)

And those outputs (the reconstruction and the simulated original, both cropped) can then be used for training, as in "Learned reconstructions for practical mask-based lensless imaging", with this script and the proper Hydra configuration 🚀
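
For completeness, here is a sketch of how a Hydra-configured training entry point can look, reusing the hypothetical dataset wrapper from above. The config path, keys, and script structure are assumptions for illustration, not the actual training script in the repository:

```python
import hydra
from omegaconf import DictConfig
from torch.utils.data import DataLoader

# Hypothetical import: the dataset wrapper sketched in the previous snippet.
from aligned_dataset import AlignedLenslessDataset

@hydra.main(version_base=None, config_path="configs", config_name="train_celeba")
def main(config: DictConfig):
    # Hypothetical config keys holding the alignment defaults and training settings.
    dataset = AlignedLenslessDataset(
        measurement_dir=config.files.measurements,
        simulated_dir=config.files.simulated,
        vertical_shift=config.alignment.vertical_shift,
        horizontal_shift=config.alignment.horizontal_shift,
        crop=tuple(config.alignment.crop),
    )
    loader = DataLoader(dataset, batch_size=config.training.batch_size, shuffle=True)
    for measurement, target in loader:
        ...  # forward pass of the reconstruction network + loss against the aligned target

if __name__ == "__main__":
    main()
```

With Hydra, any of these values can then be overridden from the command line, e.g. `python train.py training.batch_size=16`.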

NOTE: A lot of this is work-in-progress in PR #96, which will hopefully be merged in the next two months (with documentation). In the meantime, I hope the above is helpful, and I'm happy to answer any other questions!

Thank you very much for your very detailed answer.