This is an attempt to create a modern Multi-View Stereo (MVS), which is modern in two meanings:
- The code should be written using modern standards (at least C++11) and modern practices to make it safe, easy to understand and maintainable (e.g. COLMAP).
- The algorithms should be updated with state-of-the-art methods, like ACMM/ACMP. However, official implementations of these algorithms are unreadable and full of bugs.
The goal of this repository is to bring together the best of both worlds. MVS algorithms take a set of calibrated images (i.e. with both intrinsic and extrinsic parameters) and produce pixelwise estimates of depth, normal and cost, which can then be back-projected in 3D to obtain a full point cloud. For example, currently the code produce the following output for a scene from https://www.eth3d.net/:
The code has been tested in Ubuntu 20 with a NVIDIA Tesla K80 GPU, but it should work with any reasonably recent version of CUDA and other required libraries.
Currently, there are the following dependencies:
cuda-api-wrappers
, which is included inlib/
and linked statically. This is just a modern wrapper around CUDA host-side API, which we all know being awfully old-fashioned.RapidJson
is required to parse the configuration file. It can be easily installed withapt
and it could even be removed in the future, if another file format is chosen.OpenCV
is needed to handle images IO and simple operations (such as resizing). I don't see a future where this can be removed, but it is very easy to install withapt
.Eigen
is used for linear algebra. The main reason I decided to add Eigen is because it can be used inside CUDA kernels and it makes the code much more readable.OpenCV
cannot be used in device-side code and raw arrays (e.g. in COLMAP) make things much less user-friendly (even if probably they are slightly faster).PCL
is used only to store and save the resulting point cloud. I use this for convenience, but in theory it could be removed as PLY files are easy to write anyway.
I use CMake as build system and the code can be built by simply running ./scripts/bash/make.sh
. This will take care of the classical CMake pipeline and it should adapt to the available number of cores.
Currently, you need to run the code by executing the binary in the build/
folder and passing the configuration file as the only argument. There are plans to modify this in the future (see TODO list). The input folder is assumed to be structured as required by MVSNet (https://github.com/YoYo000/MVSNet), mainly for compatibility. The correct format can be generated with existing repositories (https://github.com/morsingher/sfm_to_mvs), starting from the output of the COLMAP Structure-from-Motion library. The code will dump its output in the specified output folder.
I am more than happy to welcome any kind of suggestion or contribution. You can do it with both Github tools (such as Issues and Pull Requests) or by contacting me at marco.orsingher@unipr.it
. Please format code using the .clang-format
file provided. This can be done by running ./scripts/bash/format.sh
.
This repository is a work in progress and above all it serves for learning purposes. You can start to fix the TODOs below or you can read the code and point out other weak spots.
In order to increase stability and correctness:
- Write a bash script that can be run as
./scripts/bash/run.sh <data_folder>
and it takes care of everything - Look for TODOs in the code and fix them. They are either bad or unclear parts
- Does
io_utils.*
make sense or should it just be removed? - Headers should be included with correct folder and only where needed
- Set pointers as
__restrict__
in CUDA code, where applicable - Pick a design choice between C++ exceptions and
if
/else
statements - Classes are designed poorly (i.e. just to make them work) and should be improved
- Include unit tests where possible to verify code correctness
- Start Doxygen documentation
- A lot of stuff in CUDA kernels could be moved to the configuration file. But should we do it? I don't know
In order to add new features:
- Implement planar priors from https://arxiv.org/abs/1912.11744
- Implement multi-scale approach from https://arxiv.org/abs/1904.08103
- Include global refinement from https://github.com/rossimattia/depth-refinement-and-normal-estimation (this requires to modify refinement code to correctly rotate and store normals)
- Offer keypoint initialization
- Experiment with depth-normal consistency term
I would like to thank the authors of https://github.com/GhiXu/ACMMP, https://github.com/kysucix/gipuma and https://colmap.github.io/ to open source their work and their papers, from which I took a lot of inspiration. Moreover, thanks to eyalroz
for his awesome CUDA wrapper: https://github.com/eyalroz/cuda-api-wrappers!