
PatchFusion

An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation

Website: https://zhyever.github.io/patchfusion/ | Paper: arXiv:2312.02284 | Hugging Face Space | Hugging Face Model | License: MIT

Zhenyu Li, Shariq Farooq Bhat, Peter Wonka.
KAUST

DEMO

Our official Hugging Face demo is available here! You can test it with your own high-resolution images, even without a local GPU. Depth prediction plus ControlNet generation takes only about 1 minute!

Thanks for the kind support from hysts!

Environment setup

The project depends on:

  • pytorch (Main framework)
  • timm (Backbone helper for MiDaS)
  • ZoeDepth (Main baseline)
  • ControlNet (For potential application)
  • pillow, matplotlib, scipy, h5py, opencv (utilities)

Install the environment using environment.yml:

Using mamba (fastest):

mamba env create -n patchfusion --file environment.yml
mamba activate patchfusion

Using conda:

conda env create -n patchfusion --file environment.yml
conda activate patchfusion
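
After activating the environment, a quick sanity check can confirm the core dependencies are importable. This is a hypothetical snippet, not part of the repo; it only uses the packages listed above and reports whether PyTorch sees a GPU:

# Hypothetical sanity check, not part of the repo: confirm the key
# dependencies from environment.yml import cleanly and report GPU visibility.
import torch
import timm
import cv2

print(f"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")
print(f"timm {timm.__version__}, OpenCV {cv2.__version__}")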

Pre-Trained Models

Download our pre-trained model here and place the checkpoint at nfs/patchfusion_u4k.pt in preparation for the following steps.

If you want to try the ControlNet demo, please download the pre-trained ControlNet model here and place the checkpoint at nfs/control_sd15_depth.pth.
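
Before launching the demos, you can confirm the checkpoints are where the scripts expect them. This is a minimal sketch, not part of the repo; the paths are the ones given above, and loading on CPU simply verifies the files deserialize:

# Hypothetical checkpoint check, not part of the repo.
import os
import torch

for ckpt in ("nfs/patchfusion_u4k.pt", "nfs/control_sd15_depth.pth"):
    if not os.path.exists(ckpt):
        print(f"missing: {ckpt}")
        continue
    state = torch.load(ckpt, map_location="cpu")  # CPU load: just a deserialization test
    print(f"{ckpt}: OK ({type(state).__name__})")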

Gradio Demo

We provide a UI demo built using gradio. To get started, install the UI requirements:

pip install -r ui_requirements.txt

Launch the gradio UI for depth estimation or image-to-3D:

python ./ui_prediction.py --model zoedepth_custom --ckp_path nfs/patchfusion_u4k.pt --model_cfg_path ./zoedepth/models/zoedepth_custom/configs/config_zoedepth_patchfusion.json

Launch the gradio UI for depth-guided image generation with ControlNet:

python ./ui_generative.py --model zoedepth_custom --ckp_path nfs/patchfusion_u4k.pt --model_cfg_path ./zoedepth/models/zoedepth_custom/configs/config_zoedepth_patchfusion.json

User Inference

  1. Put your images in a folder, e.g. path/to/your/folder.

  2. Run the inference script:

    python ./infer_user.py --model zoedepth_custom --ckp_path nfs/patchfusion_u4k.pt --model_cfg_path ./zoedepth/models/zoedepth_custom/configs/config_zoedepth_patchfusion.json --rgb_dir path/to/your/folder --show --show_path path/to/show --save --save_path path/to/save --mode r128 --boundary 0 --blur_mask
  3. Check the visualization results in path/to/show and the depth results in path/to/save, respectively (a sketch for inspecting a saved depth map follows below).
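
For a quick look at the saved outputs, the sketch below (hypothetical, not part of the repo) loads one depth file from path/to/save and renders it with a colormap. The exact on-disk format is determined by infer_user.py, so adjust the reading step if the script saves raw arrays instead of images; the filename example.png is a placeholder:

# Hypothetical visualization of a saved depth map, not part of the repo.
# Assumes infer_user.py wrote an image file that OpenCV can read.
import cv2
import matplotlib.pyplot as plt

depth = cv2.imread("path/to/save/example.png", cv2.IMREAD_UNCHANGED)
plt.imshow(depth, cmap="magma")
plt.colorbar()
plt.axis("off")
plt.show()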

Args

  • We recommend using --blur_mask to reduce patch artifacts, though we did not use it in our standard evaluation process.
  • --mode: select from p16, p49, and rn, where n is the number of randomly added patches (e.g. r128, as in the command above).
  • Please refer to infer_user.py for more details.

Citation

If you find our work useful for your research, please consider citing the paper:

@article{li2023patchfusion,
    title={PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation},
    author={Zhenyu Li and Shariq Farooq Bhat and Peter Wonka},
    year={2023},
    eprint={2312.02284},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
