hetolin / PourIt

Code for "PourIt!: Weakly-supervised Liquid Perception from a Single Image for Visual Closed-Loop Robotic Pouring" ICCV2023


PourIt!: Weakly-Supervised Liquid Perception from a Single Image for Visual Closed-Loop Robotic Pouring

(Figure: PourIt! architecture)

πŸ“ Overview

This repository contains the PyTorch implementation of the paper "PourIt!: Weakly-Supervised Liquid Perception from a Single Image for Visual Closed-Loop Robotic Pouring" [PDF] [Supp] [arXiv]. Our approach recovers the 3D shape of the detected liquid.

For more results and robotic demos, please refer to our Webpage.

πŸ“š Dependencies

  • Python >= 3.6
  • PyTorch >= 1.7.1
  • CUDA >= 10.1

βš™οΈ Installation

conda create -n pourit python=3.6
conda activate pourit

pip install -r requirements.txt

πŸ“Š Prepare Dataset

Using PourIt! dataset

Download the PourIt! dataset. Unzip and organize the files in ./data as follows,

data
└── PourIt
    β”œβ”€β”€ seen
    └── unseen
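Before training, it can help to confirm that the dataset was unzipped into the layout above. The sketch below is a hypothetical helper (not part of this repository) that checks for the `seen` and `unseen` splits:

```python
import os

# Hypothetical helper: sanity-check that the PourIt! dataset was unzipped
# into the expected ./data/PourIt layout before training.
def check_pourit_layout(root="data/PourIt"):
    expected = ["seen", "unseen"]
    # Return the names of any expected split directories that are missing;
    # an empty list means the layout looks correct.
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]
```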

Optional: Using your own dataset

If you want to use your own data, please organize the files as follows,

data
└── PourIt_additional
    β”œβ”€β”€ ori_scene1
    β”‚    β”œβ”€β”€ water
    β”‚    β”‚    β”œβ”€β”€ 000000_rgb.png
    β”‚    β”‚    β”œβ”€β”€  ...
    β”‚    β”‚    └── 000099_rgb.png
    β”‚    β”‚
    β”‚    └── water_no
    β”‚        β”œβ”€β”€ 000000_rgb.png
    β”‚        β”œβ”€β”€  ...
    β”‚        └── 000099_rgb.png
    β”‚        
    β”œβ”€β”€ ori_scene2
    β”œβ”€β”€ ori_scene3
    β”œβ”€β”€ ...
    └── ori_scene10

Note: The water folder stores the RGB images with flowing liquid, while the water_no folder stores the RGB images without flowing liquid.
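The `water` / `water_no` split is the image-level (weak) supervision the method relies on. As an illustration only, a hypothetical helper for gathering the two image lists of one scene might look like this:

```python
import glob
import os

# Hypothetical helper: gather the weakly-labelled image lists for one scene.
# Images under `water` are positives (flowing liquid present); images under
# `water_no` are negatives (no flowing liquid).
def collect_scene_images(scene_dir):
    pos = sorted(glob.glob(os.path.join(scene_dir, "water", "*_rgb.png")))
    neg = sorted(glob.glob(os.path.join(scene_dir, "water_no", "*_rgb.png")))
    return pos, neg
```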

Then run the pre-processing code to process your own data (skip this step if you are using the PourIt! dataset).

# For example, set the number of scenes in preprocess/process_origin_data.py
SCENE_NUM = 10

# Then execute
python preprocess/process_origin_data.py

⏳ Training

Download the ImageNet-1k pre-trained weights mit_b1.pth from the official SegFormer implementation and move them to ./pretrained.

# train on PourIt! dataset
bash launch/train.sh

# visualize the log
tensorboard --logdir ./logs/pourit_ours/tb_logger --bind_all

# type ${HOST_IP}:6006 into your browser to visualize the training results

# evaluation on PourIt! dataset (seen and unseen scenes)
bash launch/eval.sh 

πŸͺ„ Demo

Download the PourIt! pre-trained weights and move them to ./logs/pourit_ours/checkpoints.

Demo1 (Online 2D liquid detection)

  1. Set up your camera, e.g. kinect_azure or realsense

# launch the ROS node of the kinect_azure camera
roslaunch azure_kinect_ros_driver driver.launch

  2. Run demo.py

python demo.py

# online 2d liquid prediction
predictor.inference(liquid_2d_only=True)

You will then see the results of 2D liquid detection.
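For a quick visual check, the predicted binary liquid mask can be blended onto the RGB frame. This is a hypothetical visualization sketch (`overlay_liquid_mask` is not part of this repository), assuming the predictor yields a per-pixel binary mask:

```python
import numpy as np

# Hypothetical visualization: blend a predicted binary liquid mask onto the
# RGB frame as a red overlay, e.g. for inspecting 2D detection results.
def overlay_liquid_mask(rgb, mask, alpha=0.5):
    rgb = rgb.astype(np.float32)
    out = rgb.copy()
    red = np.array([255.0, 0.0, 0.0])
    # Blend only the pixels the mask marks as liquid.
    out[mask > 0] = (1 - alpha) * rgb[mask > 0] + alpha * red
    return out.astype(np.uint8)
```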

Demo2 (Online 3D liquid detection)

  1. Set up your camera, e.g. kinect_azure or realsense

# launch the ROS node of the kinect_azure camera
roslaunch azure_kinect_ros_driver driver.launch

  2. Launch your pose estimator, e.g. SAR-Net, to publish the estimated pose transformation of the container '/bottle' relative to '/rgb_camera_link' (kinect_azure) or '/camera_color_frame' (realsense)

  3. Run demo.py

python demo.py

# online 3d liquid prediction
predictor.inference()
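Conceptually, lifting the 2D detection to 3D amounts to back-projecting the masked pixels through the depth map with the camera intrinsics. The sketch below is an illustrative stand-alone version of that step (function name and signature are our own, not the repository's API); the actual predictor additionally uses the published container pose:

```python
import numpy as np

# Illustrative sketch: back-project pixels inside a predicted 2D liquid mask
# to 3D camera-frame points using a depth map and pinhole intrinsics
# (fx, fy: focal lengths; cx, cy: principal point).
def backproject_liquid(mask, depth, fx, fy, cx, cy):
    v, u = np.nonzero(mask)          # pixel rows (v) and columns (u) in the mask
    z = depth[v, u]                  # metric depth at those pixels
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # (N, 3) point cloud
```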

Demo3 (Offline 3D liquid detection)

If you don't have a camera available, you can run our example in offline mode:

# offline 3d liquid prediction
predictor.process_metadata_multiple('./examples/src')

The reconstructed 3D point cloud of the liquid will be saved in ./examples/dst.

πŸ”– Citation

If you find our work helpful, please consider citing:

@InProceedings{Lin_2023_ICCV,
    author    = {Lin, Haitao and Fu, Yanwei and Xue, Xiangyang},
    title     = {PourIt!: Weakly-Supervised Liquid Perception from a Single Image for Visual Closed-Loop Robotic Pouring},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {241-251}
}

🌹 Acknowledgment

Our implementation leverages code from AFA; we thank the authors for their work. We also thank Dr. Connor Schenck for providing the UW Liquid Pouring Dataset.


License: Apache License 2.0

