Description

This repo contains ROS nodes for object recognition from RGB-D data captured with an Intel RealSense D435. The pipeline has several steps:

Segment the RGB image and get the mask of one object, using Mask R-CNN for segmentation and KNN for classification with iterative learning
Apply the mask to the RGB and depth images to get the point cloud of the object
Find the oriented bounding box of the point cloud (with outlier removal)
Find the coefficients of a plane fitted to the point cloud of the complete scene
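
The geometric steps above can be sketched in plain NumPy. This is a minimal illustration, not the code used in this repo: the function names, the pinhole intrinsics (fx, fy, cx, cy), the PCA-based box, and the SVD plane fit are all assumptions chosen for clarity. Outlier removal is omitted, and the box center computed here is the geometric center of the box rather than the mass center of the points.

```python
import numpy as np

def masked_depth_to_points(depth_m, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels (depth in metres) to an (N, 3) cloud."""
    valid = mask.astype(bool) & (depth_m > 0)   # object pixels with a depth reading
    v, u = np.nonzero(valid)                    # pixel rows (v) and columns (u)
    z = depth_m[v, u]
    x = (u - cx) * z / fx                       # pinhole camera model
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def pca_oriented_box(points):
    """Return (axes, center, dimensions) of a PCA-aligned bounding box.

    axes has the major, middle and minor directions as rows.
    """
    mean = points.mean(axis=0)
    centered = points - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    order = np.argsort(eigvals)[::-1]           # largest variance first
    axes = eigvecs[:, order].T
    axes[2] = np.cross(axes[0], axes[1])        # keep the frame right-handed
    local = centered @ axes.T                   # points expressed in the box frame
    lo, hi = local.min(axis=0), local.max(axis=0)
    center = mean + ((lo + hi) / 2) @ axes      # box center back in the input frame
    return axes, center, hi - lo

def fit_plane(points):
    """Least-squares plane a*x + b*y + c*z + d = 0; returns [a, b, c, d]."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]                             # direction of least variance
    return np.append(normal, -normal @ centroid)
```

A plain least-squares fit like fit_plane is sensitive to points that do not belong to the plane; on a cluttered scene a robust estimator such as RANSAC would normally be preferred.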

Docker setup

Create a workspace folder with a src folder inside it, then clone this repo into src:

mkdir -p ~/ros_ws/src
cd ~/ros_ws/src
git clone https://github.com/IvDmNe/grasping_vision.git
cd grasping_vision

Change line 7 in run_docker.sh according to your workspace location: -v your_path_to_workspace:/ws \

If you run the RealSense in Gazebo, change the topic in scripts/metric_learning_segmentation_node.py at line 84 from aligned_depth_to_color to depth.

Install nvidia-container-toolkit to use GPU in docker: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

Build docker image

sh build_docker.sh

Run

Run ROS core: roscore

(Optionally) Open rviz: rviz -d rviz_config.rviz

Start docker

sh run_docker.sh

Prepare ros project and build it:

cd ws
catkin_make
source devel/setup.bash --extend

Launch the nodes for segmentation and bounding box calculation:

cd src/grasping_vision
roslaunch launch/launch_them_all.launch

Open another terminal and run command_node.py:

sudo docker ps  # get the name of the running container
sudo docker exec -it -w /ws/src/grasping_vision/scripts name_of_container bash
python3 command_node.py

Usage

In command_node, the user can enter one of the following commands:

inference (default)
train {name of object}
give {name of object}

In inference mode, the segmentation node segments the image and classifies each object.
In train mode, the node stores all images of an object for 30 seconds and then feeds their deep features into the KNN classifier.
In give mode, the image is segmented and the bounding box coordinates of the requested object are published to the /obb_array topic. After a "give" command is sent, the array is published continuously until a new "give" command is sent.
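
The interplay between train and inference modes can be illustrated with a small k-nearest-neighbours classifier that accepts new labelled samples at any time. This is a hedged sketch only: the class TinyKNN, its methods, and the two-dimensional feature vectors are hypothetical, and the node's actual feature extractor and classifier may differ.

```python
import numpy as np

class TinyKNN:
    """Hypothetical incremental k-NN classifier over feature vectors."""

    def __init__(self, k=3):
        self.k = k
        self.features = []   # stored feature vectors
        self.labels = []     # object name for each stored vector

    def add(self, feats, label):
        """Train mode: store one or more feature vectors under a label."""
        for f in np.atleast_2d(np.asarray(feats, dtype=float)):
            self.features.append(f)
            self.labels.append(label)

    def predict(self, feat):
        """Inference mode: majority vote among the k nearest stored vectors."""
        if not self.features:
            return None      # nothing has been learned yet
        dists = np.linalg.norm(np.stack(self.features) - np.asarray(feat), axis=1)
        nearest = np.argsort(dists)[: self.k]
        votes = [self.labels[i] for i in nearest]
        return max(set(votes), key=votes.count)

knn = TinyKNN(k=3)
knn.add([[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]], "cup")      # "train cup"
knn.add([[5.0, 5.1], [5.1, 5.0], [5.0, 5.0]], "mouse")    # "train mouse"
label = knn.predict([0.05, 0.05])                          # inference on a new view
```

Because new samples are simply appended, adding an object requires no retraining of the segmentation network, which is what makes this kind of classifier convenient for iterative learning.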

The /obb_array topic carries a one-dimensional float32 array that describes the pose of the bounding box in the following format: [major_vector, middle_vector, mass_center, dimensions] (12 elements in total). The major vector represents the X-axis and the middle vector the Y-axis. The dimensions entry is the size of the bounding box in the coordinate system whose axes are the major vector, the middle vector, and the cross product of the two.
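
A consumer of this array might unpack it as follows. This is a sketch under stated assumptions: decode_obb and obb_corners are hypothetical helpers, the four vectors are assumed to be stored in the listed order with three components each, and the mass center is used as the box center when computing corners, which is only approximately true for unevenly distributed points.

```python
import numpy as np

def decode_obb(data):
    """Split the 12-element array into named vectors and a rotation matrix."""
    arr = np.asarray(data, dtype=np.float32)
    if arr.size != 12:
        raise ValueError(f"expected 12 elements, got {arr.size}")
    major, middle, center, dims = arr.reshape(4, 3)
    minor = np.cross(major, middle)                       # third (Z) axis of the box
    rotation = np.stack([major, middle, minor], axis=1)   # columns are the box axes
    return {"rotation": rotation, "center": center, "dimensions": dims}

def obb_corners(obb):
    """Return the 8 box corners in the frame the array was published in."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    half = obb["dimensions"] / 2
    # Scale each sign pattern to a half-extent offset, rotate it, then translate.
    return obb["center"] + (signs * half) @ obb["rotation"].T

# Example: an axis-aligned 2 x 4 x 6 box centered at (1, 2, 3).
obb = decode_obb([1, 0, 0,  0, 1, 0,  1, 2, 3,  2, 4, 6])
corners = obb_corners(obb)
```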

In the base setup, the algorithm can recognize six objects: a cup, a cleaner, a mouse, a rugby ball, a Rubik's cube, and a RealSense box.

Examples of detection:
