6D - Pose Annotation Tool (6D-PAT)

For detailed explanations, check out the wiki page.

What is this program for?

With 6D-PAT you can create 6D annotations on images for 6D pose estimation, i.e. annotate 2D images with the 3D rotation and 3D translation of 3D models.

Quickstart (only for Ubuntu)

Get the latest AppImage from the releases page or use Docker.
Download zipped example data.
Unzip the zip file in a location of your liking. If you want to use Docker, remember to mount the data directory.
Give the downloaded AppImage permissions to be executed.
Execute with double-click.
Go to Settings -> Paths and point to the directories unpacked from the zip files. You should then be able to see the example data.
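The permission step above can also be done from a terminal. The following is a minimal sketch, not part of 6D-PAT: the helper name `make_appimage_executable` and the file name in the usage comment are hypothetical, so substitute the name of the file you actually downloaded.

```shell
# Hypothetical helper: give a downloaded AppImage the execute permission.
make_appimage_executable() {
    appimage="$1"
    # Fail early with a readable message if the path is wrong.
    if [ ! -f "$appimage" ]; then
        echo "AppImage not found: $appimage" >&2
        return 1
    fi
    # Add the execute bit for the current user only.
    chmod u+x "$appimage"
}

# Usage (placeholder file name, adjust to your download):
#   make_appimage_executable "$HOME/Downloads/6D-PAT.AppImage"
#   "$HOME/Downloads/6D-PAT.AppImage"
```

Once the execute bit is set, the AppImage can be started either by double-click or by calling its path in the terminal, as in the last usage line.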

How does it work?

The program allows you to select a folder and view the images contained in it in a gallery. Selecting one of the images will display it at a larger scale to create new 6D pose annotations. The 3D models for those annotations are displayed in a second gallery which also loads the models from a specified folder. In the 3D viewer of the program, you can inspect a selected 3D model, rotate it and use it to create a new pose annotation.

The whole annotation process:

annotation.3.mp4

Browsing images and modifying poses:

intro.mp4

Objects and images are from the T-Less Dataset.

Getting the program

Running the AppImage

Note: The AppImage is built on Ubuntu 20.04 (the latest version) and thus requires you to have Ubuntu 20.04. If your Ubuntu version doesn't match, you could try a virtual machine or build the program yourself.

You can download the latest AppImage from the releases page. It contains everything the program needs to run and should work on the latest Ubuntu out of the box.

Run the Docker image

Enable X-server display for Docker:

xhost +local:root

Run the Docker image (the command has changed: omit /6DPAT at the end, it is now part of the entrypoint):

docker run -ti --rm -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix -v /dev/dri/card0:/dev/dri/card0 florianblume/6dpat

Check out the "getting the program" wiki page for more details.
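The Quickstart asks you to mount the data directory when using Docker, which the command above does not do yet. A minimal sketch of how that could look: the function name `run_6dpat` and the container path `/data` are assumptions chosen for illustration, and the host directory is whatever location you unzipped the example data to.

```shell
# Hypothetical wrapper around the docker command from this README.
# $1: host directory containing the unzipped example data.
run_6dpat() {
    data_dir="$1"
    # Same flags as above, plus one extra -v that mounts the data directory.
    # /data is an illustrative mount point, not something the image requires.
    docker run -ti --rm \
        -e DISPLAY="$DISPLAY" \
        -v /tmp/.X11-unix/:/tmp/.X11-unix \
        -v /dev/dri/card0:/dev/dri/card0 \
        -v "$data_dir":/data \
        florianblume/6dpat
}

# Usage (placeholder path, adjust to where you unzipped the data):
#   run_6dpat "$HOME/6dpat-example-data"
```

With this mount, the paths to select under Settings -> Paths would be the ones below `/data` inside the container. When you are done, `xhost -local:root` revokes the X server access granted earlier.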

Build from source

Requirements

	OpenGL	Qt	OpenCV	Python	Pybind11
Version	>=3.1	>=5.14	>=4.5	==3.8	==2.6.2

For OpenGL, Qt and OpenCV these are the minimum versions you need to have installed. Python needs to be exactly 3.8 because its C++ interface changes from version to version. I'm not sure about Pybind11, which is why I'm assuming you need exactly this version. If you don't want to install Qt manually and build OpenCV yourself, you can install the dependencies like this, for example:

sudo add-apt-repository -y ppa:beineri/opt-qt-5.14.2-focal
sudo apt-get update -qq
sudo apt-get -y install qt514-meta-minimal qt5143d qt514gamepad python3 python3-dev python3-pybind11
sudo apt-get -y install libopencv-dev

Then open the project's main 6d-pat.pro file in QtCreator and build the project. Everything should compile successfully. If not, feel free to open an issue and I'll try to help you.
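If the build fails, it can help to compare the installed versions against the requirements table first. The sketch below is not part of the project: `version_ge` and `check_6dpat_requirements` are hypothetical helpers, the comparison relies on GNU `sort -V`, and it assumes `qmake` and the `opencv4` pkg-config file are on your search paths. OpenGL and Pybind11 are not checked here.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes GNU sort with -V (version sort), as shipped with Ubuntu.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: prints one status line each for Python, Qt and OpenCV.
check_6dpat_requirements() {
    # Each query falls back to an empty string if the tool is missing.
    py="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null || true)"
    qt="$(qmake -query QT_VERSION 2>/dev/null || true)"
    cv="$(pkg-config --modversion opencv4 2>/dev/null || true)"

    if [ "$py" = "3.8" ]; then
        echo "Python $py: ok"
    else
        echo "Python ${py:-missing}: need exactly 3.8"
    fi
    if [ -n "$qt" ] && version_ge "$qt" 5.14; then
        echo "Qt $qt: ok"
    else
        echo "Qt ${qt:-missing}: need >= 5.14"
    fi
    if [ -n "$cv" ] && version_ge "$cv" 4.5; then
        echo "OpenCV $cv: ok"
    else
        echo "OpenCV ${cv:-missing}: need >= 4.5"
    fi
}

# Usage:
#   check_6dpat_requirements
```

A line that does not end in "ok" points at the dependency to look at before opening an issue.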

Setting up the program the first time

Check out the "program setup" wiki page to see in detail how to set up the program.

Recovering poses

To start recovering poses, follow these steps:

1. Select an image
2. Select the corresponding object model
3. Rotate the object model to a similar position as visible in the image
4. Click on the image on a characteristic point of the object
5. Click the same point on the 3D model - the program will show the number of click points at the bottom left
6. Repeat steps 4 - 5 until at least 6 correspondences have been created - more correspondences help to make the pose more accurate
7. Click the "Create" button at the bottom of the pose editor
8. You should see the recovered pose on the image
9. You can refine it using the number fields or by dragging or rotating it directly with the mouse after selecting the 3D model
10. After pose refinement, don't forget to press "Save"

More steps and details are on the wiki page.

Citation

If you use my program in your research, please cite it using GitHub's citation functionality in the right menu bar.

About

6D - Pose Annotation Tool (6D-PAT) is a tool that allows the user to load a set of images and a set of 3D models and annotate where in the 2D image the 3D object is placed.

GNU General Public License v3.0
