itamar8910 / 3DInteractionClassification

A project in 3D Imagery. Detect when two people are looking at / touching each other.

3D vision course final project

This project was built in the University of Haifa's 3D vision course, led by Prof. Hagit Hel-or.
In this project we aim to detect different types of interaction between people using two approaches.

Walkthrough:

  1. Problem definition
  2. Input
  3. First approach - 3D reconstruction
  4. Second approach - 3D estimation
  5. Output demo
  6. Video demo
  7. Running instructions
  8. Credits

Problem definition

Problem: Given 4 videos from 4 different cameras,
we aim to detect the type of interaction between people and when it happened.
We currently support 4 types of interaction:

  • No interaction
  • Touch
  • Look
  • Both

Main challenge: The system cannot assume anything about the scene

Input

We receive as input: 4 RGB videos, each taken from a different corner of the scene.

First approach - 3D reconstruction


Our first approach was to reconstruct the 3D scene and measure the distance between the "looking" vectors of each pair of people in the scene. To do this, we first calibrate our 4 cameras.

Then, we perform the following steps (a sketch of steps 4-6 follows the list):

  1. Detect each person using OpenPose
  2. Recognize each person's identity
  3. Find the (x, y) coordinates of both eyes and the nose
  4. Find (x, y, z) coordinates using two cameras
  5. Find each person's face plane
  6. Get the plane's normal => looking direction
  7. Classify the interaction
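
A minimal sketch of steps 4-6, assuming calibration produced a 3x4 projection matrix per camera; the function names and OpenCV usage here are illustrative, not our actual code:

```python
# Illustrative sketch of steps 4-6; P1, P2 are the 3x4 projection
# matrices obtained from calibration.
import cv2
import numpy as np

def triangulate(P1, P2, pts1, pts2):
    """Triangulate Nx2 pixel correspondences from two cameras into Nx3 points."""
    homog = cv2.triangulatePoints(P1, P2,
                                  np.asarray(pts1, dtype=float).T,
                                  np.asarray(pts2, dtype=float).T)
    return (homog[:3] / homog[3]).T  # dehomogenize

def looking_direction(left_eye, right_eye, nose):
    """Unit normal of the face plane spanned by the two eyes and the nose.
    The normal's sign is ambiguous and must be disambiguated, e.g. by
    requiring it to point away from the person's body."""
    normal = np.cross(right_eye - left_eye, nose - left_eye)
    return normal / np.linalg.norm(normal)
```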

This approach worked poorly, because calibration introduced many errors. We wanted a different approach that does not require calibration.

Second approach - 3D estimation

Instead of trying to reconstruct the 3D scene, we tried to estimate it. We used deep learning and image processing algorithms to approximate the 3D scene.

Our detection steps (hedged sketches of each step follow the list):
1. Detect people using tinyFaces: using a deep-learning detector (the tinyFace model, a CNN architecture) we can find the face dimensions of people in the scene with high accuracy, even when faces are very small and at low resolution.

2. Recognize each person's identity: using deep-learning one-shot person recognition, with a fallback to HSV color detection, we can identify the people in the scene (given a few pictures of them taken beforehand).

3. Find each person's distance from the camera: using the size of the person's face and their body proportions extracted with OpenPose, we can estimate the person's distance from the camera.

4. Get the looking direction: using the gazer library, which applies deep learning and image-processing techniques, we can find the person's gaze. We also implemented a fallback heuristic based on the person's face and nose location to handle cases where gazer fails.

5. Classify the interaction: finally, we compute the L2 distance between the two gaze vectors originating at the people's noses, and raise an interaction detection when it falls below a certain threshold.
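
Step 1, sketched with a stand-in detector: the tinyFaces model itself is not reproduced here, so OpenCV's bundled Haar cascade plays its role purely for illustration.

```python
# Stand-in for step 1: detect face boxes in a frame. OpenCV's Haar cascade
# replaces the tinyFaces CNN here, for illustration only.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame_bgr):
    """Return (x, y, w, h) face boxes found in a BGR frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
```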
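Step 2's HSV fallback, as a rough sketch: compare a person crop's hue/saturation histogram against reference histograms built from the pictures taken beforehand. Names and bin counts are illustrative.

```python
# Sketch of the HSV fallback: match a crop to the closest known person
# by hue/saturation histogram correlation.
import cv2

def hs_histogram(bgr_crop):
    """Normalized 2D hue/saturation histogram of a person crop."""
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    return cv2.normalize(hist, hist)

def identify(crop, references):
    """references: dict of name -> histogram; returns the best-matching name."""
    hist = hs_histogram(crop)
    scores = {name: cv2.compareHist(hist, ref, cv2.HISTCMP_CORREL)
              for name, ref in references.items()}
    return max(scores, key=scores.get)
```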
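Step 3, under a pinhole-camera assumption: the distance follows from similar triangles. Both constants below are assumed placeholders, not calibrated values.

```python
# Sketch of step 3: pinhole-model distance estimate from face width.
AVG_FACE_WIDTH_M = 0.15   # assumed average adult face width (meters)
FOCAL_LENGTH_PX = 1000.0  # assumed camera focal length (pixels)

def distance_from_camera(face_width_px):
    """Similar triangles: d = f * W / w."""
    return FOCAL_LENGTH_PX * AVG_FACE_WIDTH_M / face_width_px

# Example: a 50-pixel-wide face comes out at roughly 3 meters.
```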
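Step 4's fallback heuristic, roughly: the nose's horizontal offset from the eye midpoint gives a crude head yaw. The mapping to a 3D direction below is illustrative.

```python
# Sketch of the step 4 fallback: a centered nose suggests facing the
# camera; a shifted nose suggests the head is turned.
import numpy as np

def gaze_fallback(left_eye, right_eye, nose):
    """left_eye, right_eye, nose: (x, y) image coordinates."""
    left_eye, right_eye, nose = map(np.asarray, (left_eye, right_eye, nose))
    eye_mid = (left_eye + right_eye) / 2.0
    # Nose offset normalized by inter-eye distance -> rough yaw angle.
    yaw = np.clip((nose[0] - eye_mid[0]) / np.linalg.norm(right_eye - left_eye),
                  -1.0, 1.0) * np.pi / 2
    # Unit vector in camera coordinates: +x right, -z toward the camera.
    return np.array([np.sin(yaw), 0.0, -np.cos(yaw)])
```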
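Step 5, one possible reading of the rule above: person A looks at person B when A's gaze direction is close, in L2, to the direction from A's nose to B's nose; combined with the touch signal this yields the four classes. The threshold is an assumed value.

```python
# Sketch of step 5: mutual-gaze test plus touch signal -> interaction class.
import numpy as np

LOOK_THRESHOLD = 0.5  # assumed: max L2 gap between unit direction vectors

def unit(v):
    return v / np.linalg.norm(v)

def looks_at(nose_a, gaze_a, nose_b):
    """True if A's gaze vector points roughly at B's nose."""
    return np.linalg.norm(unit(gaze_a) - unit(nose_b - nose_a)) < LOOK_THRESHOLD

def classify(nose_a, gaze_a, nose_b, gaze_b, touching):
    look = looks_at(nose_a, gaze_a, nose_b) and looks_at(nose_b, gaze_b, nose_a)
    if look and touching:
        return "Both"
    if touching:
        return "Touch"
    if look:
        return "Look"
    return "No interaction"
```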

Output demo

We were finally able to detect the "looking" interaction: whether two people are looking at each other, who the people are, and how confident we are in the classification.

We then tested our system on multiple people and got great results. The system handles as many people as fit in the camera's view.

To handle touch detection, we leverage the OpenPose skeletons.
For each camera, we check whether the distance between people's skeletons is below a certain threshold. If this holds for all cameras, we can be confident it is a touch. After evaluation, this method works well and can detect touch between multiple people.
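
A minimal sketch of this check, assuming each skeleton is a (K, 2) array of OpenPose keypoints per camera; the pixel threshold is an assumed value.

```python
# Sketch of the touch check: flag a touch only when the closest pair of
# keypoints is under the threshold in every camera view.
import numpy as np

TOUCH_THRESHOLD_PX = 40  # assumed pixel threshold

def skeletons_touch(skel_a, skel_b):
    """True if the closest pair of keypoints is under the threshold."""
    dists = np.linalg.norm(skel_a[:, None, :] - skel_b[None, :, :], axis=-1)
    return dists.min() < TOUCH_THRESHOLD_PX

def touch_detected(per_camera_pairs):
    """per_camera_pairs: one (skel_a, skel_b) tuple per camera; require all."""
    return all(skeletons_touch(a, b) for a, b in per_camera_pairs)
```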

Video Demo

A short video demonstrating our results.

Running instructions

Please refer to our howto.txt for running instructions.

Credits

This project was built in the University of Haifa's 3D vision course.
The project was led by Prof. Hagit Hel-or.

Project members:

Feel free to contact us.
