rsohlot / sh_face_rec

Face recognition system for any IP Camera and openHAB


sh_face_rec - Smart Home Face Recognition

A simple face recognition system that can be used with any streaming camera and works with openHAB via REST communication. Runs on a Raspberry Pi.

  • Runs a small Flask-based HTTP server that receives video streams from IP cameras on the network and stores the frames in a queue.
  • A multiprocessing worker application processes the video frames and runs the face recognition pipeline on each of them (see the sketch after this list).
  • Pipeline (details see below):
    • Image Handling w/ OpenCV
    • Face Detection w/ MTCNN Tensorflow Implementation
    • Aligning/Cropping using dlib
    • Face Embedding with dlib CNN
    • Classification with scikit KNN
  • Once known faces are identified, openHAB is notified via a REST interface call (requires the REST API binding in OH).
  • If only unknown faces are identified, openHAB is likewise notified via a REST interface call.
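
A minimal sketch of the producer/consumer split described above, with hypothetical function names (the actual server and worker live in the sh_face_rec package):

```python
# Sketch of the frame-queue / worker split (hypothetical names;
# the real implementation lives in the sh_face_rec package).
import multiprocessing as mp

import cv2


def stream_frames(cam_url, frame_queue):
    """Producer: push frames from an IP camera stream into the queue."""
    cap = cv2.VideoCapture(cam_url)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame_queue.put(frame)
    cap.release()
    frame_queue.put(None)  # sentinel: no more frames


def recognition_worker(frame_queue):
    """Consumer: run the face recognition pipeline on each frame."""
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        # detect -> align -> embed -> classify, then notify openHAB
        # (placeholder for the pipeline described below)


if __name__ == "__main__":
    queue = mp.Queue(maxsize=64)
    worker = mp.Process(target=recognition_worker, args=(queue,))
    worker.start()
    stream_frames("http://CAM_URL/video.mjpg", queue)  # placeholder URL
    worker.join()
```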

Check the Wiki for a detailed description of possible face detection/recognition frameworks, including their evaluation.

Face Processing Pipeline

This pipeline is heavily inspired by the OpenFace pipeline. See further down for information on training the neural networks behind each pipeline step. For basic image handling (reading/streaming, writing, transforming, extracting and showing), OpenCV is used.

Face Detection

As a first step we need to detect any shapes in the video frames that look like faces. Luckily there are many frameworks available that can do this. The challenge is to find “good enough” accuracy for my use case (a 640x480, blurry and dark camera feed) while balancing the resource requirements, i.e. the achievable processing FPS (frames per second). See the Wiki for the framework evaluation. I went with an MTCNN implementation of a face detector. Implementation: davidsandberg's implementation of MTCNN for face detection (GitHub); a usage sketch follows below.
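
A minimal sketch of how the facenet MTCNN detector is typically driven, assuming the repo's align package; minsize, thresholds and scale factor are common defaults, not necessarily the values this repo uses:

```python
# Sketch of driving the facenet MTCNN detector (TF1-style API; parameter
# values are common defaults, not verified against this repo's config).
import cv2
import tensorflow as tf

import align.detect_face  # the repo's align folder (from davidsandberg/facenet)

with tf.Graph().as_default():
    sess = tf.Session()
    # loads the P-, R- and O-net models shipped with the align package
    pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)

img = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
boxes, landmarks = align.detect_face.detect_face(
    img,
    20,                # minsize: smallest face to look for, in pixels
    pnet, rnet, onet,
    [0.6, 0.7, 0.7],   # per-stage score thresholds
    0.709,             # scale factor for the image pyramid
)
# boxes is an Nx5 array: x1, y1, x2, y2, confidence
```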

Face Alignment and Crop

Once I have bounding boxes around all the faces in the frame, I use dlib's pose predictor to align the faces and crop them to a standard format of 64x64 pixels. Implementation: face alignment and landmark extraction with dlib (5-point pose predictor); a sketch follows below.
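
A minimal sketch of the alignment step, assuming the published dlib 5-point model file and an example bounding box from the detector:

```python
# Sketch: align and crop one detected face with dlib's 5-point predictor.
import cv2
import dlib

sp = dlib.shape_predictor("models/shape_predictor_5_face_landmarks.dat")

img = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
x1, y1, x2, y2 = 100, 100, 200, 200             # example box from the detector
rect = dlib.rectangle(left=x1, top=y1, right=x2, bottom=y2)

shape = sp(img, rect)                           # locate the 5 landmarks
chip = dlib.get_face_chip(img, shape, size=64)  # aligned 64x64 crop
```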

Face Embedding

This step creates a 128-dimensional vector representation (embedding) of every face that a classifier can then use to determine whether we have a match or not. I went with the FaceNet approach of using a pretrained neural network to create these face embeddings. See the Wiki for the frameworks that I evaluated. I decided to use dlib's ResNet network to create the 128-dimensional face vectors. Implementation: dlib CNN face encoder (dlib), which comes with a network pretrained on about 3 million face images from the VGG and FaceScrub datasets; a sketch follows below.
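
Continuing from the alignment sketch, computing the embedding with dlib's published ResNet model looks roughly like this:

```python
# Sketch: compute a 128-dimensional face embedding with dlib's ResNet.
import dlib

facerec = dlib.face_recognition_model_v1(
    "models/dlib_face_recognition_resnet_model_v1.dat")

# img and shape come from the alignment step above
descriptor = facerec.compute_face_descriptor(img, shape)  # length-128 vector
embedding = list(descriptor)
```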

Face Classification

With the 128-dimensional embeddings I can train a classifier to detect known faces. In this pipeline step, every newly detected face embedding is fed into the classifier to determine whether we have a known face (threshold comparison) or not. Implementation: KNN classifier from scikit-learn; a sketch follows below.
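
A minimal sketch of a thresholded KNN lookup with scikit-learn, using synthetic stand-in data; 0.6 is the distance threshold commonly used with dlib embeddings, while the repo's own classifier is built by trainclassifier.py:

```python
# Sketch: thresholded KNN classification of face embeddings (synthetic data).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
known_embeddings = rng.normal(size=(10, 128))  # stand-in for real embeddings
known_labels = ["alice"] * 5 + ["bob"] * 5     # hypothetical names

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(known_embeddings, known_labels)

def classify(embedding, threshold=0.6):
    """Return the matched name, or None if the nearest known face is too far."""
    distances, _ = knn.kneighbors([embedding], n_neighbors=1)
    if distances[0][0] > threshold:
        return None                            # treat as unknown face
    return knn.predict([embedding])[0]
```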

Setup and Performance

The application is written to run on a Raspberry Pi 3. Python 3 is required for multiprocessing (multithreading is terribly slow). I used Anaconda as the environment manager; use the env.yml file in the root directory to set up a working Anaconda environment (conda env create -f env.yml). Configuration in config.ini is pretty self-explanatory.

Image/video handling is done with OpenCV, so any streaming IP camera or videos from the file system should work (not tested). Performance with 640x480 videos (MJPEG) on the RPi 3:

  • 3 FPS if no faces are in frame
  • 0.7 FPS with one face in frame
  • 0.4 FPS with 2 faces in frame

Repository Folder Structure

  • sh_face_rec: contains all the code
    • align folder: MTCNN detector code (from facenet)
    • config.ini: all config data for application
    • logging.conf: logging configuration
    • startserver.py: the main entry point
  • models: contains pre-trained models for CNN and classifier models (need to be downloaded. links below.)
  • test: contains unit tests for the components and visual-debugging test cases for the whole application
  • testing_data: contains videos/images for testing
  • training_data: contains labeled faces for training classifier
  • openHAB: example code for openHAB integration

Startup

Preparation

  • Configuration: all config settings need to be made in config.ini (use config_template.ini and rename it)
  • Download the pre-trained models for the detector and face recognition (see below)
  • Train the classifier on your target platform (due to cross-platform pickle issues)
  • Configure the model names in config.ini
  • Configure uwsgi_start.ini
  • Run the tests first

Production (uWSGI Server)

  • Although the application works with Flask's internal Werkzeug debug server, I recommend using a production server such as uWSGI.
  • The face recognition server is started with ./start.sh (requires uWSGI as the production server).
  • A new streaming and face recognition job can be started with curl -i -H "Content-Type: application/json" -X POST -d '{"URL":"CAM_URL", "Time": "10"}' SERVER_URL:SERVER_PORT/detectJSON (a Python equivalent is shown below).
  • The whole HTTP API is listed in startserver.py.
  • The config file for the uWSGI server is uwsgi_start.ini. Other servers such as gunicorn or gevent do not work with multiprocessing. The processes attribute should be greater than 2, since the application forks two additional processes from the main one.
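
For reference, the same /detectJSON call issued from Python; SERVER_URL, SERVER_PORT and CAM_URL are placeholders, as in the curl example:

```python
# Sketch: start a streaming job via the REST API with the requests library.
import requests

resp = requests.post(
    "http://SERVER_URL:SERVER_PORT/detectJSON",  # placeholder host and port
    json={"URL": "CAM_URL", "Time": "10"},       # camera URL and duration in s
)
print(resp.status_code, resp.text)
```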

Training & Models

  • The face recognition pipeline uses neural networks for the first three steps. If you do not want to train the networks yourself, you need to download a pretrained model for each of the pipeline steps. Models need to be placed in the model_path folder.
  • Which models to use is configured via the class variables in facerecognizer.py.
  • Default models:
  1. Face detection: the three MTCNN models (P-, R- and O-net) are required. Download from the facenet/align GitHub.
  2. Face alignment: depending on your selection, the 5-point or 68-point shape predictor is required. Download from the dlib git.
  3. Face encoding: the dlib face recognition ResNet model is required. Download from the dlib git.
  4. Face classification: a trained KNN classifier is required. Use the trainclassifier.py script and manual to create the model and place it in model_path (a training sketch follows below).
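
A minimal sketch of what training and pickling the classifier could look like, assuming a training_data/<person>/*.jpg layout (suggested by the folder structure above) and a hypothetical compute_embedding helper standing in for the detect/align/embed steps; trainclassifier.py is the authoritative version:

```python
# Sketch: train and pickle the KNN classifier on the target platform.
# compute_embedding is a hypothetical stand-in for the detect/align/embed
# pipeline steps shown above; the folder layout is assumed, not verified.
import os
import pickle

from sklearn.neighbors import KNeighborsClassifier

embeddings, labels = [], []
for person in os.listdir("training_data"):
    person_dir = os.path.join("training_data", person)
    if not os.path.isdir(person_dir):
        continue
    for image_name in os.listdir(person_dir):
        embeddings.append(compute_embedding(os.path.join(person_dir, image_name)))
        labels.append(person)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(embeddings, labels)

# pickle on the target platform (the pickle is not portable across platforms)
with open("models/knn_classifier.pkl", "wb") as f:
    pickle.dump(knn, f)
```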

Testing

For visual inspection of the pipeline, the two test cases

  • test_facerecognizer
  • test_classifier

can be used. Execution: python -m unittest test.test_classifier

Test_Facerecognizer

  • Operates on single image
  • Loads image, runs all pipeline steps
  • visualizes landmarks and bounding boxes
  • shows each single face (known and unknown) with distance and original size information

Execution: python3 -m unittest test.test_facerecognizer

Test_Classifier

  • Operates on video
  • executes full recognition stack on video
  • visualizes bounding boxes and detected names
  • can write the output to a file

Execution: python3 -m unittest test.test_classifier

API Examples

The Flask server API takes simple REST calls. I use Postman to debug the API and the application. openHAB issues the API calls via REST.

Examples

New streaming job from CAM_URL for 10 seconds on server SERVER_URL:PORT: curl -i -H "Content-Type: application/json" -X POST -d '{"URL":"CAM_URL", "Time": "10"}' SERVER_URL:SERVER_PORT/detectJSON

List statistics: curl -i SERVER_URL:SERVER_PORT/getStats

Get the i-th known person: curl -i SERVER_URL:SERVER_PORT/getKnown/<int:index>

Get the i-th known face: curl -i SERVER_URL:SERVER_PORT/getKnownFace/<int:index>

Get the i-th unknown face: curl -i SERVER_URL:SERVER_PORT/getUnknownFace/<int:index>

Get the last complete frame: curl -i SERVER_URL:SERVER_PORT/getLastFrame
