leizi007 / JDA

C++ implementation of Joint Cascade Face Detection and Alignment.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

JDA

C++ implementation of Joint Cascade Face Detection and Alignment.

Fetech Code

I recommend using Git to fetch the source code. If you are not familiar with Git, there is a tutorial you can follow.

$ git clone --recursive https://github.com/luoyetx/JDA.git

OR

$ git clone https://github.com/luoyetx/JDA.git
$ cd JDA
$ git submodule update --init

If you directly download the zip file, please remember to download luoyetx/liblinear and luoyetx/jsmnpp, then extract the source code to 3rdparty. liblinear is used for global regression training and jsmnpp is used for json config parsing.

Build

We use CMake to build the project, I highly recommend you to use this build tool. We also need OpenCV. If you are on Windows, make sure you have set environment variable OpenCV_DIR to OpenCV's build directory like D:/3rdparty/opencv2.4.11/build. You may also need Visual Studio to compile the source code. If you are on Linux or Unix, install the development packages of OpenCV via your system's Package Manager like apt-get on Ubuntu or yum on CentOS. However, Compile the source code of OpenCV will be the best choice of all.

$ cd JDA
$ mkdir build && cd build
$ cmake ..
$ make

Config

We use config.json for configuration. config.template.json is a template, please copy one and rename it to config.json. Attention, all relative path is start from build directory, and please use / instead of \\ even if you are on Windows platform.

Data

You should prepare your own data. You need two kinds of data, face with landmarks and background images. You also need to create a text file face.txt and some background.txt text files which can be changed in config.json. Every line of face.txt indicates a face image's path with its landmarks and face bounding box, all points are aligned to the left top of the image. The number of landmarks can be changed in config.json and the order of landmarks does not matter.

../data/face/00001.jpg bbox_x bbox_y bbox_w bbox_h x1 y1 x2 y2 ........
../data/face/00002.jpg bbox_x bbox_y bbox_w bbox_h x1 y1 x2 y2 ........
....
....

bbox in face.txt indicate the face region. You can turn on data augment which will flip the face, but you also need to give symmetric landmarks index for flip operation. If bbox is out of range of the original image, the program will fill the rest region with black.

background.txt is much more simpler. Every line indicates where the background image in the file system.

../data/bg/000001.jpg
../data/bg/000002.jpg
../data/bg/000003.jpg
....
....

Background images should have no face and we will do data augment during the hard negative mining. Of course, you can use absolute path to indicate where is your face images and background images. However, don't use any space character in your image path or non ASCII characters.

After loading the face images, the code will snapshot a binary data under data/dump with file name like jda_data_%s_stage_1_cart_1080.data, you can copy the data file to data/jda_train_data.data. Next time you start the training, it will load data directly from this binary data file.

Optional Init negative samples

It's a good idea to prepare the initial negative samples by yourself rather than scan from the background images. You can turn on the optional hard negative in config.json and provide a text file like background.txt, every line indicts a negative patch and will be loaded and resized. The initial negative samples will also be snapshotted to a binary file data/dump/hard.data. The config config.data.background[0] should be hard.txt or hard.data even if you turn off use_hard.

UPDATE I have shared the data I have collected. For more details, see this issue.

Train

$ ./jda train

If you are using Visual Studio, make sure you know how to pass command line arguments to the program. All trained model file will be saved to model directory.

Model Layout

All model file is saved as a binary file. The model parameters have two data type, 4 byte int and 8 byte double, please pay attention to the endianness of you CPU.

|-- mask (int)
|-- meta
|    |-- T (int)
|    |-- K (int)
|    |-- landmark_n (int)
|    |-- tree_depth (int)
|    |-- current_stage_idx (int) // training status
|    |-- current_cart_idx (int)
|-- mean_shape (double, size = 2*landmark_n)
|-- stages
|    |-- stage_1
|    |    |-- cart_1
|    |    |-- cart_2
|    |    |-- ...
|    |    |-- cart_K
|    |    |-- global regression weight
|    |-- stage_2
|    |-- ...
|    |-- stage_T
|-- mask (int)

For more details of the model file layout, please refer to cascador.cpp and cart.cpp.

FDDB Benchmark

FDDB is widely used for face detection evaluation, download the data and extract to data directory.

|-- data
|    |-- fddb
|         |-- images
|         |    |-- 2002
|         |    |-- 2003
|         |-- FDDB-folds
|         |    |-- FDDB-fold-01.txt
|         |    |-- FDDB-fold-01-ellipseList.txt
|         |    |-- ....
|         |-- result

You should prepare fddb data and model file. All result text file used by npinto/fddb-evaluation is under result directory.

$ ./jda fddb

Attention

Welcome any bug report and any question or idea through the issues.

QQ Group

There is a QQ group 347185749. If you are a Tencent QQ user, welcome to join this group and we can discuss more there.

License

BSD 3-Clause

References

About

C++ implementation of Joint Cascade Face Detection and Alignment.

License:BSD 3-Clause "New" or "Revised" License


Languages

Language:C++ 86.9%Language:C 12.0%Language:CMake 1.2%