High-resolution networks (HRNets) for facial landmark detection

Introduction

This is the official code of High-Resolution Representations for Facial Landmark Detection. We extend the high-resolution representation (HRNet) [1] by augmenting the high-resolution representation by aggregating the (upsampled) representations from all the parallel convolutions, leading to stronger representations. The output representations are fed into classifier. We evaluate our methods on four datasets, COFW, AFLW, WFLW and 300W.

Performance

ImageNet pretrained models

HRNetV2 ImageNet pretrained models are now available! Codes and pretrained models are in HRNets for Image Classification

We adopt HRNetV2-W18(#Params=9.3M, GFLOPs=4.3G) for facial landmark detection on COFW, AFLW, WFLW and 300W.

COFW

The model is trained on COFW train and evaluated on COFW test.

Model	NME	FR_0.1	pretrained model	model
HRNetV2-W18	3.45	0.20	HRNetV2-W18	HR18-COFW.pth

AFLW

The model is trained on AFLW train and evaluated on AFLW full and frontal.

Model	NME_full	NME_frontal	pretrained model	model
HRNetV2-W18	1.57	1.46	HRNetV2-W18	HR18-AFLW.pth

WFLW

NME	test	pose	illumination	occlution	blur	makeup	expression	pretrained model	model
HRNetV2-W18	4.60	7.86	4.57	5.42	5.36	4.26	4.78	HRNetV2-W18	HR18-WFLW.pth

300W

NME	common	challenge	full	test	pretrained model	model
HRNetV2-W18	2.91	5.11	3.34	3.85	HRNetV2-W18	HR18-300W.pth

Quick start

Environment

This code is developed using on Python 3.6 and PyTorch 1.0.0 on Ubuntu 16.04 with NVIDIA GPUs. Training and testing are performed using 1 NVIDIA P40 GPU with CUDA 9.0 and cuDNN 7.0. Other platforms or GPUs are not fully tested.

Install

Install PyTorch 1.0 following the official instructions
Install dependencies

pip install -r requirements.txt

Clone the project

git clone https://github.com/HRNet/HRNet-Facial-Landmark-Detection.git

HRNetV2 pretrained models

cd HRNet-Facial-Landmark-Detection
# Download pretrained models into this folder
mkdir hrnetv2_pretrained

Data

You need to download the annotations files which have been processed from OneDrive.
You need to download images (300W, AFLW, WFLW) from official websites and then put them into images folder for each dataset.

Your data directory should look like this:

HRNet-Facial-Landmark-Detection
-- lib
-- experiments
-- tools
-- data
   |-- 300w
   |   |-- face_landmarks_300w_test.csv
   |   |-- face_landmarks_300w_train.csv
   |   |-- face_landmarks_300w_valid.csv
   |   |-- face_landmarks_300w_valid_challenge.csv
   |   |-- face_landmarks_300w_valid_common.csv
   |   |-- images
   |-- aflw
   |   |-- face_landmarks_aflw_test.csv
   |   |-- face_landmarks_aflw_test_frontal.csv
   |   |-- face_landmarks_aflw_train.csv
   |   |-- images
   |-- cofw
   |   |-- COFW_test_color.mat
   |   |-- COFW_train_color.mat  
   |-- wflw
   |   |-- face_landmarks_wflw_test.csv
   |   |-- face_landmarks_wflw_test_blur.csv
   |   |-- face_landmarks_wflw_test_expression.csv
   |   |-- face_landmarks_wflw_test_illumination.csv
   |   |-- face_landmarks_wflw_test_largepose.csv
   |   |-- face_landmarks_wflw_test_makeup.csv
   |   |-- face_landmarks_wflw_test_occlusion.csv
   |   |-- face_landmarks_wflw_train.csv
   |   |-- images

Train

Please specify the configuration file in experiments (learning rate should be adjusted when the number of GPUs is changed).

python tools/train.py --cfg <CONFIG-FILE>
# example:
python tools/train.py --cfg experiments/wflw/face_alignment_wflw_hrnet_w18.yaml

Test

python tools/test.py --cfg <CONFIG-FILE> --model-file <MODEL WEIGHT> 
# example:
python tools/test.py --cfg experiments/wflw/face_alignment_wflw_hrnet_w18.yaml --model-file HR18-WFLW.pth

Other applications of HRNets (codes and models):

Citation

If you find this work or code is helpful in your research, please cite:

@inproceedings{SunXLW19,
  title={Deep High-Resolution Representation Learning for Human Pose Estimation},
  author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
  booktitle={CVPR},
  year={2019}
}

@article{SunZJCXLMWLW19,
  title={High-Resolution Representations for Labeling Pixels and Regions},
  author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao 
  and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
  journal   = {CoRR},
  volume    = {abs/1904.04514},
  year={2019}
}

Reference

[1] Deep High-Resolution Representation Learning for Human Pose Estimation. Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. CVPR 2019. download

xindongzhang / HRNet-Facial-Landmark-Detection