
SCANet: Real-Time Face Parsing Using Spatial and Channel Attention

The official repository of "SCANet: Real-Time Face Parsing Using Spatial and Channel Attention", presented at the 2023 20th International Conference on Ubiquitous Robots (UR).

by Seung-eun Han and Ho-sub Yoon, UST (University of Science & Technology) and ETRI (Electronics and Telecommunications Research Institute), Korea.



Abstract

This paper presents a real-time face parsing method that is efficient and robust to small facial components. The proposed approach utilizes two separate attention networks, namely the Spatial and Channel Attention Networks (SCANet), to integrate local features with global dependencies and focus on the most critical contextual features. Specifically, the Spatial attention module (SAM) captures the spatial relationships between different facial features, while the Channel attention module (CAM) identifies important features within each channel of the feature map, such as skin texture or eye color. Moreover, an edge detection branch, which helps differentiate edge and non-edge pixels, is added to improve segmentation precision along edges. To address class imbalance issues, which arise from limited data on accessories such as necklaces and earrings, we utilize a weighted cross-entropy loss function that assigns higher weights to rare classes. The proposed method outperforms state-of-the-art methods on the CelebAMask-HQ dataset, especially in small facial classes like necklaces and earrings. Additionally, the model is designed to operate in real-time, making it a promising solution for various face recognition and analysis applications.
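For readers who want a concrete picture of the class-imbalance handling described above, here is a minimal PyTorch sketch of a weighted cross-entropy loss with inverse-frequency class weights. The class list and pixel counts below are illustrative assumptions, not the values used in the paper.

import torch
import torch.nn as nn

# Hypothetical per-class pixel counts measured on a training set.
# The actual class set and statistics used by SCANet may differ.
pixel_counts = torch.tensor(
    [5.0e8, 1.2e8, 9.0e7, 3.0e6, 8.0e5, 4.0e5], dtype=torch.float32
)  # e.g. [background, skin, hair, eyes, earrings, necklace]

# Inverse-frequency weights, normalized so the average weight is 1.0,
# so rare classes (necklace, earrings) receive larger weights.
weights = pixel_counts.sum() / (len(pixel_counts) * pixel_counts)

criterion = nn.CrossEntropyLoss(weight=weights)

# logits: (N, C, H, W) network output; labels: (N, H, W) integer class map.
logits = torch.randn(2, len(pixel_counts), 64, 64)
labels = torch.randint(0, len(pixel_counts), (2, 64, 64))
loss = criterion(logits, labels)
print(loss.item())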


Figure 1



✨ Demo ✨

Face Parsing Demo (GIF)



✨ Application ✨

We have developed a face parsing network that operates in real time on desktop and mobile devices.

On Desktop 🖥

From Webcam

Webcam demo (GIF)
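A runnable webcam script is not reproduced here; as a rough sketch of how such a real-time loop could be wired up with OpenCV, the snippet below uses a hypothetical parse_face placeholder in place of the actual SCANet inference and a placeholder colorize palette.

import cv2
import numpy as np

def parse_face(frame_bgr):
    """Hypothetical placeholder for the SCANet forward pass.
    Expected to return an (H, W) integer class map for the input frame."""
    h, w = frame_bgr.shape[:2]
    return np.zeros((h, w), dtype=np.uint8)  # dummy output

def colorize(mask):
    """Map class indices to colors for display (placeholder palette)."""
    palette = np.random.RandomState(0).randint(0, 255, (256, 3), dtype=np.uint8)
    return palette[mask]

cap = cv2.VideoCapture(0)  # default webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    mask = parse_face(frame)
    overlay = cv2.addWeighted(frame, 0.6, colorize(mask), 0.4, 0)
    cv2.imshow("Face parsing demo (sketch)", overlay)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()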

🤩 Make Up Simulation 🤩

Make-up simulation demo (GIF)

🆕 Face Reconstruction 🆕

Face reconstruction demo (GIF)



Visualization

A comparison of our face parsing results with those of the previous state-of-the-art model, DML-CSR: (a) is the original image, (b) is the corresponding ground truth, (c) is the result of DML-CSR, and (d) is the result of our model. Our approach enables more detailed segmentation of rare facial components, particularly necklaces.

Figure 2



Dataset

CelebAMask-HQ

You can download this dataset here.

Then organize the dataset folders as shown below:

./CelebAMask
    |---test
    |---train
        |---images
            |---00000.jpg
            |---00001.jpg
        |---labels
            |---00000.png
            |---00001.png
        |---edges
            |---00000.png
            |---00001.png
    |---valid
    |---label_names.txt
    |---test_list.txt
    |---train_list.txt
        |---'images/00000.jpg labels/00000.png'
        |---'images/00001.jpg labels/00001.png'
    |---valid_list.txt

You can generate the train/valid/test_list.txt files with this code.
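If you prefer to write the list files yourself, a minimal sketch that follows the 'images/00000.jpg labels/00000.png' format shown above could look like this (the root path is an assumption):

import os

def write_list(split_dir, out_path):
    """Write one 'images/<name>.jpg labels/<name>.png' line per image in a split."""
    image_dir = os.path.join(split_dir, "images")
    names = sorted(os.path.splitext(f)[0] for f in os.listdir(image_dir)
                   if f.endswith(".jpg"))
    with open(out_path, "w") as fp:
        for name in names:
            fp.write(f"images/{name}.jpg labels/{name}.png\n")

root = "./CelebAMask"  # adjust to your dataset location
for split in ("train", "valid", "test"):
    write_list(os.path.join(root, split), os.path.join(root, f"{split}_list.txt"))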



Pre-Processing

1. Make Label Images

You can generate the label images of the CelebAMask-HQ dataset with this code. You will need to adjust the paths below:

IMAGE_PATH = '$Your Data path$/CelebAMask-HQ/CelebA-HQ-img/'
ANNOTATIOM_PATH = '$Your Data path$/CelebAMask-HQ/CelebAMask-HQ-mask-anno_acc'
SAVE_PATH = "$The path where you want to save$"
INPUT_SIZE = $input size$
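As a rough sketch of what such a label-generation script typically does for CelebAMask-HQ, the snippet below merges the per-part binary masks (files such as 00000_skin.png spread over the numbered annotation subfolders) into one integer label map per image. The class order, folder names, and paths follow the public CelebAMask-HQ preprocessing convention and may differ from this repository's script.

import os
import glob
import numpy as np
from PIL import Image

# Class order used by the public CelebAMask-HQ preprocessing scripts;
# index 0 is background. The ordering used in this repository may differ.
PARTS = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear',
         'ear_r', 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth',
         'hair', 'hat']

ANNOTATION_PATH = 'CelebAMask-HQ/CelebAMask-HQ-mask-anno'  # adjust to your paths
SAVE_PATH = 'CelebAMask/train/labels'
INPUT_SIZE = 512

os.makedirs(SAVE_PATH, exist_ok=True)
for idx in range(30000):  # CelebAMask-HQ contains 30,000 images
    label = np.zeros((INPUT_SIZE, INPUT_SIZE), dtype=np.uint8)
    for cls, part in enumerate(PARTS, start=1):
        # Per-part masks live in numbered subfolders (0..14), 2,000 images each.
        matches = glob.glob(os.path.join(ANNOTATION_PATH, '*', f'{idx:05d}_{part}.png'))
        if matches:
            mask = np.array(Image.open(matches[0]).convert('L').resize(
                (INPUT_SIZE, INPUT_SIZE), Image.NEAREST))
            label[mask > 0] = cls
    Image.fromarray(label).save(os.path.join(SAVE_PATH, f'{idx:05d}.png'))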

2. Make Edge Images

You can generate the edge images of the CelebAMask-HQ dataset with this code. You will need to adjust the paths below:

generate_edge("$Your Label Path$", "The path where you want to save")
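generate_edge itself is provided in the linked code; conceptually, it converts each label map into a binary edge map. A minimal sketch of that idea, marking pixels whose right or bottom neighbour belongs to a different class, could look like the following (the repository's exact edge definition may differ). The call shown above would then write one edge image per label image.

import os
import numpy as np
from PIL import Image

def generate_edge(label_dir, save_dir):
    """Sketch: mark pixels whose right/bottom neighbour has a different class."""
    os.makedirs(save_dir, exist_ok=True)
    for name in sorted(os.listdir(label_dir)):
        if not name.endswith('.png'):
            continue
        label = np.array(Image.open(os.path.join(label_dir, name)))
        edge = np.zeros_like(label, dtype=np.uint8)
        edge[:, :-1] |= (label[:, :-1] != label[:, 1:]).astype(np.uint8)
        edge[:-1, :] |= (label[:-1, :] != label[1:, :]).astype(np.uint8)
        Image.fromarray(edge * 255).save(os.path.join(save_dir, name))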

The other extra pre-processing scripts are available here.

Example

Example label and edge images

The labels and edge images seen here have been multiplied by 10 and 255, respectively, for easier viewing.



Result

Our method outperforms state-of-the-art methods on the CelebAMask-HQ dataset, especially on small facial classes such as necklaces and earrings.

Table 1: Results on the CelebAMask-HQ dataset



Citation

@INPROCEEDINGS{10202537,
  author={Han, Seungeun and Yoon, Hosub},
  booktitle={2023 20th International Conference on Ubiquitous Robots (UR)}, 
  title={SCANet: Real-Time Face Parsing Using Spatial and Channel Attention}, 
  year={2023},
  volume={},
  number={},
  pages={13-18},
  doi={10.1109/UR57808.2023.10202537}}
