Understanding MediaPipe's Face Mesh Output

This repository collects resources for understanding the output of MediaPipe's Face Mesh models.

Background

MediaPipe Face Mesh is a solution that estimates the positions of face landmarks in a given input image. It returns a list of canonical (fixed) length and order that contains one 3D coordinate per face landmark. Although the coordinate at a given index in the returned list always represents the same landmark, the official documentation says very little about what each landmark encodes.
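To make the "one 3D coordinate per index" idea concrete, here is a minimal sketch of looking up a single landmark and converting it to pixel coordinates. It assumes each landmark is an object with `x`, `y` and `z` fields, that `x` and `y` are normalized to the image width and height, and that `z` uses roughly the same scale as `x`. The helper names `landmarkToPixels` and `getLandmarkInPixels` are hypothetical and not part of MediaPipe.

```javascript
// Convert one normalized landmark to pixel coordinates.
// Assumption: x and y are normalized to [0, 1] by image width and height,
// and z uses roughly the same scale as x.
function landmarkToPixels(landmark, imageWidth, imageHeight) {
  return {
    x: landmark.x * imageWidth,
    y: landmark.y * imageHeight,
    z: landmark.z * imageWidth,
  };
}

// Hypothetical helper: fetch the landmark stored at a given index.
// The index is what carries the meaning, so an out-of-range index is an error.
function getLandmarkInPixels(faceLandmarks, index, imageWidth, imageHeight) {
  if (!Number.isInteger(index) || index < 0 || index >= faceLandmarks.length) {
    throw new RangeError(`Landmark index ${index} is out of range`);
  }
  return landmarkToPixels(faceLandmarks[index], imageWidth, imageHeight);
}

// Usage with made-up data (a real result holds hundreds of landmarks).
const exampleFace = [
  { x: 0.5, y: 0.25, z: -0.01 },
  { x: 0.1, y: 0.9, z: 0.02 },
];
console.log(getLandmarkInPixels(exampleFace, 0, 640, 480));
```

In a real application, `faceLandmarks` would be one entry of the landmark lists delivered to the results callback rather than hand-written data.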

Landmarks visualized

The following images illustrate the meaning of each coordinate index by showing (1) the detected face landmarks drawn on top of a reference face image, (2) the same face landmarks without the reference face image, and (3) the face landmarks positioned according to the UV coordinates of a texture for the face mesh.

Each image is available both with labeled and unlabeled landmark indices.

We also show the detected landmarks for both the regular Face Mesh model and the refined one, which is enabled via faceMesh.setOptions({ refineLandmarks: true }); in JavaScript. The refined model shares the meaning of the first 468 landmark indices with the regular model (although it usually estimates their positions at slightly different coordinates) and adds 10 landmarks for the irises of the left and right eye.
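Because the first 468 indices mean the same thing in both models, code can treat the two outputs uniformly by splitting off the extra iris landmarks. The following is a minimal sketch: `splitLandmarks` and the two constants are hypothetical names, and the input is assumed to be a plain array with either 468 or 478 entries.

```javascript
// Counts taken from the description above: 468 shared mesh landmarks,
// plus 10 iris landmarks that only the refined model returns.
const MESH_LANDMARK_COUNT = 468;
const IRIS_LANDMARK_COUNT = 10;

// Hypothetical helper: separate the shared mesh landmarks from the iris ones.
function splitLandmarks(faceLandmarks) {
  const count = faceLandmarks.length;
  const refinedCount = MESH_LANDMARK_COUNT + IRIS_LANDMARK_COUNT;
  if (count !== MESH_LANDMARK_COUNT && count !== refinedCount) {
    throw new Error(`Unexpected landmark count: ${count}`);
  }
  return {
    mesh: faceLandmarks.slice(0, MESH_LANDMARK_COUNT),
    iris: faceLandmarks.slice(MESH_LANDMARK_COUNT), // empty for the regular model
    refined: count === refinedCount,
  };
}
```

The sketch deliberately does not say which of the 10 iris indices belong to which eye; the numbered images are the place to look that up.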

The code to generate these images is available in index.html and the accompanying JavaScript files.

Name	Unnumbered	Numbered
reference face image
landmarks with face
refined landmarks with face
landmarks
refined landmarks
uv coordinates

Sadly, no UV coordinates seem to be available for the iris landmarks of the refined model. Therefore, there is no UV coordinate visualization for the refined model.

Raw output

The following JSON files represent the raw output of the Face Mesh module:

landmarks.json
refined-landmarks.json
geometry.json (estimated via the regular, unrefined model)

Sadly, the refined model does not seem to be able to estimate face geometry.

Further resources

The MediaPipe project provides an official UV coordinate visualization that is similar to ours. However, the official one has a low resolution, and the landmark index numbers are hard to read. Note that the official one uses a tessellation different from ours.
The MediaPipe project provides canonical 3D face models in FBX and OBJ format. However, these 3D models seem to be outdated, as the order of their vertices does not match the landmarks returned by current Face Mesh models. These models use the same tessellation as the official UV coordinate visualization, which differs from ours.
The tessellation we use is the one available in the FACEMESH_TESSELATION constant of @mediapipe/face_mesh. Sadly, this package only seems to be publicly available in minified form. The same tessellation is used in the current Face Mesh Python package.
The UV coordinates used by us are from uv_coords.ts of the tfjs-models repository. Note that this is a link to an older commit. Sadly, no current official repository seems to provide these UV coordinates.
The most in-depth programmatic description of landmark indices is available in keypoints.ts of the tfjs-models repository. Note that, as in the point above, this links to an older commit of a since-removed file.
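A tessellation relates landmark indices to each other, which can help when exploring what a given index represents. The sketch below assumes the tessellation is given as an array of `[startIndex, endIndex]` pairs, as FACEMESH_TESSELATION appears to be, and builds a lookup of neighboring landmark indices. `buildNeighborMap` is a hypothetical helper, and the sample connections are made up, not taken from the real constant.

```javascript
// Hypothetical helper: map each landmark index to the set of indices
// it is connected to. Connections are treated as undirected edges.
function buildNeighborMap(connections) {
  const neighbors = new Map();
  const link = (from, to) => {
    if (!neighbors.has(from)) neighbors.set(from, new Set());
    neighbors.get(from).add(to);
  };
  for (const [a, b] of connections) {
    link(a, b);
    link(b, a);
  }
  return neighbors;
}

// Made-up connections: one triangle (0, 1, 2) plus an extra edge to 3.
// This is NOT the real tessellation.
const toyConnections = [[0, 1], [1, 2], [2, 0], [2, 3]];
const toyNeighbors = buildNeighborMap(toyConnections);

// Landmark 2 is connected to landmarks 0, 1 and 3.
console.log([...toyNeighbors.get(2)].sort((x, y) => x - y));
```

With the real constant in place of `toyConnections`, the same map could be used to list the landmarks surrounding any index shown in the numbered images.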

lschmelzeisen / understanding-mediapipe-facemesh-output
