Monocular Total Capture

Code for the CVPR 2019 paper "Monocular Total Capture: Posing Face, Body and Hands in the Wild"

Project website: http://domedb.perception.cs.cmu.edu/mtc.html

Dependencies

This code was tested on an Ubuntu 16.04 machine with a GTX 1080Ti GPU and the following dependencies:

ffmpeg
Python 3.5 (with TensorFlow 1.5.0, OpenCV, Matplotlib)
cmake >= 2.8
OpenCV 2.4.13 (compiled with CUDA)
Ceres-Solver 1.13.0 (with SuiteSparse)
OpenGL, GLUT, GLEW
libigl (https://github.com/libigl/libigl)
wget
OpenPose
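
Before installing, it can help to confirm that the command-line tools from the list above are on your PATH. The following is a minimal sketch, not part of this repository: the function name check_tools is hypothetical, and python3 is an assumed interpreter name that may differ on your system. It does not check library dependencies such as OpenCV, Ceres-Solver, or libigl.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the given tools are missing from PATH.
# Returns 0 when every tool is found, 1 otherwise.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Command-line tools named in the dependency list.
# "python3" is an assumed executable name; adjust it to your environment.
check_tools ffmpeg cmake wget python3 \
    || echo "Install the tools reported above before continuing."
```

This only verifies that the executables exist; it does not verify their versions.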

Installation

git clone this repository; suppose the main directory is ${ROOT} on your local machine.
"cd ${ROOT}"
"bash download.sh"
git clone OpenPose (https://github.com/CMU-Perceptual-Computing-Lab/openpose) and compile it. Suppose the main directory of OpenPose is ${openposeDir}, such that the compiled binary is at ${openposeDir}/build/examples/openpose/openpose.bin
Edit ${ROOT}/run_pipeline.sh: set line 13 to your ${openposeDir}
Edit ${ROOT}/FitAdam/CMakeLists.txt: set line 13 to the "include" directory of libigl (this is a header-only library)
"cd ${ROOT}/FitAdam/ && mkdir build && cd build"
"cmake .."
"make -j12"
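
A common installation problem is pointing run_pipeline.sh at an OpenPose directory that has not been compiled. As a rough sketch, the check below tests for the binary at the path given in the steps above. The function name has_openpose_binary and the fallback path $HOME/openpose are assumptions made for illustration, not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the given OpenPose directory contains
# an executable binary at the location described in the installation steps.
has_openpose_binary() {
    local openpose_dir="$1"
    [ -x "${openpose_dir}/build/examples/openpose/openpose.bin" ]
}

# openposeDir should match the value you set in run_pipeline.sh.
# The fallback "$HOME/openpose" is only an assumed example location.
openposeDir="${openposeDir:-$HOME/openpose}"
if has_openpose_binary "$openposeDir"; then
    echo "OpenPose binary found under $openposeDir"
else
    echo "OpenPose binary not found under $openposeDir; compile OpenPose first"
fi
```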

Usage

Suppose the video to be tested is named "${seqName}.mp4". Place it in "${ROOT}/${seqName}/${seqName}.mp4".
If the camera intrinsics are known, put them in "${ROOT}/${seqName}/calib.json" (refer to "POF/calib.json" for an example); otherwise, default camera intrinsics will be used.
In ${ROOT}, run "bash run_pipeline.sh ${seqName}"; if the subject in the video shows only the upper body, run "bash run_pipeline.sh ${seqName} -f".
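
The directory layout and flag choice described above can be sketched as a small helper. This is an illustration only: the function name stage_sequence and its "upper" argument are hypothetical and not part of this repository. It copies a video into ${ROOT}/${seqName}/ and prints the matching command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: place a video where run_pipeline.sh expects it
# and print the command to run from ${ROOT}.
# Arguments: <ROOT> <path/to/video.mp4> [upper]
# Pass "upper" as the third argument when only the upper body is visible.
stage_sequence() {
    local root="$1" video="$2" mode="${3:-full}"
    local seq_name
    # The sequence name is the file name without the .mp4 extension.
    seq_name="$(basename "$video" .mp4)"
    mkdir -p "${root}/${seq_name}" || return 1
    cp "$video" "${root}/${seq_name}/${seq_name}.mp4" || return 1
    if [ "$mode" = "upper" ]; then
        echo "bash run_pipeline.sh ${seq_name} -f"
    else
        echo "bash run_pipeline.sh ${seq_name}"
    fi
}
```

For instance, calling stage_sequence "$ROOT" /path/to/talk.mp4 upper (with a placeholder video path) copies the file to ${ROOT}/talk/talk.mp4 and prints "bash run_pipeline.sh talk -f".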

Examples

"download.sh" automatically downloads two example videos for testing. After a successful installation, run:

bash run_pipeline.sh example_dance

bash run_pipeline.sh example_speech -f

License and Citation

This code can only be used for non-commercial research purposes. If you use this code in your research, please cite the following papers.

@inproceedings{xiang2019monocular,
  title={Monocular total capture: Posing face, body, and hands in the wild},
  author={Xiang, Donglai and Joo, Hanbyul and Sheikh, Yaser},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

@inproceedings{joo2018total,
  title={Total capture: A 3d deformation model for tracking faces, hands, and bodies},
  author={Joo, Hanbyul and Simon, Tomas and Sheikh, Yaser},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Parts of this code are modified from lmb-freiburg/hand3d.

Adam Model

We use the deformable human model Adam in this code.

The relationship between Adam and SMPL: The body part of Adam is derived from the SMPL model. It follows SMPL's body joint hierarchy but uses a different joint regressor. Adam does not contain the original SMPL model's shape and pose blendshapes; instead, it uses its own version trained on the Panoptic Studio database.

The facial expressions of the Adam model are unavailable due to copyright issues.
