jianwang-mpi / awesome-egocentric-pose

A list of egocentric (first-person) motion capture works and related resources

Awesome Egocentric Human Pose

A list of awesome egocentric human body pose estimation works and related resources. While some repositories, such as awesome-egocentric-vision, compile studies across the wide field of egocentric vision, none focuses specifically on the niche area of egocentric human body pose estimation.

Contents

We split this topic by different capture setups:

Egocentric Inside-In Pose Estimation

In the inside-in setup, cameras or sensors worn by the motion capture subject are directed back toward the subject's own body, so the body is captured from the wearer's position. This setup can be seen on the Oculus Quest 2 and Apple Vision Pro.

Training Datasets (bold means recommended to use)

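| Setup | Dataset | Number of Frames | Synthetic or Real | Number of Actors | Scene Annotation | FPS | Link |
|---|---|---|---|---|---|---|---|
| Monocular Fisheye | Mo2Cap2[2019-2] | 530K | Synthetic | - | No | - | Link |
| Monocular Fisheye | xR-egopose[2019-3] | 252K Train + 16 Val | Synthetic | 34 | No | 30 | Link |
| Monocular Fisheye | EgoPW[2022-1] | 318K | Real (pseudo gt) | 10 | No | 25 | Link |
| Monocular Fisheye | EgoPW-Scene[2023-1] | 92K | Real (pseudo gt) | 10 | Pseudo Annotations | 25 | Link |
| Monocular Fisheye | EgoWholeBody[2023-5] | 700K | Synthetic | 14 | No | 30 | - |
| Stereo Perspective | EgoGlass[2021-3] | 172K | Real | 10 | No | 30 | - |
| Stereo Fisheye | UnrealEgo[2022-2] | 450K * 2 views | Synthetic | 17 | No | 25 | Link |
| Stereo Fisheye | UnrealEgo2[2024-2] | 1.25M * 2 views | Synthetic | 17 | Yes | 25 | - |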
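To compare dataset sizes in more intuitive terms, frame counts can be converted to hours of footage using the listed frame rate. The sketch below is only a rough estimate: it assumes every frame belongs to continuous footage at the listed FPS (which the table does not state), treats the rounded "K"/"M" counts as exact, and uses the hypothetical names `TRAINING_SETS` and `footage_hours`. Datasets without a listed FPS are left out.

```python
# Frame counts and frame rates copied from the training table;
# stereo datasets are counted per view.
TRAINING_SETS = {
    "EgoPW": (318_000, 25),
    "EgoPW-Scene": (92_000, 25),
    "EgoWholeBody": (700_000, 30),
    "EgoGlass": (172_000, 30),
    "UnrealEgo": (450_000, 25),
    "UnrealEgo2": (1_250_000, 25),
}


def footage_hours(num_frames: int, fps: float) -> float:
    """Return the length of num_frames of continuous footage, in hours."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps / 3600.0


for name, (frames, fps) in TRAINING_SETS.items():
    print(f"{name}: about {footage_hours(frames, fps):.1f} h per view")
```

Under these assumptions, EgoPW works out to roughly 3.5 hours and EgoWholeBody to roughly 6.5 hours.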

Evaluation Datasets (bold means recommended to use)

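| Setup | Dataset | Number of Frames | Synthetic or Real | Scene Annotation | FPS | Dataset Link | Leaderboard |
|---|---|---|---|---|---|---|---|
| Monocular Fisheye | Mo2Cap2[2019-2] | 5K | Real | No | 25 | Link | - |
| Monocular Fisheye | xR-egopose[2019-3] | 115K | Synthetic | No | 30 | Link | - |
| Monocular Fisheye | GlobalEgoMocap[2021-2] | 318K | Real | No | 25 | Link | Papers With Code |
| Monocular Fisheye | SceneEgo[2023-1] | 28K | Real | Yes | 25 | Link | Papers With Code |
| Monocular Fisheye | EgoWholeBody[2023-5] | 133K | Synthetic | No | 30 | - | - |
| Stereo Fisheye | UnrealEgo[2022-2] | 48K * 2 views | Synthetic | No | 25 | Link | Papers With Code |
| Stereo Fisheye | UnrealEgo2[2024-2] | 123K * 2 views | Synthetic | Yes | 25 | - | - |
| Stereo Fisheye | UnrealEgo2-RW[2024-2] | 130K * 2 views | Real | Yes | 25 | - | - |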

Papers

2019 and Before

  1. Rhodin, Helge, et al. "EgoCap: Egocentric marker-less motion capture with two fisheye cameras." ACM Transactions on Graphics (TOG) 35.6 (2016): 1-11. [project page]
  2. Xu, Weipeng, et al. "Mo2Cap2: Real-time mobile 3D motion capture with a cap-mounted fisheye camera." IEEE Transactions on Visualization and Computer Graphics 25.5 (2019): 2093-2101. [project page]
  3. Tome, Denis, et al. "xR-EgoPose: Egocentric 3D human pose from an HMD camera." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019. [dataset]

2021

  1. Zhang, Yahui, Shaodi You, and Theo Gevers. "Automatic calibration of the fisheye camera for egocentric 3D human pose estimation from a single image." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2021.
  2. Wang, Jian, et al. "Estimating egocentric 3D human pose in global space." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021. [project page] [dataset] [demo]
  3. Zhao, Dongxu, et al. "EgoGlass: Egocentric-view human pose estimation from an eyeglass frame." 2021 International Conference on 3D Vision (3DV). IEEE, 2021.

2022

  1. Wang, Jian, et al. "Estimating egocentric 3D human pose in the wild with external weak supervision." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022. [project page] [dataset] [demo]
  2. Akada, Hiroyasu, et al. "UnrealEgo: A new dataset for robust egocentric 3D human motion capture." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022. [project page] [code] [dataset] [demo]
  3. Park, Jinman, et al. "Building Spatio-temporal Transformers for Egocentric 3D Pose Estimation." arXiv preprint arXiv:2206.04785 (2022).
  4. Liu, Yuxuan, et al. "Ego+X: An Egocentric Vision System for Global 3D Human Pose Estimation and Social Interaction Characterization." 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022.

2023

  1. Wang, Jian, et al. "Scene-aware Egocentric 3D Human Pose Estimation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023. [project page] [dataset] [code]
  2. Liu, Yuxuan, et al. "EgoHMR: Egocentric Human Mesh Recovery via Hierarchical Latent Diffusion Model." 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023.
  3. Liu, Yuxuan, et al. "EgoFish3D: Egocentric 3D Pose Estimation from a Fisheye Camera via Self-Supervised Learning." IEEE Transactions on Multimedia (2023).
  4. Kang, Taeho, et al. "Ego3DPose: Capturing 3D Cues from Binocular Egocentric Views." SIGGRAPH Asia 2023 Conference Papers. 2023.
  5. Wang, Jian, et al. "Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement." arXiv preprint arXiv:2311.16495 (2023).

2024

  1. Cuevas-Velasquez, Hanz, et al. "SimpleEgo: Predicting Probabilistic Body Pose from Egocentric Cameras." arXiv preprint arXiv:2401.14785 (2024).
  2. Akada, Hiroyasu, et al. "3D Human Pose Perception from Egocentric Stereo Videos." arXiv preprint arXiv:2401.00889 (2024).

Egocentric Inside-Out Pose Estimation

The inside-out setup uses cameras or sensors mounted on the person or device and looking outward at the environment. It is common in virtual reality (VR) headsets and augmented reality (AR) systems, where cameras attached to the headset capture the user's surroundings and interpret motion relative to them.

Datasets

Papers

IMU-Based Egocentric Pose Estimation

The Inertial Measurement Unit (IMU) setup utilizes sensors typically composed of accelerometers, gyroscopes, and sometimes magnetometers. In egocentric motion capture, IMUs are placed on the human body to capture dynamic motion and limb orientation changes.
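As a minimal illustration of how a body-worn gyroscope relates to limb orientation changes, the sketch below integrates angular-rate samples about a single joint axis into an angle. It is a simplification rather than the method of any listed paper: the function name `integrate_gyro`, the single-axis model, the Euler integration, and the sample values are all assumptions made for this example.

```python
from typing import Iterable


def integrate_gyro(rates_deg_per_s: Iterable[float], dt: float,
                   start_deg: float = 0.0) -> float:
    """Accumulate angular-rate samples (deg/s) into an angle (deg) with Euler steps."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    angle = start_deg
    for rate in rates_deg_per_s:
        angle += rate * dt  # each sample contributes rate * dt degrees
    return angle


# Hypothetical recording: a joint rotating at a constant 90 deg/s
# for one second, sampled at 100 Hz.
samples = [90.0] * 100
elbow_angle = integrate_gyro(samples, dt=0.01)
print(f"{elbow_angle:.1f} deg")
```

In practice, gyroscope bias makes this kind of open-loop integration drift over time, which is one reason accelerometer and magnetometer readings are used alongside it.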

Datasets

Papers

Headset-Based Egocentric Pose Estimation

Some methods estimate the full body pose from the 6-DoF pose of the headset (head pose) and the 6-DoF poses of the VR controllers (hand poses). Because these poses come from headset SLAM and controller tracking, the input signal is much less noisy than in the IMU setup.
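The sparse input described above can be sketched as a small data structure: one 6-DoF pose for the head and one for each hand per frame. The class names, the metre units, and the quaternion encoding of orientation are assumptions for illustration; the listed methods may represent rotations differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Pose6DoF:
    """A rigid pose: 3 translational plus 3 rotational degrees of freedom."""
    position: Tuple[float, float, float]            # x, y, z (assumed metres)
    orientation: Tuple[float, float, float, float]  # unit quaternion w, x, y, z

    def as_vector(self) -> List[float]:
        return [*self.position, *self.orientation]


@dataclass(frozen=True)
class SparseTrackingFrame:
    """One frame of headset and controller tracking (hypothetical layout)."""
    head: Pose6DoF
    left_hand: Pose6DoF
    right_hand: Pose6DoF

    def as_vector(self) -> List[float]:
        # Concatenate head, left-hand, and right-hand poses in a fixed order.
        return (self.head.as_vector()
                + self.left_hand.as_vector()
                + self.right_hand.as_vector())


identity = (1.0, 0.0, 0.0, 0.0)
frame = SparseTrackingFrame(
    head=Pose6DoF((0.0, 1.7, 0.0), identity),
    left_hand=Pose6DoF((-0.3, 1.0, 0.2), identity),
    right_hand=Pose6DoF((0.3, 1.0, 0.2), identity),
)
features = frame.as_vector()
print(len(features))  # 3 poses * (3 position + 4 quaternion) = 21 values
```

The quaternion stores the three rotational degrees of freedom in four numbers, so each 6-DoF pose occupies seven values here. A full-body estimator would have to infer every remaining joint from these three tracked points.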

Datasets

Papers

Third-Person View Egocentric Pose Estimation

The third-person setup refers to motion capture techniques in which another person carries moving cameras that observe the motion capture subject.

Datasets

Papers

Mixed Setup

Combinations of the aforementioned setups.
