InfiniteRep: The perfect vision dataset for fitness and physical therapy applications
Overview
InfiniteRep is a synthetic, open-source dataset for fitness and physical therapy applications. It includes videos of diverse avatars performing multiple repetitions of common exercises. It includes significant variation in the environment, lighting conditions, avatar demographics, and movement trajectories. From cadence to kinematic trajectory, each rep is done slightly differently -- just like real humans. InfiniteRep videos are accompanied by a rich set of pixel-perfect labels and annotations, including frame-specific repetition counts.
The dataset features:
- 100 videos per exercise, spanning 5 to 10 repetitions each
- 7 unique indoor scenes
- Realistic environmental occlusion (+ corresponding labels)
- Diverse lighting conditions
- Varied body shape, skin tones, and clothing
- Rich annotations for 2D and 3D supervision
Exercises
The dataset currently includes the following exercises:
- Pushups
- Alternating Bicep Curls (with dumbbells)
- Delt Flys (with dumbbells)
- Squats
- Bird Dogs
- Supermans
- Bicycle Crunches
- Leg Raises
- Front Raises (with dumbbells)
- Overhead Press (with dumbbells)
Getting Started
Images and annotations can be downloaded on a per-exercise basis here. You can download just one exercise, pick and choose several that are most relevant, or go for all of them! We also provide a starter notebook that demonstrates how to filter the dataset based on metadata and visualize labels.
Annotations
Metadata and annotations for the dataset are packaged at the level of individual videos. Each video comes with the following files:
- {video_number}.mp4: 224x224 RGB video (24 fps).
- {video_number}_img_labels.zip: Zip file of the corresponding semantic and instance segmentation masks (with and without occlusion).
- {video_number}.json: Annotations for the video, structured in COCO format.
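As a rough illustration of working with the per-video JSON file, the sketch below groups annotations by the image they belong to. It assumes the file follows the usual COCO layout (top-level `images` and `annotations` lists, with each annotation carrying an `image_id`) and that each COCO image corresponds to one video frame; neither detail is spelled out here, so check the keys in your download first. The function names and the in-memory example are hypothetical.

```python
import json
from collections import defaultdict


def load_video_annotations(json_path):
    """Read one {video_number}.json file into a Python dict."""
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)


def group_annotations_by_image(coco):
    """Map each image id to the list of annotations that reference it.

    Assumes standard COCO keys ("annotations", "image_id").
    """
    by_image = defaultdict(list)
    for ann in coco.get("annotations", []):
        by_image[ann["image_id"]].append(ann)
    return dict(by_image)


# Tiny in-memory stand-in for a loaded file (made-up values, for illustration only).
example = {
    "images": [{"id": 0}, {"id": 1}],
    "annotations": [
        {"id": 10, "image_id": 0, "category_id": 1},
        {"id": 11, "image_id": 0, "category_id": 2},
        {"id": 12, "image_id": 1, "category_id": 1},
    ],
}
grouped = group_annotations_by_image(example)
```

With a real download, `load_video_annotations` would be called on the path to a `{video_number}.json` file and its result passed to `group_annotations_by_image`.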
Scene-level annotations
Scene-level metadata for each video includes:
- camera_pitch: Pitch of the camera, in degrees. A value of 90 indicates the camera's line of sight is parallel to the ground plane.
- camera_height: Height of the camera relative to the floor plane, in meters.
- avatar_presenting_gender: Gender of the underlying SMPL-X body model.
- avatar_attire_top / avatar_attire_bottom: Clothing type used in the applied UV texture.
- avatar_betas: 10 shape coefficients for the underlying SMPL-X body model.
- avatar_waist_circumference: Circumference of the SMPL-X body model's waist, in meters.
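The scene-level fields lend themselves to filtering videos before training. The following is a minimal sketch that keeps videos whose camera is close to horizontal and within a height range. It assumes you have already collected the scene-level metadata into a dict keyed by video number; where these fields sit inside the JSON file is not described here, and the video numbers and values below are made up for illustration.

```python
def pitch_offset_from_horizontal(camera_pitch_deg):
    """Absolute deviation from a horizontal line of sight.

    A camera_pitch of 90 means the line of sight is parallel to the ground.
    """
    return abs(camera_pitch_deg - 90.0)


def select_videos(scene_metadata, max_pitch_offset_deg, height_range_m):
    """Return video ids whose camera pitch and height fall within the limits."""
    lo, hi = height_range_m
    return [
        video_id
        for video_id, meta in scene_metadata.items()
        if lo <= meta["camera_height"] <= hi
        and pitch_offset_from_horizontal(meta["camera_pitch"]) <= max_pitch_offset_deg
    ]


# Hypothetical metadata for three videos (values are illustrative, not from the dataset).
scene_metadata = {
    "000000": {"camera_pitch": 90.0, "camera_height": 1.5},
    "000001": {"camera_pitch": 60.0, "camera_height": 2.4},
    "000002": {"camera_pitch": 95.0, "camera_height": 0.4},
}
selected = select_videos(scene_metadata, max_pitch_offset_deg=10.0, height_range_m=(1.0, 2.0))
```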
Frame-level annotations
The number of repetitions in each video is provided as a continuous, per-frame value, called rep_count. This value represents the number of repetitions completed since the beginning of the video plus a float in the range of [0,1] that indicates the current frame’s relative position in the repetition sequence. For example, a value of 4.23 indicates that 4 full repetitions have been completed since the beginning of the video, and that the current frame corresponds to 23% completion of the next one.
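The decomposition of rep_count described above can be sketched in a few lines. The helper names are hypothetical, and the per-frame sequence is a made-up example rather than data from the dataset.

```python
import math


def split_rep_count(rep_count):
    """Split rep_count into (completed repetitions, progress through the next one)."""
    completed = math.floor(rep_count)
    progress = rep_count - completed
    return completed, progress


def frames_where_rep_completes(rep_counts):
    """Indices of frames at which the integer part of rep_count increases."""
    boundaries = []
    for i in range(1, len(rep_counts)):
        if math.floor(rep_counts[i]) > math.floor(rep_counts[i - 1]):
            boundaries.append(i)
    return boundaries


completed, progress = split_rep_count(4.23)  # 4 completed reps, about 23% into the next

# Illustrative per-frame values for a short clip.
per_frame = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0, 2.4]
boundaries = frames_where_rep_completes(per_frame)
```

The integer part is useful as a classification-style target (reps so far), while the fractional part can serve as a regression target for phase within a repetition.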
In addition to repetition count, InfiniteRep segments and provides frame-level annotations for both person and dumbbell categories. These include:
- color: Normalized RGB value in the corresponding instance segmentation masks.
- percent_in_fov: Percentage of the vertices from the underlying mesh that are within the camera's field-of-view, regardless of occlusion status. This value can be used to disambiguate whether sparse instance segmentation masks reflect a high degree of environmental occlusion versus the instance being out-of-frame.
- percent_occlusion: Percentage of the instance that is not visible due to environmental occlusion (i.e., objects in the foreground). It is quantified as the relative difference between the occluded and unoccluded instance segmentation masks, which are also provided.
- bbox: Bounding box in standard COCO format.
- segmentation: Polygon segmentation in standard COCO format.
- area: Area enclosed by the polygon segmentation.
- cuboid_coordinates: Image coordinates of the surrounding 3D cuboid, with axes that are parallel to the global coordinate system. The order of the cuboid points is shown below.
   3-------2
  /|      /|
 / |     / |
0-------1  |
|  7----|--6
| /     | /
4-------5
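Two of the fields above can be made concrete with a short sketch. The edge list is read directly off the cuboid diagram; the exact container used for cuboid_coordinates is not specified here, so the code assumes a list of eight (x, y) pairs in diagram order. The occlusion helper is one plausible reading of "relative difference" between mask areas, returned on a 0 to 100 scale; both the formula and the scale are assumptions, and the function names are hypothetical.

```python
# Vertex index pairs forming the 12 cuboid edges, following the diagram.
CUBOID_EDGES = [
    (0, 1), (1, 2), (2, 3), (3, 0),  # upper face in the diagram: 0-1-2-3
    (4, 5), (5, 6), (6, 7), (7, 4),  # lower face in the diagram: 4-5-6-7
    (0, 4), (1, 5), (2, 6), (3, 7),  # edges joining the two faces
]


def cuboid_segments(points):
    """Return the 12 line segments (pairs of points) to draw for one cuboid.

    `points` is assumed to be a sequence of 8 (x, y) image coordinates
    ordered as in the diagram.
    """
    if len(points) != 8:
        raise ValueError("expected 8 cuboid points")
    return [(points[a], points[b]) for a, b in CUBOID_EDGES]


def percent_occlusion(unoccluded_area, occluded_area):
    """Share of the unoccluded mask that is hidden, as a percentage (assumed scale)."""
    if unoccluded_area <= 0:
        return 0.0
    return 100.0 * (unoccluded_area - occluded_area) / unoccluded_area


# Made-up image coordinates for illustration.
points = [(10, 40), (60, 40), (80, 20), (30, 20), (10, 90), (60, 90), (80, 70), (30, 70)]
segments = cuboid_segments(points)
hidden = percent_occlusion(unoccluded_area=200, occluded_area=150)
```

Each segment can be passed to any line-drawing routine to overlay the cuboid on a frame.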
We also provide the following annotations for each person instance:
- armature_keypoints: Image coordinates (x,y), visibility (v), and depth from camera (z, in meters) for each degree-of-freedom in the underlying SMPL-X model. Visibility values indicate whether keypoints are not in the image frame (0), in the image frame but occluded (1), or visible (2).
- keypoints: Image coordinates and visibility in standard COCO format for each keypoint in the 17-point COCO skeleton. Visibility labels are defined as in armature_keypoints. Note that the hip keypoints in this data structure correspond to different locations than those in armature_keypoints. Specifically, they correspond to a more lateral location designed to better reflect where human annotators typically place the hips (e.g. in the COCO dataset).
- num_keypoints: Number of keypoints in the COCO skeleton with non-zero visibility.
- quaternions: 3D rotations for each degree-of-freedom in the SMPL-X model, relative to its parent in the kinematic tree, in wxyz order.
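To illustrate how these fields might be consumed, the sketch below unpacks a COCO-style flat keypoint list into (x, y, v) triplets, recomputes num_keypoints from the visibility flags, and converts a single wxyz quaternion into a 3x3 rotation matrix using the standard unit-quaternion formula. How quaternions are keyed per joint in the JSON is not specified here, so the code handles one quaternion at a time. All function names and sample values are hypothetical.

```python
import math


def parse_coco_keypoints(flat):
    """Turn a flat [x1, y1, v1, x2, y2, v2, ...] list into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoint list length must be a multiple of 3")
    return [(flat[i], flat[i + 1], flat[i + 2]) for i in range(0, len(flat), 3)]


def count_labeled_keypoints(flat):
    """Count keypoints with non-zero visibility (mirrors num_keypoints)."""
    return sum(1 for _, _, v in parse_coco_keypoints(flat) if v > 0)


def quat_wxyz_to_matrix(q):
    """Convert a quaternion in (w, x, y, z) order to a 3x3 rotation matrix."""
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0:
        raise ValueError("zero-length quaternion")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


# Made-up values: three keypoints (visible, out of frame, occluded).
flat_keypoints = [10, 20, 2, 0, 0, 0, 30, 40, 1]
labeled = count_labeled_keypoints(flat_keypoints)

# A 90 degree rotation about the z axis, written in wxyz order.
half = math.sqrt(0.5)
rot_z90 = quat_wxyz_to_matrix((half, 0.0, 0.0, half))
```

Because each quaternion is relative to the parent joint, recovering a joint's global orientation would require composing these matrices along the kinematic tree from the root.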
Segmentation annotations
For each frame of a video, the following segmentation masks are provided:
- image.{frame_number}.cseg.png: Semantic segmentation
- image.{frame_number}.iseg.png: Instance segmentation
- image.{frame_number}.iseg.{annotation_id}.png: Instance segmentation without occlusion
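When iterating over the contents of an extracted label archive, it can help to sort files by mask type. The sketch below classifies a filename according to the three patterns listed above. It makes no assumption about how frame numbers or annotation IDs are formatted (they are kept as strings), and it strips any leading folder path in case the archive nests files in a directory. The function name and returned labels are hypothetical.

```python
import re

# image.{frame_number}.cseg.png / image.{frame_number}.iseg.png /
# image.{frame_number}.iseg.{annotation_id}.png
MASK_NAME = re.compile(
    r"^image\.(?P<frame>[^.]+)\.(?P<kind>cseg|iseg)"
    r"(?:\.(?P<annotation_id>[^.]+))?\.png$"
)


def classify_mask(filename):
    """Return (mask_type, frame_number, annotation_id) or None if unrecognized."""
    basename = filename.rsplit("/", 1)[-1]  # drop any folder prefix
    match = MASK_NAME.match(basename)
    if match is None:
        return None
    frame = match.group("frame")
    kind = match.group("kind")
    annotation_id = match.group("annotation_id")
    if kind == "cseg":
        # Only image.{frame_number}.cseg.png is listed for semantic masks.
        return None if annotation_id is not None else ("semantic", frame, None)
    if annotation_id is None:
        return ("instance", frame, None)
    return ("instance_unoccluded", frame, annotation_id)


# Illustrative filenames following the documented patterns.
semantic = classify_mask("image.000012.cseg.png")
instance = classify_mask("image.000012.iseg.png")
unoccluded = classify_mask("labels/image.000012.iseg.3.png")
```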
Contact
InfiniteRep is brought to you by Infinity AI. We're a small team of dedicated engineers who specialize in generating custom synthetic datasets and built this dataset hoping it would be useful to people like you! Drop us a line at info@toinfinity.ai.
Terms and Conditions
This work is licensed under a Creative Commons Attribution 4.0 International License. Both academic and commercial applications are allowed.