
This repo is NOT actively maintained and may not work out of the box as it has been 3 years since the last update. If you want to learn more about the next version of this project, check it out here: https://www.youtube.com/watch?v=tZcRYcjTwWA.

Phormatics: Using AI to Maximize Your Workout

^{f1: front page (the gif may be choppy at first, but it's worth it I promise)}

by: Jason Chin, Charlie Lin, Brad Huang, Calvin Woo

HackNYU2018 project developed in 36 hours, focusing on using A.I. and computer vision to build a virtual personal fitness trainer. It uses 2D human pose estimation with commodity webcams to critique your form and count your repetitions.

This project won "The Most Startup-Viable Hack" award, presented by Contrary Capital.

2D Human Pose Estimation:

^{f2: live pose estimation in a busy environment; note: here the user has over-extended their right arm (image is mirrored), which is considered bad form in this variant of the dumbbell shoulder press, hence the message.}

The pose estimation is based on tf-pose-estimation by ildoonet. The model architecture, OpenPose, developed by the CMU Perceptual Computing Lab, consists of a deep convolutional neural network for feature extraction (MobileNet) and a two-branch, multi-stage CNN that predicts confidence maps and Part Affinity Fields (PAFs).

This feature allowed us to track the position of the user's joints using a commodity webcam.
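Once joint positions are available, counting repetitions can be reduced to watching a joint angle move back and forth. The following is a minimal sketch of that idea, not the code used in this repo: `joint_angle`, `RepCounter`, and both angle thresholds are hypothetical, and keypoints are assumed to be plain (x, y) pairs.

```python
import math


def joint_angle(a, b, c):
    """Angle at point b, in degrees, formed by the segments b->a and b->c."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    # Fold reflex angles back into the 0..180 range.
    return 360.0 - ang if ang > 180.0 else ang


class RepCounter:
    """Counts one rep per flex-then-extend cycle (thresholds are assumptions)."""

    def __init__(self, flexed_below=60.0, extended_above=150.0):
        self.flexed_below = flexed_below
        self.extended_above = extended_above
        self.flexed = False
        self.count = 0

    def update(self, angle):
        # Two separate thresholds give hysteresis, so jitter near a single
        # cut-off value does not register as extra repetitions.
        if angle < self.flexed_below:
            self.flexed = True
        elif angle > self.extended_above and self.flexed:
            self.flexed = False
            self.count += 1
        return self.count
```

For a bicep curl, one would feed `joint_angle(shoulder, elbow, wrist)` into `update` once per frame; the thresholds would need tuning per exercise.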

Data Flow (Web Based):

^{f3: pseudo data flow diagram; note: the pose estimation model output must be processed, as it returns pose estimations for all humans detected in frame (see: Future Changes ^[1]).}

This app runs in the browser, and the pose estimation and form critique generation are performed on a Flask server. The webcam feed is captured using WebRTC, and screenshots are sent to the server as a base64-encoded string every 50ms or as fast as the server can respond, whichever is slower (see: Future Changes ^[2]).
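On the server side, the first step for each request is turning the base64 string back into image bytes. The sketch below shows that step using only the standard library. It assumes a JSON request body with a hypothetical `image` field, and it tolerates the `data:image/...;base64,` header that browsers prepend when a frame is exported as a data URL; the actual request format used by this project may differ.

```python
import base64
import json


def decode_frame(request_body: str) -> bytes:
    """Return the raw image bytes carried in a JSON request body."""
    payload = json.loads(request_body)
    encoded = payload["image"]  # hypothetical field name
    # Data URLs look like "data:image/jpeg;base64,<data>"; keep only <data>.
    if encoded.startswith("data:"):
        encoded = encoded.split(",", 1)[1]
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(encoded, validate=True)
```

The returned bytes could then be handed to an image decoder (for example OpenCV) before running pose estimation.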

This means the server could be run in the cloud on high-performance hardware and the client could be any device with a WebRTC-supported web browser and camera. There is also the option for video to be recorded and sent to the server for post-processing if the user's network connectivity is too slow to stream a live feed.

Currently Supported Exercise Analysis:

Squat: exaggerated knees-forward checking
Dumbbell Shoulder Press: exaggerated arm bend and extension checking
Bicep Curls: horizontal elbow deviation from shoulder checking
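As an illustration of what one of these checks could look like, here is a minimal sketch of the bicep curl rule. It is an assumption-laden example rather than the repo's implementation: the function names, the normalization by torso length, the `max_drift` threshold, and the message text are all hypothetical. Keypoints are assumed to be (x, y) pairs in image coordinates.

```python
import math


def elbow_drift(shoulder, elbow, hip):
    """Horizontal elbow offset from the shoulder, as a fraction of torso length."""
    torso = math.dist(shoulder, hip)
    if torso == 0:
        return 0.0  # degenerate pose; nothing sensible to report
    return abs(elbow[0] - shoulder[0]) / torso


def critique_bicep_curl(shoulder, elbow, hip, max_drift=0.25):
    """Return a critique message, or None if the form looks acceptable."""
    if elbow_drift(shoulder, elbow, hip) > max_drift:
        return "Keep your elbow under your shoulder."
    return None
```

Dividing by torso length keeps the threshold roughly independent of how far the user stands from the camera.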

Future Changes:

Multiple Pose Estimations for One User

Current: The model estimates joints for all subjects found in the input image; we then analyze the output and extract the pose that is most likely to be the user.
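The selection heuristic is not spelled out here, so the following is only one plausible sketch: treat the person occupying the largest area of the frame as the user. The pose format (a dict mapping part names to `(x, y, score)` tuples), the function names, and the `min_parts` cut-off are all assumptions made for illustration.

```python
def bounding_box_area(pose):
    """Area of the axis-aligned box around all detected parts of one pose."""
    xs = [x for x, _, _ in pose.values()]
    ys = [y for _, y, _ in pose.values()]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def pick_user_pose(poses, min_parts=4):
    """Pick the pose covering the most frame area, or None if none qualify."""
    # Ignore fragments with too few detected parts (background clutter).
    candidates = [p for p in poses if len(p) >= min_parts]
    if not candidates:
        return None
    return max(candidates, key=bounding_box_area)
```

Other signals, such as mean keypoint confidence or distance from the frame center, could be combined with area in the same way.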

Possible Improvements:

a. Modify model and training data to only estimate a single 'best' pose.

or

b. Implement re-identification and support multiple users at once. This is viable as forward propagation time does not increase with multiple poses being estimated.

Webcam Image Data Transfer

Current: Webcam captures are encoded as base64 strings and sent to the server in a POST request (note: this was done for ease of implementation due to the hackathon time constraint).

Possible Improvements: Use WebSockets to transfer webcam captures instead.
