Visual Control

Using egocentric vision to improve IMU-based prediction of future knee and ankle joint angles in complex, out-of-the-lab environments.

Paper: https://ieeexplore.ieee.org/abstract/document/9729197

Optical_Flow_boe_font_updated2

Summary

Here we fuse motion capture data with egocentric videos to improve joint angle prediction performance in complex, uncontrolled environments such as public classrooms, atria, and stairwells. Optical flow features are generated from the raw images by PWC-Net, trained on the synthetic MPI-Sintel dataset, and are processed by an LSTM before being fused with the joint kinematics stream.
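As a rough illustration of this two-stream fusion scheme, a minimal PyTorch sketch is shown below. It is not the exact architecture from the paper: the class name, feature dimensions, layer sizes, and prediction horizon are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class KinematicsVisionFusion(nn.Module):
    """Sketch of a two-stream model: an LSTM over optical-flow features is
    fused with an LSTM over IMU-based joint kinematics to predict future
    knee and ankle joint angles. Sizes are illustrative, not the paper's."""

    def __init__(self, flow_dim=256, kin_dim=8, hidden_dim=128,
                 pred_horizon=5, num_angles=4):
        super().__init__()
        # Vision stream: per-frame optical-flow features (e.g. pooled PWC-Net output)
        self.flow_lstm = nn.LSTM(flow_dim, hidden_dim, batch_first=True)
        # Kinematics stream: IMU-derived joint kinematics
        self.kin_lstm = nn.LSTM(kin_dim, hidden_dim, batch_first=True)
        # Fusion head: concatenate the two stream encodings and regress
        # future joint angles over the prediction horizon
        self.head = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, pred_horizon * num_angles),
        )
        self.pred_horizon = pred_horizon
        self.num_angles = num_angles

    def forward(self, flow_feats, kinematics):
        # flow_feats: (batch, time, flow_dim), kinematics: (batch, time, kin_dim)
        _, (h_flow, _) = self.flow_lstm(flow_feats)
        _, (h_kin, _) = self.kin_lstm(kinematics)
        fused = torch.cat([h_flow[-1], h_kin[-1]], dim=-1)
        out = self.head(fused)
        return out.view(-1, self.pred_horizon, self.num_angles)


if __name__ == "__main__":
    model = KinematicsVisionFusion()
    flow = torch.randn(2, 30, 256)   # 30 frames of optical-flow features
    kin = torch.randn(2, 30, 8)      # 30 samples of joint kinematics
    print(model(flow, kin).shape)    # torch.Size([2, 5, 4])
```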

In the following video, we can see that information about the subject's future movements is available in their visual field, both in terms of what lies ahead of them (e.g., stairs or chairs) and in how they move their head and eyes for path planning. Thus, vision acts as a "window into the future".

a_window_into_the_future.mp4

Egocentric vision improves the prediction of lower limb joint angles

The following videos (with frames dropped to shorten them) and the corresponding figures show example maneuvers and the improvement achieved by fusing kinematics and vision inputs (green line) over kinematics-only inputs (red line).

go_around_the_podium.mp4

Picture7

enter_the_classroom.mp4

Picture5

The benefits of egocentric vision can be amplified with more data

In the figure below, we compare the performance improvement due to vision as the amount of data per subject increases (left) and as the number of subjects increases (right). In both cases, including vision yields a larger improvement than the no-vision condition. The rate of improvement also declines more slowly for the vision condition than for the no-vision condition, indicating that better performance could be achieved with more data.

datasize_combined

Dataset

The dataset used in the paper can be accessed through the following repository: https://github.com/abs711/The-way-of-the-future. A detailed description of the dataset is available in the following publication: https://doi.org/10.1038/s41597-023-01932-7

Training the models

Run 'torchVision/MainPain/main.py' to start training a model. The models used in the paper are defined in 'UtilX/Vision4Prosthetics_modules.py'.
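For example, from the repository root (assuming the required dependencies, e.g. PyTorch, are installed; any command-line arguments the script expects are not documented here):

```
python torchVision/MainPain/main.py
```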

About

Deep learning models that fuse IMU-based motion capture and first-person video data to improve the prediction of future knee and ankle joint kinematics in complex real-world environments.

License: MIT License


Languages

Language: Python 100.0%