sight.ai

An AI visual assistant that helps visually impaired people navigate around the city.

Inspiration

In 2015, David Eagleman presented his remarkable research at TED, showing that we can create new senses for ourselves. This is done by taking input data and translating it into vibration mappings on a sensory vest. A person can then "feel" the input data through the vibration patterns on the vest and learn to interpret it. During the presentation, a deaf person was able to "hear" words spoken to him via the sensory vest and write them down correctly.

This project aims to do the same for vision by creating a Vision Encoding System that helps a visually impaired person "see" what is in front of them. The system encodes an image (captured from the person's point of view) into a depth map, which gauges the distance from the person to surrounding obstacles and people. The depth map is then translated into vibration mappings on the sensory vest, allowing the visually impaired person to "feel" what lies ahead.

Architecture of Vision Encoding System

  1. Image -> Infer Depth Map
  2. Define 36 (arbitrary) grid areas, one for each vibration motor on the sensory vest.
  3. Depth Map -> Calculate the mean depth over each grid area to produce the Vibration Mappings (see the sketch below).
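A minimal sketch of step 3, assuming a 6x6 motor layout and a NumPy depth map; the grid size, normalisation, and function names are illustrative, not the repository's exact code:

import numpy as np

GRID_ROWS, GRID_COLS = 6, 6  # 36 cells, one per vibration motor (assumed 6x6 layout)

def depth_to_vibration(depth_map: np.ndarray) -> np.ndarray:
    """Mean-pool an H x W depth map into a GRID_ROWS x GRID_COLS vibration mapping."""
    h, w = depth_map.shape
    rows = np.array_split(np.arange(h), GRID_ROWS)
    cols = np.array_split(np.arange(w), GRID_COLS)
    cells = np.empty((GRID_ROWS, GRID_COLS), dtype=np.float32)
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            # Mean depth of this grid area (one motor's region of the image).
            cells[i, j] = depth_map[np.ix_(r, c)].mean()
    # Invert and normalise so nearer obstacles give stronger vibration (close to 1).
    intensity = 1.0 - (cells - cells.min()) / (cells.max() - cells.min() + 1e-6)
    return intensity

if __name__ == "__main__":
    # Synthetic depth map stand-in for a BTS prediction.
    fake_depth = np.random.uniform(0.5, 10.0, size=(480, 640)).astype(np.float32)
    print(depth_to_vibration(fake_depth))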

Getting Started

Create conda environment (GPU)

conda env create --file environment.yml
conda activate sightai
cd src

Create conda environment for CPU (not fully supported)

conda env create --file cpu_environment.yml
conda activate sightai_cpu
pip install torch==1.5.1+cpu torchvision==0.6.1+cpu -f https://download.pytorch.org/whl/torch_stable.html

Download Model Weights

Download weights for BTS and YOLOv4 (see Credits). Place the weights in:

./pretrained/bts_latest
./pretrained/yolov4.weights
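A small, hypothetical sanity check (not part of the repository) to confirm the weight files are in place, run from the same directory as the demo scripts:

from pathlib import Path

# Expected weight locations from the paths above.
weights = [Path("pretrained/bts_latest"), Path("pretrained/yolov4.weights")]
missing = [p for p in weights if not p.exists()]
if missing:
    raise FileNotFoundError("Missing model weights: " + ", ".join(str(p) for p in missing))
print("All model weights found.")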

Example scripts:

Run demo on sample image

python run_demo_image.py -input media/165_R.png -plot 1 -cuda 1

Output can be found at src/output/_165_R.png
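To process several images in one go, the same script could be invoked in a loop; this is a hypothetical wrapper using only the flags documented above:

import subprocess
from pathlib import Path

# Run the image demo on every PNG in media/, one image at a time.
for img in sorted(Path("media").glob("*.png")):
    subprocess.run(
        ["python", "run_demo_image.py", "-input", str(img), "-plot", "1", "-cuda", "1"],
        check=True,
    )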

Run demo on sample video

python run_demo_video.py -input media/two_way.mp4 -fps 30.0 -max 20 -cuda 1

Output can be found at src/output_video/out_two_way.avi. Processing is currently limited to the first 20 frames.

Note: Set -cuda 0 to run on CPU (not fully supported yet).

Credits

YOLOv4 PyTorch implementation by Tianxiaomo

https://github.com/Tianxiaomo/pytorch-YOLOv4

BTS (state-of-the-art monocular depth estimation)
