Pic2Speech - Describing the world for the visually impaired

Pic2Speech uses Artificial Intelligence to describe the content of pictures, helping visually impaired people understand the world around them.

Model

The model is built with Keras and is mostly based on "Show and Tell: A Neural Image Caption Generator" by Vinyals et al. It was trained on the full MS COCO dataset for around 500k steps.
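To illustrate how a captioning model of this kind turns per-step word probabilities into a sentence, here is a minimal sketch of greedy decoding. It is not the project's code: the special tokens, the `greedy_caption` function, and the toy model are hypothetical, and greedy decoding is used only as a simpler stand-in for the beam search described in the paper. In a real setup, `next_word_probs` would wrap the trained Keras model together with the encoded image.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical special tokens


def greedy_caption(
    next_word_probs: Callable[[List[int]], Sequence[float]],
    word_to_id: Dict[str, int],
    max_len: int = 20,
) -> str:
    """Build a caption by always picking the most probable next word."""
    id_to_word = {i: w for w, i in word_to_id.items()}
    tokens = [word_to_id[START]]
    for _ in range(max_len):
        probs = next_word_probs(tokens)
        # Index of the highest-probability word in the vocabulary.
        best = max(range(len(probs)), key=probs.__getitem__)
        if best == word_to_id[END]:
            break
        tokens.append(best)
    # Drop the start token before joining the words.
    return " ".join(id_to_word[t] for t in tokens[1:])


# Toy stand-in for a trained model: it always emits the same word sequence.
VOCAB = {START: 0, END: 1, "a": 2, "dog": 3, "runs": 4}
SCRIPT = [2, 3, 4, 1]


def toy_model(tokens: List[int]) -> List[float]:
    target = SCRIPT[min(len(tokens) - 1, len(SCRIPT) - 1)]
    return [1.0 if i == target else 0.0 for i in range(len(VOCAB))]


print(greedy_caption(toy_model, VOCAB))
```

The `max_len` cap keeps the loop from running forever if the model never emits the end token.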

The model is deployed on Azure ML Service using the Azure ML Python API.
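Azure ML web services are driven by an entry script that defines `init()` and `run()`. The sketch below shows the general shape such a script could take for a captioning service. Everything specific in it is an assumption: the `describe_image` placeholder, the `image` and `caption` JSON keys, and the base64 encoding of the picture are hypothetical, not taken from this project.

```python
import base64
import json

model = None  # populated once by init()


def describe_image(image_bytes: bytes) -> str:
    """Hypothetical placeholder for the real captioning model call."""
    return f"placeholder caption for {len(image_bytes)} bytes"


def init():
    """Called once when the service starts; a real script would load weights here."""
    global model
    model = describe_image


def run(raw_data: str) -> str:
    """Called per request with the raw JSON body; returns a JSON string."""
    try:
        payload = json.loads(raw_data)
        image_bytes = base64.b64decode(payload["image"])
        return json.dumps({"caption": model(image_bytes)})
    except Exception as exc:
        # Report the failure to the caller instead of crashing the service.
        return json.dumps({"error": str(exc)})
```

Loading the model in `init()` rather than in `run()` means the cost is paid once per container start, not once per request.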

The mobile app is developed with Google Flutter and lets people take a picture with their smartphone and get a spoken description of it.
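The round trip from picture to description can be sketched from the client side as follows. This is written in Python for illustration only (the app itself is a Flutter app), and the JSON keys `image` and `caption`, the base64 encoding, and the function names are assumptions rather than the project's actual request format.

```python
import base64
import json
import urllib.request


def build_request(scoring_uri: str, image_bytes: bytes) -> urllib.request.Request:
    """Wrap a picture in a JSON POST request for a scoring endpoint."""
    body = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        scoring_uri,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_caption(response_body: str) -> str:
    """Pull the caption text out of the (assumed) JSON response."""
    return json.loads(response_body)["caption"]
```

With a deployed endpoint, the request would be sent with `urllib.request.urlopen(build_request(uri, photo))`, and the decoded response body passed to `extract_caption` before handing the text to a text-to-speech engine.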

A Neural Captioning Model trained with Keras and deployed in a Flutter App
