DanThePutzer / locana

A machine learning project for recognizing hand gestures, using neural networks built on TensorFlow.

locana

locana (लोचन) - Sanskrit for "sight"

This project aims to build a system that recognizes hand gestures in images and, eventually, in a live camera feed. No pretrained models are used. The current approach relies on convolutional neural networks implemented in TensorFlow. At the moment only one data source is used for training and testing, but I am planning to integrate several others in the near future to improve the real-world performance of the model (more about the data below).

This is a work in progress, and the final goal is a fully bundled Python package that can be used in other projects to make gesture recognition more accessible.

Installation & Usage

The current code has only a few requirements you need to install before you can get up and running. Depending on your environment, you might want to use pip or conda to install the necessary dependencies.

# Install dependencies with pip
pip install tensorflow scikit-learn pandas numpy tqdm pillow

# Install dependencies with conda
conda install -c conda-forge tensorflow scikit-learn pandas numpy tqdm pillow

You also need Jupyter to be able to open the notebook. Again depending on your environment, pick the proper command to install.

pip install jupyter
# or
conda install jupyter

Now simply navigate to the root directory of the repo and run jupyter notebook in the command line (don't forget to activate your environment first if you are using conda).
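If the notebook fails on import, it can help to first confirm that the dependencies are visible in the active environment. The following is a minimal sketch using only the Python standard library. The helper name check_dependencies is made up for illustration and is not part of this repo. Note that some packages are imported under a different name than the one they are installed with (scikit-learn as sklearn, pillow as PIL), and the sketch assumes the jupyter metapackage provides the notebook module.

```python
from importlib.util import find_spec

# Maps the pip/conda package name to the module name used in `import`.
# ASSUMPTION: the `jupyter` metapackage provides the `notebook` module.
REQUIRED = {
    "tensorflow": "tensorflow",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "numpy": "numpy",
    "tqdm": "tqdm",
    "pillow": "PIL",
    "jupyter": "notebook",
}


def check_dependencies(required=REQUIRED):
    """Return {package_name: bool}, True if the module can be located."""
    return {pkg: find_spec(module) is not None for pkg, module in required.items()}


if __name__ == "__main__":
    for pkg, ok in check_dependencies().items():
        print(f"{pkg:<14} {'ok' if ok else 'MISSING'}")
```

Any package reported as MISSING can be installed with the pip or conda commands above.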

Data Sources

As stated earlier, the project is currently based on only one dataset, the Hand Gesture Recognition Database. It contains images of 10 different gestures with 2000 images per gesture. The images are grayscale and all have the same resolution, which makes for an easy starting point to train a model but also invites overfitting. A few example images are given below.

[Image: example images from the dataset]
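To make the data handling more concrete, here is a rough sketch of loading such images into arrays with Pillow and NumPy. It assumes a hypothetical folder layout with one sub-folder per gesture (root/<gesture_name>/*.png); the actual layout of the downloaded dataset may differ, so the glob pattern would need adapting. The function name load_gesture_images and the 64x64 target size are likewise assumptions, not taken from the notebook.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def load_gesture_images(root, size=(64, 64)):
    """Load grayscale images from root/<label>/*.png into NumPy arrays.

    ASSUMPTION: one sub-folder per gesture; adapt this to the real layout.
    `size` is (width, height), as expected by Pillow.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    images, labels = [], []
    for index, name in enumerate(class_names):
        for path in sorted((root / name).glob("*.png")):
            with Image.open(path) as img:
                # Force single-channel grayscale and a fixed resolution.
                img = img.convert("L").resize(size)
                # Scale pixel values from 0..255 to 0..1.
                images.append(np.asarray(img, dtype=np.float32) / 255.0)
            labels.append(index)
    x = np.stack(images)[..., np.newaxis]  # shape: (n, height, width, 1)
    y = np.asarray(labels, dtype=np.int64)  # integer label per image
    return x, y, class_names
```

The resulting arrays could then be split with scikit-learn's train_test_split, passing stratify=y so that every gesture is represented equally in the training and test sets.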

Planned Progress

Given the very homogeneous data and the overfitting that comes with it, I am planning to introduce several more data sources to diversify the images in quality, size, angle, etc. That should help create a model that performs more robustly in real-world scenarios. Additionally, the CNN structure could use more research and tweaking, and the whole thing should eventually be able to recognize gestures in real time. This is very much a work in progress.
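As a reference point for that kind of tweaking, a small CNN for 10 gesture classes might look like the sketch below. This is a hypothetical baseline, not the architecture used in the notebook: the function name build_baseline_cnn, the 64x64 grayscale input size, the layer widths, and the dropout rate are all assumptions chosen for illustration.

```python
import tensorflow as tf


def build_baseline_cnn(input_shape=(64, 64, 1), num_classes=10):
    """Small CNN for grayscale gesture images (illustrative only)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        # Dropout is one common way to reduce overfitting on homogeneous data.
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        # This loss expects integer class labels (0..num_classes-1).
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling build_baseline_cnn().summary() prints the layer shapes, which makes it easier to see how changes to the structure affect the parameter count.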

Upcoming ToDos:

  • Find & integrate new data sources
  • Improve data augmentation setup
  • Change/tweak CNN structure
  • Capture live webcam feed & run it through processing pipeline and model
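For the data augmentation item, one possible direction is sketched below using plain NumPy: a random shift plus a random brightness change applied to a single grayscale image. The function name augment and the parameters max_shift and brightness are hypothetical, and the sketch assumes pixel values already scaled to the range 0 to 1. It is not the augmentation setup currently used in the notebook.

```python
import numpy as np


def augment(image, rng, max_shift=4, brightness=0.2):
    """Randomly shift and re-light one grayscale image with values in [0, 1].

    image: array of shape (height, width); the result has the same shape.
    rng:   a numpy.random.Generator, e.g. np.random.default_rng(seed).
    """
    h, w = image.shape
    # Pad with edge values, then crop a randomly offset window of the
    # original size. This shifts the content by up to max_shift pixels.
    padded = np.pad(image, max_shift, mode="edge")
    dy, dx = rng.integers(0, 2 * max_shift + 1, size=2)
    shifted = padded[dy:dy + h, dx:dx + w]
    # Scale brightness by a random factor and keep values in the valid range.
    factor = 1.0 + rng.uniform(-brightness, brightness)
    return np.clip(shifted * factor, 0.0, 1.0)
```

Applying the function with a fresh random draw each epoch means the model rarely sees exactly the same image twice, which is one common way to counter overfitting on very uniform data.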

 

Daniel Putzer, 2020
https://danielputzer.com

Languages

Language:Jupyter Notebook 100.0%