WLASL-Recognition-and-Translation

Word-Level American Sign Language Recognition and Translation to Spoken Language

This repository contains the "WLASL Recognition and Translation" project, which employs the WLASL dataset described in "Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison" by Dongxu Li et al.

The project uses CUDA and PyTorch, so a system with an NVIDIA GPU is required. Running the system also needs a minimum of 4-5 GB of dedicated GPU memory.
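
If PyTorch is already installed, a quick check like the sketch below (illustrative, not part of this repo) confirms that a CUDA-capable GPU is visible and reports its memory:

import torch

# Confirm an NVIDIA GPU is visible to PyTorch and report its dedicated memory.
assert torch.cuda.is_available(), "An NVIDIA GPU with CUDA support is required."
mem_gb = torch.cuda.get_device_properties(0).total_memory / 1024 ** 3
print(f"{torch.cuda.get_device_name(0)}: {mem_gb:.1f} GB of GPU memory")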

Download Dataset


The dataset used in this project is the WLASL dataset, which can be found here on Kaggle.

Download the dataset and place it in data/ (at the same level as the WLASL directory).
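
As a quick layout check (directory names are illustrative of the expected setup), something like this run from the project root should find both directories:

from pathlib import Path

# Check that the dataset and the repo directory sit side by side.
for name in ("data", "WLASL"):
    print(name, "found" if Path(name).is_dir() else "missing")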

Steps to Run


To run the project, follow these steps:

  1. Clone the repo

git clone https://github.com/alanjeremiah/WLASL-Recognition-and-Translation.git

  2. Install the packages mentioned in the requirements.txt file

Note: you need to install a cudatoolkit version that is compatible with your PyTorch build; the matching install command can be found here. Below is the command used in this project (a sanity-check snippet follows these steps).


conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

  3. Open the WLASL/I3D folder and unzip the NLP folder in that path

  4. Run the run.py file to start the application


python run.py
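
As a sanity check that the CUDA-enabled install from step 2 worked (an illustrative snippet, not part of the repo):

import torch

# Verify the PyTorch build sees CUDA; cudatoolkit 11.3 was used above.
print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version PyTorch was built against
print(torch.cuda.is_available())  # should print True on an NVIDIA system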

Model


This repo uses the I3D (Inflated 3D ConvNet) model. To train the model, see the original WLASL repo here.
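
For reference, inference with the I3D model typically looks like the sketch below. It assumes the pytorch_i3d module and a trained checkpoint from the original WLASL repo; the class count and checkpoint path are illustrative.

import torch
from pytorch_i3d import InceptionI3d  # module from the original WLASL repo

NUM_CLASSES = 100  # e.g. the WLASL100 subset; adjust to the checkpoint used

i3d = InceptionI3d(400, in_channels=3)  # backbone pretrained on Kinetics-400
i3d.replace_logits(NUM_CLASSES)         # swap the head for WLASL glosses
i3d.load_state_dict(torch.load("checkpoints/i3d_wlasl100.pt"))  # illustrative path
i3d.cuda().eval()

# A clip tensor of shape (batch, channels, frames, height, width).
clip = torch.randn(1, 3, 64, 224, 224).cuda()
with torch.no_grad():
    per_frame_logits = i3d(clip)           # (batch, classes, time)
    logits = per_frame_logits.mean(dim=2)  # average predictions over time
    pred_gloss = logits.argmax(dim=1)      # index of the predicted gloss
print(pred_gloss.item())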

NLP


The NLP models used in this project are KeyToText and an N-gram model.

KeyToText was built on top of the T5 model by Gagan; the repo can be found here.
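
As a rough idea of how the keyword-to-text step works (a sketch assuming the keytotext package's documented pipeline interface; the gloss list is illustrative):

from keytotext import pipeline

# Load a pretrained keyword-to-text T5 pipeline.
nlp = pipeline("k2t")

# Turn recognized sign glosses (keywords) into a fluent sentence.
glosses = ["book", "read", "yesterday"]
print(nlp(glosses))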

Demo


The end result of the project looks like this:

The conversion of sign language to spoken language.

Test.mp4
