Our model detects the gestures of participants in a meeting room; specifically, it detects a participant's voting and speaking gestures.
This gesture detection model is inspired by the Single Shot MultiBox Detector Implementation in Pytorch project.
We base our gesture detection work on the VGG16-SSD model, applying transfer learning to LINAGORA's panoramic image data.
- Python 3.6+
- Git, Wget
- OpenCV
- Pytorch 1.0 or Pytorch 0.4+
- Pip3 or Pip
- Numpy
- Pandas
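Before starting, a quick sanity check of the environment can save time later. This is an optional sketch (it only assumes `python3` and `git` are on the PATH; package import checks are commented out until the requirements are installed):

```shell
# Optional sanity check for the prerequisites listed above.
python3 -c 'import sys; assert sys.version_info >= (3, 6), "Python 3.6+ required"; print("python ok")'
git --version > /dev/null && echo "git ok"
# Uncomment once the requirements are installed:
# python3 -c "import torch, cv2, numpy, pandas; print('libs ok')"
```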
- Clone the Pytorch-SSD repository from GitHub

```shell
git clone https://github.com/qfgaohao/pytorch-ssd
cd pytorch-ssd
```
- Install requirements

```shell
pip3 install -r models/requirements.txt
```
- Download the dataset (panoramic images and videos)

```shell
git clone https://github.com/linto-ai/panoramic-dataset-for-gestures-detection
mv panoramic-dataset-for-gestures-detection data
```
- Download the VGG16-SSD base model

```shell
wget -P models https://storage.googleapis.com/models-hao/vgg16-ssd-mp-0_7726.pth
```
- Download the trained model and the label file (voc-model-labels.txt)

```shell
git clone https://github.com/linto-ai/gestures-detection-model
cp gestures-detection-model/voc-model-labels.txt models/
```
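VOC-style label files like voc-model-labels.txt list one class name per line, with BACKGROUND as the first entry in the upstream Pytorch-SSD label files. A minimal sketch of how such a file is parsed (the gesture class names below are placeholders, not the actual file contents):

```python
# Sketch: parse a VOC-style label file (one class name per line).
# "voting" and "speaking" are assumed placeholder classes for illustration.
sample = """BACKGROUND
voting
speaking
"""

class_names = [line.strip() for line in sample.splitlines() if line.strip()]
print(class_names)  # ['BACKGROUND', 'voting', 'speaking']
```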
- Train the VGG16-SSD model with transfer learning for 100 epochs

```shell
python3 train_ssd.py --dataset_type voc --datasets data --net vgg16-ssd --pretrained_ssd models/vgg16-ssd-mp-0_7726.pth --scheduler cosine --lr 0.01 --t_max 100 --validation_epochs 1 --num_epochs 100 --base_net_lr 0.001 --batch_size 5
```
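The `--scheduler cosine` and `--t_max 100` options anneal the learning rate from 0.01 down toward zero over the 100 epochs. As a rough illustration, assuming PyTorch's standard cosine annealing formula with `eta_min = 0` (the training script may differ in details):

```python
import math

def cosine_lr(epoch, base_lr=0.01, t_max=100, eta_min=0.0):
    """Cosine-annealed learning rate, as in torch.optim.lr_scheduler.CosineAnnealingLR.
    Illustrative sketch only, not taken from the training script."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

print(cosine_lr(0))    # starts at the base rate, 0.01
print(cosine_lr(50))   # roughly halfway down, ~0.005
print(cosine_lr(100))  # ends near eta_min (here 0)
```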
- Copy the trained model into the models/ directory

```shell
cp -r gestures-detection-model/vgg16-ssd-linagora-gest-detection.pth models/
```
- Run the image demo

```shell
python3 run_ssd_example.py vgg16-ssd models/vgg16-ssd-linagora-gest-detection.pth models/voc-model-labels.txt test/img1.jpg
```

- Run the video demo

```shell
python3 run_ssd_live_demo.py vgg16-ssd models/vgg16-ssd-linagora-gest-detection.pth models/voc-model-labels.txt test/vid1.avi
```