PostureRecognition

This program uses the Nuitrack SDK to detect human body joints, then feeds the joint data into a BiLSTM network to predict the posture.



0 Environment

1 RealSense

All instructions are based on RealSense SDK 2.19.0 with the Depth Camera D435.

1-1 Install

  • Intel.RealSense.SDK
    • Installer with Intel RealSense Viewer and Quality Tool, C/C++ Developer Package, Python 2.7/3.6 Developer Package, .NET Developer Package and so on.
    • Version: 2.19.0

1-2 Extra Information

2 Nuitrack

All instructions are based on Nuitrack 1.4.0.

2-1 Install

2-2 Documents

3 Software Framework

3-1 Interface-1: Grab

(screenshot: Interface-1, the Grab tab)

1. Show Image

Displays the RGB image and overlays the detected skeleton joints as red square dots.

2. Show Label

Displays the recognized posture: Standing, Sitting, Walking, StandUp, SitDown, or Falling.

3. Log Info

Logs important information while the program is running.

4. Time Info

Logs timing information, such as the current processing time and the average processing time.

5. Grab

Starts or stops the camera grab.

6. Auto

Enables or disables automatic posture recognition.

7. Write

Enables or disables writing the skeleton data to the local disk.

8. Skeleton Data

Displays the skeleton data: 25 joints per frame, each with X, Y, Z coordinates, i.e. 75 float values per frame.
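
For reference, here is a minimal Python sketch of how one frame of this data could be unpacked; the function name and the whitespace-separated layout are assumptions for illustration only:

```python
def parse_frame(values_75):
    """Group a flat list of 75 floats into 25 (X, Y, Z) joint tuples."""
    assert len(values_75) == 75, "expected 25 joints * (X, Y, Z)"
    return [tuple(values_75[i:i + 3]) for i in range(0, 75, 3)]

# e.g. joints = parse_frame([float(v) for v in line.split()])
```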

9. Cam Info

  • FPS: frames per second; the grab timer interval is 1000 / FPS milliseconds.
  • W: the width of the image, read-only.
  • H: the height of the image, read-only.

10. Load

Loads a .pb model file.

11. Test

Opens a file dialog to choose a sample, then makes a prediction with the loaded model.

12. Tab Option

Click a tab to switch between Grab and Label. Grab is used for capturing videos; Label is used for making labels.

3-2 Interface-2: Label

(screenshot: Interface-2, the Label tab)

13. Load Data

Selects a .txt file containing the frame indices and skeleton data for the whole video.

14. Frame List

Displays the frame list. A small flag indicates that the data at that index is valid. A trailing number means the frame has been labeled; the number is the label index.

15. Frame Config

  • steps: the maximum number of frames in one sample. Default: 60
  • cover: the number of frames shared between two consecutive samples (a sketch follows this list). Default: 30
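
As a rough illustration of how steps and cover interact, here is a Python sketch of the sliding-window logic; the exact boundary handling in the program may differ:

```python
def make_windows(num_frames, steps=60, cover=30):
    """Yield (start, end) index pairs of overlapping windows:
    each window holds at most `steps` frames, and consecutive
    windows share `cover` frames (stride = steps - cover)."""
    stride = steps - cover
    for start in range(0, max(num_frames - steps, 0) + 1, stride):
        yield start, min(start + steps, num_frames)

# 150 frames with the defaults -> (0, 60), (30, 90), (60, 120), (90, 150)
```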

16. Select

  • Select: selects and displays the next batch of frames automatically; at most steps frames are selected.
  • ScrollBar: controls the display speed. Left: slow, Right: fast.

17. Generate

  • In Auto mode: clicking the Generate button processes all the data in the current frame list and generates multiple samples according to steps and cover.
  • In Mann (manual) mode: clicking the Generate button processes only the frames selected in the frame list and generates a single sample.
  • If the Shift key is held while clicking the Generate button, the program asks you to choose the folder in which all the data folders are located, then automatically processes every existing data folder.

18. Search

Choose an existing label from the left box, then use the two search buttons to jump to the previous or next frame with that label.

19. Make Label

There are 6 labels. Choose at least one frame in the frame list, then click the button with the best-matching label. This updates the Data_labels.md in the raw data output folder, recording all the labels.

20. Count

Shows the current count of each valid label.

21. Delete

To reset existing labels back to Empty, choose the target frames in the frame list and click Del.

3-3 Folder Tree

+-- Debug: Application.StartupPath
|   +-- Output: save location for the skeleton data
|       +-- yyyy-MM-dd HH-mm-ss: one folder per recording
|           +-- Data.txt: frame indices and skeleton data
|           +-- Data_labels.md: frame indices and labels
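
A short Python sketch of walking this tree and reading each Data.txt; the path names come from the tree above, and the header-skipping follows section 3-4-3:

```python
import os

output_root = os.path.join("Debug", "Output")
for session in sorted(os.listdir(output_root)):      # yyyy-MM-dd HH-mm-ss folders
    data_path = os.path.join(output_root, session, "Data.txt")
    if not os.path.isfile(data_path):
        continue
    with open(data_path) as f:
        f.readline()             # skip the header line (ignored, see 3-4-3)
        for line in f:
            pass                 # each line: frame index + 75 skeleton floats
```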

3-4 HOW TO USE IT

3-4-1 Capture only

  • Make sure the depth camera is connected.
  • Click 5. Grab; the 1. Show Image area will then display images in real time.

3-4-2 Capture while automatically predicting

  • Check 6. Auto and choose a .pb model file; the program is then in Auto mode. Click 5. Grab to start capturing, and the 2. Show Label area will show the prediction in real time.

3-4-3 Capture while writing data

  1. First, click the 7. Write button. A folder named with the current time in the format yyyy-MM-dd HH-mm-ss is created under the Output folder, for example 2019-01-10 10-40-54.
    • Inside the 2019-01-10 10-40-54 folder, a text file named Data.txt is generated.
    • Data.txt: holds the skeleton data together with the frame index. Its first line is the header Skeleton data (X, Y, Z) * 25 points., which is ignored in the following processing.
  2. Second, click the 5. Grab button to capture images; the images and skeleton data are written at the same time (the skeleton data goes into Data.txt).
  • Tip: You can check both 6. Auto and 7. Write, but the program will take more time to process each frame.

3-4-4 Make a test

  1. Click 10. Load to load an existing .pb model file.
  2. Click 11. Test to open the file dialog and choose a sample; the prediction is made with the loaded model and the accuracy result is shown in the 3. Log Info area.

3-4-5 Label

  1. Click the 12. Tab Option to switch the program into Label mode.
  2. Click 13. Load Data and choose a Data.txt from the output folder; the frames and skeleton data are then loaded. The frames are shown in the 1. Show Image area and the skeleton data in the 14. Frame List area. A small flag in the list indicates that the data at that index is valid.
  3. Choose at least one frame in 14. Frame List, then click the appropriate button in 19. Make Label; all the chosen frames are labeled immediately (existing labels are overwritten). 20. Count updates the count for each label.
  4. If you made wrong labels and want to CANCEL them, click 21. Delete; the labels are reset to Empty.

3-4-6 Select automatically

  • While making labels, you do not need to select frames manually. Set 15. Frame Config instead: the first number, steps, is the maximum number of frames in one sample (default 60); the second number, cover, is the number of frames shared between two consecutive samples (default 30).
  • Click the Select button to run the automatic selection. The program always generates a batch of steps frames. Move the scrollbar to control the display speed.

3-4-7 Generate

  • In Auto mode, click the Generate button and the program generates multiple samples from the whole 14. Frame List.
  • In Mann (manual) mode, choose some frames first and then click the Generate button; the program generates a single sample from the chosen frames.
  • [One-Click Function] Hold the Shift key and click the Generate button in either mode, then choose a folder that contains all the yyyy-MM-dd HH-mm-ss folders. The program automatically generates samples from every folder.

3-4-8 Search

  • Choose an existing label from the left box, then use the two search buttons to jump to the previous or next frame with that label.

4 Deep Learning

All the code is based on Python 3.7 and Keras. The network uses a Bidirectional LSTM layer and a Dense layer. The final accuracy is around 91%.

4-1 Network structure

I use Keras and combine a BiLSTM layer with a Dense layer. Standing and Walking are hard to distinguish, so although the software framework keeps two separate labels for them, they are merged into the single label Walking for deep learning training. This leaves only 5 classes in the end.
(network structure figure)
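
A minimal Keras sketch of such a network, assuming input sequences of steps=60 frames with 75 features each; the LSTM width (64) is an assumption, not taken from the repo:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense

STEPS, FEATURES, NUM_CLASSES = 60, 75, 5   # 25 joints * (X, Y, Z); 5 merged classes

model = Sequential([
    Bidirectional(LSTM(64), input_shape=(STEPS, FEATURES)),  # hidden size assumed
    Dense(NUM_CLASSES, activation="softmax"),
])
model.summary()
```

The CRF variant evaluated in section 4-4 would add a sequence-labeling layer on top; it is omitted from this sketch.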

4-2 Data distribution

By frame numbers:

|       | Sitting | Walking | Standup | Sitdown | Falling | All   |
|-------|---------|---------|---------|---------|---------|-------|
| train | 10744   | 13434   | 8947    | 3334    | 13270   | 49729 |
| val   | 0       | 0       | 0       | 0       | 0       | 0     |
| test  | 2750    | 3779    | 2407    | 762     | 2877    | 12575 |
| All   | 13494   | 17213   | 11354   | 4096    | 16147   | 62304 |

By sample numbers (many frames in one sample):

|         | Train | Val | Test | All  |
|---------|-------|-----|------|------|
| samples | 892   | 0   | 223  | 1115 |

4-3 Training Parameters

  • batch_size_train=16
  • batch_size_val=8
  • epochs=15
  • learning_rate=1e-4
  • learning_rate_decay=1e-6
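
Plugging these values into a Keras training call might look like the snippet below, continuing the model sketch from 4-1. Only the numeric hyperparameters come from the list above; the Adam optimizer, the loss, and the dummy data are assumptions for illustration:

```python
import numpy as np
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical

# Dummy data with the shapes used in this README (60 frames * 75 features,
# 5 classes); random values are placeholders only.
x_train = np.random.rand(32, 60, 75).astype("float32")
y_train = to_categorical(np.random.randint(5, size=32), num_classes=5)
x_val = np.random.rand(8, 60, 75).astype("float32")
y_val = to_categorical(np.random.randint(5, size=8), num_classes=5)

model.compile(optimizer=Adam(learning_rate=1e-4, decay=1e-6),  # decay: pre-TF-2.11 arg
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train,
          batch_size=16,                  # batch_size_train
          epochs=15,
          validation_data=(x_val, y_val),
          validation_batch_size=8)        # batch_size_val
```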

4-4 Accuracy

With CRF (Conditional Random Field)

| Run | Train  | Train-Fall | Test   | Test-Fall |
|-----|--------|------------|--------|-----------|
| 1   | 0.9193 | 0.9414     | 0.9026 | 0.9369    |
| 2   | 0.9188 | 0.9469     | 0.9071 | 0.9529    |
| 3   | 0.9197 | 0.9541     | 0.8823 | 0.9449    |
| 4   | 0.9180 | 0.9502     | 0.8849 | 0.9323    |
| 5   | 0.9230 | 0.9493     | 0.8803 | 0.9374    |

Without CRF

| Run | Train  | Train-Fall | Test   | Test-Fall |
|-----|--------|------------|--------|-----------|
| 1   | 0.9244 | 0.9552     | 0.8976 | 0.9156    |
| 2   | 0.9229 | 0.9266     | 0.8741 | 0.8220    |
| 3   | 0.9260 | 0.9437     | 0.8964 | 0.9142    |
| 4   | 0.9137 | 0.9387     | 0.8983 | 0.9415    |
| 5   | 0.9138 | 0.9426     | 0.8996 | 0.9357    |

Tip: Train-Fall means the accuracy on the Falling label alone in the train set (likewise Test-Fall for the test set), because this project puts extra focus on the Falling posture.
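
In other words, the *-Fall columns are per-class accuracies restricted to the Falling label. A NumPy sketch of that metric (names are illustrative):

```python
import numpy as np

def per_label_accuracy(y_true, y_pred, label):
    """Accuracy over only the frames whose ground truth is `label`,
    e.g. label="Falling" gives the Train-Fall / Test-Fall columns."""
    mask = (y_true == label)
    return float(np.mean(y_pred[mask] == y_true[mask]))
```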

4-5 HOW TO USE IT

  1. Please use Anaconda to create a new environment. Open cmd and cd to the main folder where PostureRecognition.yml is located:
    conda env create -f PostureRecognition.yml
    A new env is then available. No matter which IDE you are using, make sure to run the .py files in the new env.
  2. Run ./data/sample_reduce.py to generate the training data, because the data in ./data/Samples/ contains many 'Walking' and 'Standing' labels (which are merged, see 4-1; an illustrative sketch follows this list).
  3. Run train_sequence.py to do the training and validation.
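
Conceptually, the label merging described in section 4-1 (Standing folded into Walking) amounts to something like the snippet below. This is an illustration of the idea only, not the actual contents of sample_reduce.py:

```python
MERGE = {"Standing": "Walking"}   # Standing and Walking become one class

def merge_label(label):
    """Map the 6 labeling-tool classes down to the 5 training classes."""
    return MERGE.get(label, label)
```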

5 Issues

5-1 Unstable detecting for human joints because of a high position

Unlike ZpRoc, who put the depth camera on the desktop, I fixed the depth camera to a pillar at a high position. This hurts Nuitrack's joint detection: Nuitrack suggests mounting the camera at a height of around 0.8 m to 1.2 m, but mine is at around 2 m. I suspect the poor performance comes from the changed perspective.

5-2 More data

I have only 1115 samples in total, each containing about 60 frames. This is not a big dataset, so more data should be collected.

5-3 Long cable

To get good performance from the depth camera, you should use a USB 3 cable. I bought a long cable, but it is USB 2, so the connection is sometimes unstable.

5-4 Joints match human body

As you can see from the software framework screenshots above, the joints do not match the human body very well. This is because the body image comes from the RGB camera while the joint data comes from the depth camera; matching them requires calibration between the two cameras.

6 Future Plan

Because of Nuitrack's poor performance in detecting joints, we plan to use the raw depth image from the depth camera: first find where the person is, then determine the person's posture.

(depth image figure)
