We propose a semantic and adaptive graph network (SAGN) that requires relatively little computation. Figure 1 shows the overall framework. We fuse the dynamic feature information (position, motion, and velocity) with the bone information (the bones themselves and bone information based on velocity difference), and add the semantic information (joint type and frame index). All fusion is done by splicing, i.e. concatenation. The fused data is fed into a three-layer adaptive GCN whose adjacency matrix is learned rather than fixed, so the graph structure adapts dynamically to the data. The output of the adaptive GCN is then spliced with the frame-index semantic information. A three-layer CNN follows, used mainly to extract features along the time dimension. We evaluate the model on two public datasets, NTU-RGB+D 60 and NTU-RGB+D 120, and obtain good results.
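The fusion step described above can be illustrated with a minimal NumPy sketch. It is not the repository's implementation: the function name `fuse_features`, the `parents` list, and the choice to concatenate raw (unembedded) features are assumptions made for illustration. The text does not say how "motion" differs from "velocity", so the sketch uses a single frame-to-frame difference for both.

```python
import numpy as np

def fuse_features(joints, parents):
    """Concatenate position, velocity, bone, bone-velocity and joint-type channels.

    joints:  (T, V, 3) array of joint positions over T frames and V joints.
    parents: length-V sequence, parents[v] is the parent joint of v
             (hypothetical skeleton layout; the root points to itself).
    """
    T, V, _ = joints.shape
    # Velocity: frame-to-frame position difference, zero for the first frame.
    velocity = np.zeros_like(joints)
    velocity[1:] = joints[1:] - joints[:-1]
    # Bone: vector from each joint's parent to the joint.
    bone = joints - joints[:, parents, :]
    # Bone information based on velocity difference: temporal change of each bone.
    bone_velocity = np.zeros_like(bone)
    bone_velocity[1:] = bone[1:] - bone[:-1]
    # Semantic information: one-hot joint type, repeated for every frame.
    joint_type = np.broadcast_to(np.eye(V, dtype=joints.dtype), (T, V, V))
    # "Splicing" is taken here to mean concatenation along the channel axis.
    return np.concatenate(
        [joints, velocity, bone, bone_velocity, joint_type], axis=-1
    )
```

For a sequence with `V` joints, the result has `12 + V` channels per joint and frame: four 3-D feature groups plus the one-hot joint type.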
Figure 1: Framework of the proposed semantic and adaptive graph network (SAGN). In the data fusion part, dynamic feature information, bone information, and the joint type from the semantic information are fused by splicing. In the adaptive GCN part, a three-layer graph convolution structure learns the graph structure of each sample, using the idea of residual connections. The learned graph structure is spliced with the frame index from the semantic information to form new fused data. In the temporal triple-CNN part, a three-layer convolution is applied with two MaxPooling operations.
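One way to picture a single adaptive graph convolution layer with a learned, per-sample adjacency matrix is the NumPy sketch below. The exact formula is not given here, so this assumes an SGN-style adjacency (a softmax over the similarity of two embeddings of the joint features) plus a linear residual branch. The names `adaptive_gcn_layer`, `w_theta`, `w_phi`, `w_out`, and `w_res` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_gcn_layer(x, w_theta, w_phi, w_out, w_res):
    """One graph convolution with a data-dependent adjacency (assumed form).

    x:       (V, C_in) joint features for one frame of one sample.
    w_theta: (C_in, C_emb) and w_phi: (C_in, C_emb) embedding weights.
    w_out:   (C_in, C_out) weights applied after neighbour aggregation.
    w_res:   (C_in, C_out) weights of the residual branch.
    """
    # Adjacency computed from the sample itself: row-normalised similarity
    # between two learned embeddings of the joints.
    adjacency = softmax((x @ w_theta) @ (x @ w_phi).T, axis=-1)  # (V, V)
    # Aggregate neighbours, transform, and add the residual branch.
    out = adjacency @ x @ w_out + x @ w_res
    return np.maximum(out, 0.0), adjacency  # ReLU activation
```

Stacking three such layers, each with its own weights, would give a three-layer structure of the kind the caption describes. In a real model the weights are trained by backpropagation rather than passed in by hand.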
Figure 2: The working mechanism of adaptive GCN. There are two types of adjacency matrices,
The code is built with the following libraries:
We use the NTU-RGB+D 60 dataset as an example. First, download the NTU-RGB+D dataset.
- Extract the dataset to ./data/ntu/nturgb+d_skeletons/
- Process the data
cd ./data/ntu
# Get skeleton of each performer
python get_raw_skes_data.py
# Remove the bad skeleton
python get_raw_denoised_data.py
# Transform the skeleton to the center of the first frame
python seq_transformation.py
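As an illustration of the last step, centering a sequence on its first frame amounts to subtracting one reference position from every joint in every frame. The sketch below is an assumption about what this transformation does, not a copy of `seq_transformation.py`. The function name and the default reference joint index are hypothetical.

```python
import numpy as np

def center_on_first_frame(joints, center_joint=1):
    """Translate a skeleton sequence so a reference joint starts at the origin.

    joints:       (T, V, 3) array of joint positions.
    center_joint: index of the reference joint (assumed value for illustration).
    """
    # Position of the reference joint in the first frame becomes the new origin.
    origin = joints[0, center_joint]  # shape (3,)
    # Broadcasting subtracts the same origin from all joints in all frames.
    return joints - origin
```

After this shift, the reference joint sits at the origin in the first frame, and every later frame is expressed relative to that starting position.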
- Train the model
# For the X-Sub setting
python main.py --network SGN --train 1 --case 0
# For the X-View setting
python main.py --network SGN --train 1 --case 1
- Test the pre-trained models (./results/NTU/SGN/)
# For the X-Sub setting
python main.py --network SGN --train 0 --case 0
# For the X-View setting
python main.py --network SGN --train 0 --case 1
- Note that parts of the code reference SGN.