SSSGCN

The code and data are currently being organized and will be open-sourced once they pass our de-identification review.

Datasets

We collected real Tai Chi video data, professionally annotated with scores by sports experts. The dataset is intended to capture the complex features underlying action quality, in contrast to traditional classification-based evaluations that grade actions into levels such as A, B, C, or D.

Why we use continuous variables as labels: The granularity of a classification-based rating scheme can usually be changed, but only by reorganizing the dataset and retraining the model, and finer-grained classification tasks quickly become harder. Adopting smoothed, continuous labels with a regression model yields both higher performance and finer-grained assessments, which better match real examination and teaching scenarios. Although this requires more effort, it is closer to real-world applications.
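
As a rough illustration (not the repository's code; names and shapes are made up for the example), the two labelling schemes lead to different training targets and losses:

```python
# Hypothetical example contrasting discrete grade labels with the continuous
# score labels described above; not the repository's training code.
import torch
import torch.nn as nn

# Discrete grading: four levels (A/B/C/D) -> cross-entropy on class indices.
logits = torch.randn(8, 4)                      # batch of 8 class predictions
grades = torch.randint(0, 4, (8,))              # expert grades as class ids
cls_loss = nn.CrossEntropyLoss()(logits, grades)

# Continuous scoring: expert scores normalized to [0, 1] -> regression with MSE.
pred_scores = torch.sigmoid(torch.randn(8, 1))  # scalar score per clip
true_scores = torch.rand(8, 1)                  # annotated continuous scores
reg_loss = nn.MSELoss()(pred_scores, true_scores)
```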

Why we don't directly compare feature values, as in face recognition: In action scoring, directly comparing feature vectors can discard the spatial and temporal structure of the movement. Moreover, sports experts point out that scoring should not rely solely on similarity between actions; it involves a degree of subjectivity and artistry. We want the dataset to carry this information and the model to be able to represent it.

https://drive.google.com/drive/folders/1ZTsiah25xqdNVz9kxE4-tHAG2uSbF-AC?usp=drive_link

Augmentation

(Figures) Time-domain augmentation distributions: 8k_aug and 16k_aug.
(Figures) Score-range distributions of generated samples: principle=0.4, principle=0.6, principle=1.0, and the balanced distribution after clipping.
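
A minimal sketch of the balancing step suggested by the last figure (clipping each score bin to a common count). The bin edges and cap are assumptions, the `principle` parameter is not modelled here, and this is not the repository's code:

```python
# Assumed balancing procedure: keep at most `cap` generated samples per score
# bin so that no score range dominates the augmented set.
import numpy as np

def clip_balance(samples, scores, n_bins=10, cap=None):
    """Clip each score bin to `cap` samples (default: size of the smallest non-empty bin)."""
    scores = np.asarray(scores)
    edges = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]   # interior bin edges on [0, 1]
    bins = np.digitize(scores, edges)                 # bin index per sample
    counts = np.bincount(bins, minlength=n_bins)
    cap = cap or int(counts[counts > 0].min())
    keep = []
    for b in range(n_bins):
        idx = np.where(bins == b)[0]
        keep.extend(idx[:cap].tolist())               # drop the surplus in crowded bins
    keep = sorted(keep)
    return [samples[i] for i in keep], scores[keep]
```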

One-Stage

We initially aimed to perform classification and regression simultaneously with a one-stage approach, and, under the guidance of experts, we designed a reasonable data augmentation method. However, the final classification and regression performance (model iv below) did not reach the metrics we expected.
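
A minimal sketch of such a one-stage objective, assuming the two head losses are simply combined with a weight (an illustration, not the repository's training code):

```python
# Hypothetical joint objective for a shared backbone with two heads:
# cross-entropy for the grade head plus weighted MSE for the score head.
import torch.nn.functional as F

def one_stage_loss(cls_logits, reg_pred, cls_target, score_target, reg_weight=1.0):
    cls_loss = F.cross_entropy(cls_logits, cls_target)
    reg_loss = F.mse_loss(reg_pred.squeeze(-1), score_target)
    return cls_loss + reg_weight * reg_loss
```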

Model Structure

i) and ii)

(figure: architecture of variants i and ii)

iii)

(figure: architecture of variant iii)

iv)

(figure: architecture of variant iv)

Exp

| Model | Taichi score MAE | Taichi classification Acc |
| ----- | ---------------- | ------------------------- |
| i     | 0.2021           | 59.17%                    |
| ii    | 0.0965           | 84.42%                    |
| iii   | 0.0862           | 86.26%                    |
| iv    | 0.0782           | 95.58%                    |

i) Extract features with the ST-GCN backbone and feed the resulting feature map into both the classification and regression heads, using CoLU activations.

ii) Building upon i, apply the data augmentation described above.

iii) Building upon ii, split the feature map along the spatial dimension into two parts, and then separately feed them into the classification and regression heads.

iv) Building upon iii, concatenate the feature embedding from the classification head with the input to the regression head (see the sketch below).
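
The sketch below illustrates variants iii and iv under assumed shapes and layer sizes; it is not the repository's code, and it uses ReLU in place of CoLU:

```python
# Hypothetical dual-head module: the backbone feature map (N, C, T, V) is split
# along the spatial (joint) axis V; one half feeds the classification head and
# the other the regression head (variant iii). In variant iv the classification
# embedding is concatenated onto the regression input.
import torch
import torch.nn as nn

class DualHead(nn.Module):
    def __init__(self, channels=256, num_classes=24, concat_cls_embed=True):
        super().__init__()
        self.concat_cls_embed = concat_cls_embed       # True -> variant iv
        self.cls_embed = nn.Linear(channels, 128)
        self.cls_out = nn.Linear(128, num_classes)
        reg_in = channels + (128 if concat_cls_embed else 0)
        self.reg_head = nn.Sequential(
            nn.Linear(reg_in, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid())           # score in [0, 1]

    def forward(self, feat):                           # feat: (N, C, T, V)
        half = feat.size(-1) // 2
        cls_feat = feat[..., :half].mean(dim=(2, 3))   # pool the cls half -> (N, C)
        reg_feat = feat[..., half:].mean(dim=(2, 3))   # pool the reg half -> (N, C)
        embed = torch.relu(self.cls_embed(cls_feat))
        logits = self.cls_out(embed)
        if self.concat_cls_embed:                      # variant iv only
            reg_feat = torch.cat([reg_feat, embed], dim=1)
        score = self.reg_head(reg_feat)
        return logits, score
```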

Two-Stage

Cls Exp


NTU-RGB-D Ablation

ST-GCN vs STD-GCN vs SST-GCN vs SSTD-GCN vs ST-GCN++ vs SSTD-GCN++ Demo

Open In Colab

Google Colab demo. Note: the metrics in the Colab demo may vary slightly due to library version changes, but the overall performance should be approximately the same.

Taichi Cls Ablation

Open In Colab

Google Colab Demo

| Model | NTU-RGB-D Acc | Taichi Acc | Param. (M) | FLOPs (G) |
| ----- | ------------- | ---------- | ---------- | --------- |
| ST-GCN | 76.00% | 65.47% | 0.17 | 0.20 |
| STGL-GCN | 77.50% | 83.75% | 2.78 | 1.89 |
| SSTD-GCN (ours) | 87.00% | 99.17% | 0.18 | 0.11 |
| ST-GCN++ | 90.50% | 93.33% | 3.09 | 0.60 |
| SSTD-GCN++ (ours, embedded into ST-GCN++) | 92.00% | 99.58% | 0.32 | 0.61 |

Reg Exp

Taichi Scoring Reg Ablation

| Model | Spatial Separate | Temporal Dilation | Taichi score MAE |
| ----- | ---------------- | ----------------- | ---------------- |
| ix    |                  |                   | 0.0355 |
| x     |                  |                   | 0.0295 |
| xi    | ✔️               |                   | 0.0243 |
| xii   |                  | ✔️                | 0.0261 |
| xiii  | ✔️               | ✔️                | 0.0196 |
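
For context, the "Temporal Dilation" column can be illustrated by a dilated temporal convolution over the skeleton feature map, which enlarges the temporal receptive field without adding parameters; the kernel size and dilation below are assumptions, not the repository's settings:

```python
# Hypothetical dilated temporal convolution for a (N, C, T, V) skeleton tensor.
import torch
import torch.nn as nn

class DilatedTemporalConv(nn.Module):
    def __init__(self, channels=256, kernel_size=9, dilation=2):
        super().__init__()
        pad = (kernel_size - 1) // 2 * dilation        # preserve temporal length
        self.conv = nn.Conv2d(channels, channels,
                              kernel_size=(kernel_size, 1),
                              padding=(pad, 0),
                              dilation=(dilation, 1))  # dilate only along time

    def forward(self, x):                              # x: (N, C, T, V)
        return self.conv(x)

x = torch.randn(2, 256, 64, 17)                        # 64 frames, 17 joints
print(DilatedTemporalConv()(x).shape)                  # torch.Size([2, 256, 64, 17])
```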
