pqy000 / SemiTimeSeries

Semi-supervised time series classification

The main idea is to combine Mean Teacher with the series saliency module. Besides improving the accuracy of the model, it also enhances interpretability, both quantitatively and qualitatively. Compared with prior work that only improves accuracy, this may provide more insight.


Environment

  • scikit-learn 0.22.1
  • numpy 1.16.4
  • pytorch >= 1.7
  • torchgeometry 0.1.2

Dataset

The data comprise 6 publicly available datasets, downloadable from (Download link); save them under the datasets/ directory. The table below gives the detailed parameters of the three datasets on which I have completed experiments.

| Dataset | Train | Test | Dimension | Class |
| --- | --- | --- | --- | --- |
| UWaveGestureLibraryAll | 2688 | 894 | 945 | 8 |
| CricketX | 458 | 156 | 300 | 12 |
| InsectWingbeatSound | 1320 | 440 | 256 | 11 |
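UCR archive splits are plain-text files with the class label in the first column and the series values after it. A minimal loader sketch (the tab delimiter and exact file layout are assumptions about the downloaded files, illustrated here with an in-memory example):

```python
import io
import numpy as np

def load_ucr_split(path_or_buf, delimiter="\t"):
    """Load a UCR-style split: label in column 0, series values after it."""
    data = np.loadtxt(path_or_buf, delimiter=delimiter)
    labels = data[:, 0].astype(int)
    series = data[:, 1:].astype(np.float32)
    return series, labels

# Tiny synthetic example in the UCR layout (2 samples, 4 time steps).
demo = io.StringIO("1\t0.1\t0.2\t0.3\t0.4\n2\t0.5\t0.6\t0.7\t0.8\n")
X, y = load_ucr_split(demo)
print(X.shape, y.tolist())  # (2, 4) [1, 2]
```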

Structure

mainOurs.py exposes several options. The important ones the script takes, along with their descriptions, are listed below:

Options

  • --dataset

    • The experiments cover 16 datasets; the previous papers mainly run experiments on 6 of them. So far I have run each dataset several times (random seeds 0, 1, 2) and recorded the mean and variance. As the experiments show, there is a significant improvement over the previous SOTA results.
  • --model_name

    • It accepts one of three values:
      • SupCE: the supervised training procedure
      • SemiTime: the previous SOTA baseline
      • VT2: our method (VT2 + series saliency)
  • --label_ratio

    • The option is used to limit the proportion of labeled data.
  • --Saliency

    • The option indicates whether to use the series saliency module in MeanTeacher.
  • The remaining options control lower-level details.
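The options above can be sketched as an argparse parser. Only the flag names come from this README; the defaults, types, and help strings here are assumptions:

```python
import argparse

# Sketch of the option parser described above; defaults are assumptions,
# only the flag names are taken from the README.
parser = argparse.ArgumentParser(
    description="Semi-supervised time series classification")
parser.add_argument("--dataset", type=str, default="CricketX")
parser.add_argument("--model_name", type=str, default="VT2",
                    choices=["SupCE", "SemiTime", "VT2"])
parser.add_argument("--label_ratio", type=float, default=0.4,
                    help="proportion of labeled training data")
parser.add_argument("--Saliency", action="store_true",
                    help="enable the series saliency module in MeanTeacher")
parser.add_argument("--gpu", type=int, default=0)

# Example invocation mirroring the usage section below.
args = parser.parse_args(["--model_name", "VT2", "--dataset", "CricketX",
                          "--label_ratio", "0.4", "--Saliency"])
print(args.model_name, args.label_ratio, args.Saliency)  # VT2 0.4 True
```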

Directory

  • optim/
    • Under the optim/ directory, there are several semi-supervised learning methods.
      • generalWay.py contains our implemented method
      • pretrain.py contains the baseline
  • model/
    • The main DL architecture is a temporal convolutional neural network.
  • Dataloader/
    • This directory is required; it contains the data loaders that read the UCR time series classification data. In our implementation, the data used to compute the consistency loss are sampled from both the labeled and unlabeled sets.
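The sampling step described above can be sketched in a few lines. This is a simplified numpy illustration of drawing a consistency-loss batch from the union of labeled and unlabeled examples, not the repository's actual loader:

```python
import numpy as np

def sample_consistency_batch(x_labeled, x_unlabeled, batch_size, rng):
    """Draw a consistency-loss batch from the union of labeled and
    unlabeled examples, as the Dataloader/ description suggests."""
    pool = np.concatenate([x_labeled, x_unlabeled], axis=0)
    idx = rng.choice(len(pool), size=batch_size, replace=False)
    return pool[idx]

rng = np.random.default_rng(0)
x_lab = np.zeros((10, 128))   # 10 labeled series of length 128
x_unl = np.ones((30, 128))    # 30 unlabeled series
batch = sample_consistency_batch(x_lab, x_unl, batch_size=8, rng=rng)
print(batch.shape)  # (8, 128)
```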

Usage example

Some example commands for running the code:

```shell
## MeanTeacher
python mainOurs.py --model_name VT2 --dataset=CricketX --gpu=2 --label_ratio 0.4
## SemiTime method
python mainOurs.py --model_name SemiTime --dataset=CricketX --gpu=2 --label_ratio 0.4
## Supervised method
python mainOurs.py --model_name SupCE --dataset=CricketX --gpu=2 --label_ratio 0.4
```

Architecture

The model architecture is straightforward: we apply the VT2 method to semi-supervised learning on time series and combine it with the previously proposed series saliency module. The figure illustrates the design idea; the implementation details are in the code. At present, the algorithm significantly improves accuracy, and we have validated that the series saliency module is helpful in semi-supervised learning. This is good news! 🎉 🎉 😄
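The Mean Teacher component keeps a teacher model whose weights are an exponential moving average of the student's weights. A minimal sketch of that update (the decay value 0.99 is a typical choice, not taken from this repository):

```python
import numpy as np

def ema_update(teacher_params, student_params, alpha=0.99):
    """Mean Teacher update: the teacher's weights track an exponential
    moving average (EMA) of the student's weights after each step."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_params, student_params)]

# Toy parameters: teacher starts at 0, student is at 1.
teacher = [np.zeros(3)]
student = [np.ones(3)]
teacher = ema_update(teacher, student, alpha=0.9)
print(teacher[0])  # [0.1 0.1 0.1]
```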

The second part uses series saliency for interpretation in time series semi-supervised learning. I will implement the code and migrate it from time series forecasting to time series classification, providing more quantitative and qualitative analysis. The motivation is to observe the learning procedure as the label size increases; this may require more domain knowledge and some cherry-picked visualizations.
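To make the interpretation idea concrete, here is a simple perturbation-style saliency sketch: slide a window over the series, replace it with its mean, and record how much the model's score changes. This is only an illustration of the general occlusion idea, not the repository's series saliency module:

```python
import numpy as np

def occlusion_saliency(series, predict_fn, window=4):
    """Attribute importance to time steps by occluding a sliding window
    (replacing it with its mean) and measuring the score change."""
    base = predict_fn(series)
    saliency = np.zeros_like(series)
    for start in range(0, len(series) - window + 1):
        perturbed = series.copy()
        perturbed[start:start + window] = perturbed[start:start + window].mean()
        saliency[start:start + window] += abs(base - predict_fn(perturbed))
    return saliency

# Toy model: the score is just the value at time step 5, so the
# saliency map should peak at that time step.
x = np.arange(16, dtype=float)
sal = occlusion_saliency(x, predict_fn=lambda s: s[5])
print(sal.argmax())  # 5
```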

Finally, I believe the easy-to-implement series saliency can significantly improve both prediction accuracy and interpretability, contributing to time series semi-supervised learning!

Experiments results

We mainly compare against the two latest papers on time series semi-supervised learning. The second paper reproduces the results of the first and also implements some other baseline methods (such as the Pi model and pseudo-labeling); I only include the strongest baseline in the table. The experimental results show that series saliency is also an effective augmentation. More visualization results (e.g., t-SNE) will be added.


| Label Ratio | 10% | 20% | 40% | 100% |
| --- | --- | --- | --- | --- |
| **Dataset: ChinaTown** | | | | |
| SemiTime | 44.88 (3.13) | 51.61 (1.22) | 58.71 (2.78) | 65.66 (1.58) |
| MeanTeacher | 45.54 (1.16) | 51.59 (1.98) | 62.87 (1.69) | 67.32 (0.12) |
| MT w/ SS | 47.31 (2.21) | 53.87 (1.12) | 63.45 (1.28) | 69.31 (0.11) |
| **Dataset: MFPT** | | | | |
| SemiTime | 54.96 (1.61) | 59.01 (1.56) | 62.38 (0.76) | 66.57 (0.67) |
| MeanTeacher | 56.33 (2.1) | 61.21 (2.17) | 63.37 (0.92) | 67.53 (1.98) |
| VT2 w/ SS | 57.24 (2.27) | 61.47 (1.91) | 64.9 (2.1) | 68.99 (1.98) |
| **Dataset: Epilep** | | | | |
| SemiTime | 81.46 (0.60) | 84.57 (0.49) | 86.91 (0.47) | 90.29 (0.32) |
| MeanTeacher | 91.92 (1.52) | 92.11 (0.32) | 94.37 (0.30) | 95.13 (0.21) |
| VT2 w/ SS | 92.28 (0.51) | 94.94 (0.68) | 96.36 (0.71) | 97.11 (0.11) |
| **Dataset: MFPT** | | | | |
| SemiTime | 64.16 (0.85) | 69.84 (0.94) | 76.49 (0.54) | 84.33 (0.50) |
| MeanTeacher | | | | |
| MT w/ SS | | | | |
| **Dataset: Epilep** | | | | |
| SemiTime | 74.86 (0.42) | 75.54 (0.63) | 77.01 (0.79) | 79.26 (1.20) |
| MeanTeacher | | | | |
| MT w/ SS | | | | |

Reference

The two papers were not presented at top-tier conferences; I think the main reason is the lack of further analysis of semi-supervised learning.

[1] Semi-Supervised Time Series Classification by Temporal Relation Prediction

[2] Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning

[3] Self-supervised Learning for Semi-supervised Time Series Classification

[3]Self-supervised Learning for Semi-supervised Time Series Classification
