xiaoshengli / SPF

Code for "Linear Time Complexity Time Series Clustering with Symbolic Pattern Forest"

Symbolic Pattern Forest (SPF)

This repository contains the code accompanying the paper "Linear Time Complexity Time Series Clustering with Symbolic Pattern Forest" (Xiaosheng Li, Jessica Lin, and Liang Zhao, IJCAI 2019). The paper proposes a time series clustering algorithm with linear time complexity.

To Compile the Code

On a Linux system, compile with:

g++ -O3 SPF.cpp libmetis.a -std=c++11 -o SPF

Alternatively, use the precompiled binary SPF included in the folder.

To Run the Code

./SPF [datasetname] [ensemble_size]

[datasetname] is the name of the dataset to run. The user needs to provide a folder named [datasetname] that contains a training file [datasetname]_TRAIN and a testing file [datasetname]_TEST, both in the UCR Archive format. [ensemble_size] is the ensemble size. See the FaceFour example included in the directory.
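For readers unfamiliar with the UCR Archive format, the following is a minimal sketch of how one line of such a file could be parsed. It is not taken from SPF.cpp. The names LabeledSeries and parse_ucr_line are hypothetical, and the sketch assumes that the first field of each line is the class label and that fields are separated by commas, tabs, or spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One labeled time series: the class label followed by the observations.
// (Hypothetical type, used only for this illustration.)
struct LabeledSeries {
    double label;
    std::vector<double> values;
};

// Parse one line of a UCR-style file.
// Assumption: the first field is the class label, and fields are separated
// by commas, tabs, or spaces.
LabeledSeries parse_ucr_line(std::string line) {
    for (char &c : line) {
        if (c == ',' || c == '\t') c = ' ';  // normalize delimiters to spaces
    }
    std::istringstream in(line);
    LabeledSeries s{0.0, {}};
    in >> s.label;                           // first field: class label
    double v;
    while (in >> v) s.values.push_back(v);   // remaining fields: the series
    return s;
}
```

Calling parse_ucr_line on each line of [datasetname]_TRAIN and [datasetname]_TEST would yield one labeled series per line under these assumptions.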

Example

./SPF FaceFour 100

Output:

dataset:FaceFour, ensemble size:100
rand index: 1
The running time is: 1.860000seconds
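The reported rand index measures how well the clustering agrees with the ground-truth labels; a value of 1 means the two partitions are identical up to a renaming of clusters. The sketch below illustrates the standard definition of the Rand index and is not taken from SPF.cpp. The function name rand_index and the choice to return 0 for invalid input are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Rand index between two labelings of the same n items:
// (pairs grouped together in both + pairs separated in both) / (n choose 2).
// This pairwise loop is quadratic in n and is meant only as an illustration.
double rand_index(const std::vector<int>& truth, const std::vector<int>& pred) {
    const std::size_t n = truth.size();
    if (n != pred.size() || n < 2) return 0.0;  // assumed convention for invalid input
    std::size_t agree = 0;
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const bool same_truth = (truth[i] == truth[j]);
            const bool same_pred = (pred[i] == pred[j]);
            if (same_truth == same_pred) ++agree;  // both labelings treat the pair alike
            ++total;
        }
    }
    return static_cast<double>(agree) / static_cast<double>(total);
}
```

Because only pair memberships are compared, swapping cluster names (for example, labels {0,0,1,1} against {1,1,0,0}) still gives a Rand index of 1.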

Note

The code reads each line of the input file into a char array buffer of size 1000000. If the time series are very long, a line of the input file may contain more characters than the buffer can hold. In that case, increase the buffer limit (MAX_PER_LINE, line 32 of SPF.cpp) accordingly.

Citation

@inproceedings{ijcai2019-406,
  title     = {Linear Time Complexity Time Series Clustering with Symbolic Pattern Forest},
  author    = {Li, Xiaosheng and Lin, Jessica and Zhao, Liang},
  booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on
               Artificial Intelligence, {IJCAI-19}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  pages     = {2930--2936},
  year      = {2019},
  month     = {7},
  doi       = {10.24963/ijcai.2019/406},
  url       = {https://doi.org/10.24963/ijcai.2019/406},
}

License: GNU General Public License v3.0

