jianzongwu / Disentangled-Clustering-Contrastive-Learning-of-Time-Series-Representation

Time Series Representation -- Cross dataset

Folders:

  • eval/: experimental results (on UCR archive)
  • training/: trained models in experiments (corresponding to eval/)
  • models/
  • data/
  • ./*.ipynb: experimental notebooks

Requirements

  • Python 3.8
  • torch==1.8.1
  • scipy==1.6.1
  • numpy==1.19.2
  • pandas==1.0.1
  • scikit_learn==0.24.2

The dependencies can be installed by:

pip install -r requirements.txt

Data

Two kinds of data are used: the 128 UCR datasets, and processed datasets in data/processed (most of which are several UCR datasets concatenated).

  • The 128 UCR datasets should be placed in data/ so that each data file can be located at data/UCRArchive_2018/<dataset_name>/<dataset_name>_*.tsv.
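The path convention above can be sketched as a small loader. This is an illustration only, not the repository's own loading code: the function name load_ucr_split is hypothetical, and it assumes the usual UCR 2018 TSV layout (one series per row, class label in the first column, files named <dataset_name>_TRAIN.tsv and <dataset_name>_TEST.tsv).

```python
import csv
from pathlib import Path


def load_ucr_split(root, dataset_name, split):
    """Read one split ("TRAIN" or "TEST") of a UCR 2018 dataset.

    Hypothetical helper. Assumes one series per row, with the class
    label in the first column and the observations in the rest.
    """
    path = Path(root) / "UCRArchive_2018" / dataset_name / f"{dataset_name}_{split}.tsv"
    labels, series = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            labels.append(row[0])  # labels are kept as strings
            series.append([float(v) for v in row[1:]])
    return labels, series
```

With the archive unpacked under data/, a call such as load_ucr_split("data", "Coffee", "TRAIN") would return the labels and the series of that split.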

Usage

To train and evaluate, run the following command:

python train.py <dataset_name> --run_name <run_name> --datapath <data_path> --iters <iter_num> --eval

Note that the evaluation is run on all UCR datasets.

The arguments are described below:

Parameter name Description of parameter
dataset_name The dataset name
run_name The folder name used to save the model, output and evaluation metrics. This can be set to any word
gpu The GPU number used for training and inference (defaults to 0)
batch_size The batch size (defaults to 10)
lr The learning rate (defaults to 0.001)
K The number of cluster encoders (defaults to 3)
sim_fun The cluster similarity function (defaults to cosine)
cate_fun The cluster function (defaults to softmax)
iters The number of training iterations (defaults to 2000)
save_every Save the checkpoint every <save_every> iterations/epochs (defaults to 100)
valid Whether to save and validate the model every <save_every> iterations (defaults to False)
latest Whether to save the model in a latest model folder (defaults to False)
seed The random seed (defaults to 0)
max-threads The maximum allowed number of threads used by this process (defaults to 8)
eval Whether to perform evaluation after training
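To make the defaults above concrete, here is a minimal argparse sketch that mirrors the table. It is a reconstruction, not the actual parser in train.py: the flag spellings (a leading -- on every optional argument), the argument types, and the store_true handling of valid, latest and eval are assumptions, and no default is documented for run_name or datapath.

```python
import argparse


def build_parser():
    """Hypothetical parser reconstructed from the documented arguments."""
    p = argparse.ArgumentParser(description="Sketch of the documented train.py arguments")
    p.add_argument("dataset_name", help="The dataset name")
    p.add_argument("--run_name", default=None, help="Folder name for saved outputs (no default documented)")
    p.add_argument("--datapath", default=None, help="Data path (no default documented)")
    p.add_argument("--gpu", type=int, default=0)
    p.add_argument("--batch_size", type=int, default=10)
    p.add_argument("--lr", type=float, default=0.001)
    p.add_argument("--K", type=int, default=3, help="Number of cluster encoders")
    p.add_argument("--sim_fun", default="cosine")
    p.add_argument("--cate_fun", default="softmax")
    p.add_argument("--iters", type=int, default=2000)
    p.add_argument("--save_every", type=int, default=100)
    p.add_argument("--valid", action="store_true")   # defaults to False
    p.add_argument("--latest", action="store_true")  # defaults to False
    p.add_argument("--seed", type=int, default=0)
    p.add_argument("--max-threads", type=int, default=8)  # stored as args.max_threads
    p.add_argument("--eval", action="store_true")
    return p


# Parse a command line shaped like the one in the Usage section.
args = build_parser().parse_args(["6_data", "--run_name", "demo", "--iters", "500", "--eval"])
print(args.dataset_name, args.iters, args.batch_size, args.eval)
```

Arguments that are not passed keep the defaults listed in the table, so only iters and eval differ from them in this call.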

After training and evaluation, the trained encoder, output and evaluation metrics can be found in training/DatasetName__RunName_Date_Time/.
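A short sketch of how the output folders for a given run might be located afterwards. The helper name find_run_dirs is hypothetical, and because the exact Date_Time format is not stated here, the timestamp part is matched with a wildcard rather than parsed.

```python
from pathlib import Path


def find_run_dirs(training_root, dataset_name, run_name):
    """Return folders named <dataset_name>__<run_name>_<anything>, sorted by name.

    Hypothetical helper: the Date_Time suffix is matched with a wildcard.
    """
    pattern = f"{dataset_name}__{run_name}_*"
    return sorted(p for p in Path(training_root).glob(pattern) if p.is_dir())
```

For example, find_run_dirs("training", "6_data", "demo") would list every folder whose name starts with 6_data__demo_. A run name that is a prefix of another run name followed by an underscore would match both, so distinct run names are safer.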

Code Example

from model import OursModel
import utils

# Load 6_data (formed from 6 UCR datasets)
train_data, train_labels, test_data, test_labels = utils.load_UCR_dataset('./data/processed', '6_data')
# (Both train_data and test_data have a shape of n_instances x n_timestamps)

# Train a model
model = OursModel(
    input_dims=1,
    device=0,
    output_dims=320
)
loss_log = model.fit(
    train_data,
    verbose=True
)

# Compute representations for test set
test_repr = model.encode(test_data)  # n_instances x output_dims

# other tests on representations
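As one possible test on the representations, the sketch below scores them with a 1-nearest-neighbour classifier. It is not the evaluation protocol used in eval/; the function name one_nn_accuracy is hypothetical, and the random arrays are synthetic stand-ins for the outputs of model.encode(train_data) and model.encode(test_data), which are assumed to have shape n_instances x output_dims.

```python
import numpy as np


def one_nn_accuracy(train_repr, train_labels, test_repr, test_labels):
    """Accuracy of a 1-nearest-neighbour classifier (Euclidean distance)."""
    train_repr = np.asarray(train_repr, dtype=float)
    test_repr = np.asarray(test_repr, dtype=float)
    train_labels = np.asarray(train_labels)
    test_labels = np.asarray(test_labels)
    # Squared Euclidean distance between every test and every train vector
    diff = test_repr[:, None, :] - train_repr[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    nearest = dist.argmin(axis=1)  # index of the closest training vector
    return float((train_labels[nearest] == test_labels).mean())


# Synthetic stand-ins for encoded data: two well-separated clusters
rng = np.random.default_rng(0)
train_repr = np.concatenate([rng.normal(0, 0.1, (20, 8)), rng.normal(5, 0.1, (20, 8))])
train_labels = np.array([0] * 20 + [1] * 20)
test_repr = np.concatenate([rng.normal(0, 0.1, (5, 8)), rng.normal(5, 0.1, (5, 8))])
test_labels = np.array([0] * 5 + [1] * 5)

acc = one_nn_accuracy(train_repr, train_labels, test_repr, test_labels)
print(f"1-NN accuracy: {acc:.2f}")
```

The pairwise difference array has shape n_test x n_train x output_dims, so for large datasets a chunked or library-based nearest-neighbour search would be a better fit.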
