benfulcher / hctsa

Highly comparative time-series analysis

Home Page: https://time-series-features.gitbook.io/hctsa-manual/

Multivariate time series analysis

smetanadvorak opened this issue

Hello,

Thank you for such an amazing tool. I'm wondering how to approach multivariate time series using hctsa. Is there a special way of assigning the keywords before using TS_LabelGroups?

Thank you in advance,
Konstantin

Hi Konstantin. No worries, and thanks for your interest ☺️
Depending on your problem, you can use hctsa for multivariate time series in different ways. You'll probably want to incorporate some measures of coupling between time series (which are not part of hctsa). How best to assign labels to time series using TS_LabelGroups depends on the problem. If you give me more info, perhaps I can give more specific advice...
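For the keyword question specifically: keywords are assigned per time series in the input file you give to TS_Init, and TS_LabelGroups then turns matching keywords into group labels. A rough sketch of that workflow (the file name, label strings, and keyword strings are just placeholders; see the hctsa manual for the exact input format):

```matlab
% Each time series gets a comma-delimited keyword string in the input file;
% TS_LabelGroups later matches class keywords against these.
timeSeriesData = {randn(1000,1); randn(1000,1)};      % your recordings go here
labels   = {'subj1_trial1_ch1'; 'subj1_trial2_ch1'};  % unique name per time series
keywords = {'classA,ch1'; 'classB,ch1'};              % class keyword + any extra tags
save('INP_emg.mat','timeSeriesData','labels','keywords');

TS_Init('INP_emg.mat');        % initialize HCTSA.mat from the input file
% ... run TS_Compute and TS_Normalize as usual ...
TS_LabelGroups('norm',{'classA','classB'});  % assign the two class labels by keyword
```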

Thanks for the quick answer, Ben.
I'm working on a two-class classification problem where each data instance contains 8-channel EMG. Let's assume that I use only single-channel features (no coupling between time series). The way I see it: I extract the features for each of the 8 channels, then stack them into one feature vector. Then I do everything needed to run TS_TopFeatures and ... get the top feature-channel pairs :)
Well, having written it out, I see that this approach can't be realised just by some smart labelling. I should probably look for the top features of each channel separately.
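To make the stacking step concrete, here is a rough MATLAB sketch, assuming featMats is a hypothetical cell array where featMats{c} holds the trials-by-features hctsa matrix for channel c:

```matlab
% featMats{c}: [numTrials x numFeatures] feature matrix for channel c (hypothetical)
numChannels = numel(featMats);
numFeatures = size(featMats{1},2);

% Stack channels side by side: one wide vector of feature-channel pairs per trial
Xstacked = horzcat(featMats{:});    % [numTrials x numFeatures*numChannels]

% Bookkeeping so each stacked column can be traced back to its (feature, channel) pair
chanOfCol = repelem(1:numChannels,numFeatures);   % channel index of each column
featOfCol = repmat(1:numFeatures,1,numChannels);  % feature index of each column
```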

Yes, you could combine them as you say (expanding the columns), but that would involve some manual work if you want to hack it into the hctsa architecture. The simple case of being channel-blind (i.e., just labeling all rows by class) could be done easily with TS_TopFeatures. That might give you an indication of how much signal is in your data.
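For the channel-blind case, a minimal sketch (assuming the class keywords were assigned at TS_Init time, and using the default settings of TS_TopFeatures):

```matlab
% Channel-blind: every row (one channel of one trial) is labeled only by its class,
% so features are ranked with all channels pooled together.
TS_LabelGroups('norm',{'classA','classB'});   % group rows by class keyword only
TS_TopFeatures('norm');                       % rank features for the two-class problem
```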
If you do go down the route of expanding the columns into all feature-channel pairs, you'll likely need to reduce the number of features (it will depend on the size of your dataset, but it is typically hard to constrain a learning problem with 8 × 7,000 ≈ 56,000 features!). There are many approaches to doing this: they could be data-driven on your own data (e.g., dimensionality reduction, or some hard clustering), or use an independent, high-dimensional dataset (e.g., the Empirical1000 set).
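As one data-driven example, you could run PCA on the stacked matrix; a rough sketch (Xstacked as in the earlier snippet, and the 90% variance threshold is an arbitrary choice):

```matlab
% Reduce the stacked feature-channel columns before feeding a classifier
Xz = zscore(Xstacked);                     % put all features on a common scale
[~,score,~,~,explained] = pca(Xz);         % principal components of the feature space
numPCs = find(cumsum(explained) >= 90,1);  % keep enough components for ~90% variance
Xreduced = score(:,1:numPCs);              % [numTrials x numPCs] classifier input
```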

Thank you, Ben, and sorry for the late reply! I'll think about it and report back here if I come up with a solution.
All the best.