angus924 / minirocket

MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

Feature Size

tdincer opened this issue

Thank you so much for making your work available! I have a quick question about the feature size. It looks like the minimum number of features is 84. Is there any harm in extracting 84 features and using only a subset of them?

Hi @tdincer, great question. There's not necessarily any harm in doing that, but I would just make a couple of observations:

  • The number 84 comes from the fact that we use a fixed set of 84 convolutional kernels. By default, we apply each of these kernels with multiple dilation values, and then produce multiple features for each kernel/dilation combination, which is why the default number of features (10,000) is a lot more than 84.
  • A priori, no particular kernel/dilation/etc. is any 'better' or 'worse' than any other (i.e., a random subset of features is likely to work roughly as well as any other random subset of features).
  • However, in general you will usually find that the more features you use, the better the method will work. In other words, while any particular kernel is no better or worse than any other (without knowing more about the dataset), the features tend only to work well in aggregate. (So, if I had to guess, I would expect that the method will probably be fairly weak when using a very small number of features).
  • The way the code currently works, if you set the number of features to be 84, that means you are only using a single dilation for each kernel (in particular, a dilation of 1), and only producing a single feature for each kernel. This may not be an effective way of generating a small number of features. You may be better off generating a lot more than that (e.g., 1,000 or even the default 10,000), and then subsampling from this larger set. That way at least you will get a mixture of kernels/dilations/etc in your subsample.
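
The subsampling suggested in the last point might look roughly like the sketch below. It is not code from the repository: `subsample_columns` is a hypothetical helper, and `X_features` is a random placeholder standing in for a larger feature matrix (here 1,000 columns) that would in practice come from the transform.

```python
import random


def subsample_columns(features, num_keep, seed=0):
    """Keep the same random subset of columns from every row of `features`.

    Hypothetical helper for illustration; `features` is a list of rows.
    """
    num_features = len(features[0])
    rng = random.Random(seed)
    # Sample column indices without replacement; sort to preserve column order.
    keep = sorted(rng.sample(range(num_features), num_keep))
    return keep, [[row[j] for j in keep] for row in features]


# Placeholder data: 5 examples x 1,000 features (random values, assumption only).
rng = random.Random(42)
X_features = [[rng.random() for _ in range(1000)] for _ in range(5)]

# Draw 84 columns from the larger set instead of generating only 84 features.
keep, X_small = subsample_columns(X_features, num_keep=84)
print(len(X_small), len(X_small[0]))
```

The returned `keep` indices should be reused for any held-out data, so that the training and test feature matrices contain the same columns.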

I hope that helps!

Let me know if you have any further questions.

@angus924 Thank you so much for the detailed answer! This is extremely helpful.