angus924 / minirocket

MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

minirocket_multivariate extremely slow

turmeric-blend opened this issue

commented

I am using a large dataset (10,000+) and pass the data into the model in batches. I do not cache the transformed data, so transform() runs every time I pass data into the model, on every epoch. I run this same setup for both

minirocket.py with input shape (32768,99) and

minirocket_multivariate.py with input shape (32768,1,99), so the number of channels is 1.

I find that the minirocket_multivariate.py version runs significantly slower than minirocket.py on every transform() call.

Is there a potential bug in the code?

Hi @turmeric-blend, good question. Yes, the multivariate implementation is quite a lot slower than the univariate implementation. (This is partly why, for now at least, they are separate; I have also separated MiniRocket and MiniRocketMultivariate in sktime.)

I am aware of the problem. Unfortunately, I haven't yet had the chance to work out whether there is a straightforward way to 'fix' it. There may be a way to avoid whatever is causing the problem by rearranging the operations a bit, or it might be that there is some issue with how numba handles the additional dimension.

Basically, if you have univariate data, just use the univariate implementation. If you have multivariate data, unfortunately the speed issue is unavoidable at the moment.
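For data that is stored with a singleton channel axis, as in the (32768,1,99) case above, the advice amounts to dropping that axis before choosing an implementation. The following is a minimal sketch, not code from the repository: the helper name `squeeze_single_channel` is hypothetical, and the `fit`/`transform` calls mentioned in the comments assume the function-style interface of `minirocket.py` and that it expects `np.float32` input.

```python
import numpy as np


def squeeze_single_channel(X):
    """Drop a singleton channel axis so univariate data can use the univariate code path.

    Hypothetical helper. Assumes X has shape (n_instances, n_channels, length)
    or (n_instances, length). Returns (array, is_univariate).
    """
    # float32 is assumed here because the MiniRocket scripts are written for it.
    X = np.asarray(X, dtype=np.float32)
    if X.ndim == 3 and X.shape[1] == 1:
        # (n, 1, length) -> (n, length); keep the result contiguous for numba.
        return np.ascontiguousarray(X[:, 0, :]), True
    return X, X.ndim == 2


# Small demonstration with random data standing in for one batch.
X = np.random.randn(8, 1, 99)
X_uni, is_uni = squeeze_single_channel(X)
print(X_uni.shape, is_uni)

# Assumed usage with the repository's scripts (not executed here):
#   if is_uni:
#       from minirocket import fit, transform
#   else:
#       from minirocket_multivariate import fit, transform
#   parameters = fit(X_uni)
#   features = transform(X_uni, parameters)
```

Data with more than one channel passes through unchanged and still needs the multivariate implementation, so this only helps in the single-channel case described in the question.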

There is a GPU implementation coming (this is not my work), which will most likely be a lot faster for most multivariate datasets, at least until I can work out how to fix this issue with the CPU implementation.

commented

ok, thanks for letting me know.