minirocket_multivariate extremely slow

Question

turmeric-blend opened this issue 3 years ago

My setup: I am using a large dataset (10,000+) and I pass the data to the model in batches. I do not cache the transformed data, so transform() runs every time a batch is passed to the model, on every epoch. I run this same setup for both

minirocket.py with input shape (32768,99) and

minirocket_multivariate.py with input shape (32768,1,99), so the number of channels is 1.

I find that the minirocket_multivariate.py version is significantly slower on every transform() call than minirocket.py.

Is there a potential bug in the code?

angus924 · Answer 1 · Fri Apr 16 2021 10:02:09 GMT+0800 (China Standard Time)

Hi @turmeric-blend, good question. Yes, the multivariate implementation is quite a lot slower than the univariate implementation. (This is partly why, for now at least, they are separate. I have also separated MiniRocket and MiniRocketMultivariate in sktime.)

I am aware of the problem. Unfortunately, I haven't had the chance yet to work out whether there is a straightforward way to 'fix' it. There may be a way to avoid whatever is causing the problem by rearranging the operations a bit, or it might be that there is some issue with how numba handles the additional dimension.

Basically, if you have univariate data, just use the univariate implementation. If you have multivariate data, the speed issue is unfortunately unavoidable at the moment.
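
For the single-channel case in the question, one way to follow this advice is to drop the channel axis before calling the univariate code. The following is only a sketch: `to_univariate` is a hypothetical helper (not part of minirocket.py), and the cast to float32 assumes that is the dtype the univariate transform expects.

```python
import numpy as np


def to_univariate(X):
    """Convert (n_examples, 1, length) to (n_examples, length).

    Hypothetical helper for illustration; not part of minirocket.py.
    """
    X = np.asarray(X)
    if X.ndim == 3:
        if X.shape[1] != 1:
            # More than one channel is genuinely multivariate data.
            raise ValueError(f"expected exactly 1 channel, got {X.shape[1]}")
        X = X[:, 0, :]  # drop the singleton channel axis
    elif X.ndim != 2:
        raise ValueError(f"expected a 2-D or 3-D array, got {X.ndim}-D")
    # Assumption: the univariate transform wants contiguous float32 input.
    return np.ascontiguousarray(X, dtype=np.float32)


# Small stand-in for the (32768, 1, 99) batch described in the question.
X_multi = np.random.default_rng(0).standard_normal((8, 1, 99))
X_uni = to_univariate(X_multi)
print(X_uni.shape, X_uni.dtype)
```

The resulting 2-D array can then be passed to the univariate implementation in place of the 3-D one. Data with more than one channel is rejected rather than silently flattened, since that case still needs the multivariate code.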

There is a GPU implementation coming (this is not my work), which will most likely be a lot faster for most multivariate datasets, at least until I can work out how to fix this issue with the CPU implementation.

tu · Answer 2 · Fri Apr 16 2021 10:11:19 GMT+0800 (China Standard Time)

ok, thanks for letting me know.