minirocket_multivariate extremely slow
turmeric-blend opened this issue · comments
My setup is that I am using a large dataset (10,000+) and I pass the data into the model in batches. I do not cache the data, so I run transform every time I pass data into the model, on every epoch. I run this same setup for both `minirocket.py`, with input shape (32768, 99), and `minirocket_multivariate.py`, with input shape (32768, 1, 99), so the number of channels is 1. I find that the `minirocket_multivariate.py` version runs significantly slower on every `transform()` call relative to `minirocket.py`.
Is there a potential bug in the code?
Hi @turmeric-blend, good question. Yes, the multivariate implementation is quite a lot slower than the univariate implementation. (This is partly why, for now at least, they are separate. I have also separated MiniRocket and MiniRocketMultivariate in sktime.)
I am aware of the problem. Unfortunately, I haven't had the chance yet to work out whether there is a straightforward way to 'fix' it. There may be a way to avoid whatever is causing the problem by rearranging the operations a bit, or it might be that there is some issue with how numba handles the additional dimension.
Basically, if you have univariate data, just use the univariate implementation. If you have multivariate data, unfortunately the speed issue is unavoidable at the moment.
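For data that is stored with a single channel, such as the (32768, 1, 99) input described above, one way to follow this advice is to drop the channel axis before calling the univariate `transform()`. The snippet below is only a minimal sketch using NumPy: `drop_single_channel` is a hypothetical helper name, not part of either module, and it assumes the array layout is (instances, channels, length).

```python
import numpy as np


def drop_single_channel(X):
    """Return a 2-D (instances, length) array from single-channel input.

    Hypothetical helper, not part of minirocket.py or
    minirocket_multivariate.py. Assumes the 3-D layout is
    (instances, channels, length).
    """
    X = np.asarray(X)
    if X.ndim == 2:
        # Already in the univariate layout; nothing to do.
        return X
    if X.ndim == 3 and X.shape[1] == 1:
        # Basic indexing on the channel axis returns a view, not a copy.
        return X[:, 0, :]
    raise ValueError(f"expected a single channel, got shape {X.shape}")


# Small stand-in for one batch of the (32768, 1, 99) input.
batch = np.zeros((8, 1, 99), dtype=np.float32)
flat = drop_single_channel(batch)
print(flat.shape)
```

The resulting 2-D array can then be passed to the univariate code path in place of the 3-D one. Genuinely multivariate input (more than one channel) is rejected rather than silently flattened.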
There is a GPU implementation coming (this is not my work), which will most likely be a lot faster for most multivariate datasets, at least until I can work out how to fix this issue with the CPU implementation.
ok, thanks for letting me know.