SciML / ExponentialUtilities.jl

Fast and differentiable implementations of matrix exponentials, Krylov exponential matrix-vector multiplications ("expmv"), KIOPS, ExpoKit functions, and more. All your exponential needs in SciML form.

Home Page: https://docs.sciml.ai/ExponentialUtilities/stable/

`phiv_timestep` performance problems for scalar equations

ArneBouillon opened this issue · comments

Describe the bug 🐞

I consider this a bug since the performance difference is so severe, but I could move this to a feature request if desired.

In general, functions like phiv_timestep and expv_timestep are very fast for moderately sized matrices. However, I'm seeing an orders-of-magnitude degradation, both in runtime and in allocations, when the matrix has size 1.

It is possible for applications to circumvent this by manually implementing the functionality for scalar equations, but it would clearly be more convenient if the performance of these methods for scalar problems could be improved.

Expected behavior

Size-1 equations should be as fast as, or even faster than, larger ones.

Minimal Reproducible Example πŸ‘‡

using ExponentialUtilities
function time(N)
    A = randn(N, N)
    b = randn(N, 1)
    @time phiv_timestep(1, A, b);
end

time(1)   # Warm up
time(1)   # 3.109664 seconds (18.85 M allocations: 18.841 GiB, 7.80% gc time)
time(2)   # 0.003811 seconds (3.81 k allocations: 406.266 KiB)
time(100) # 0.002961 seconds (261 allocations: 543.359 KiB)
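
For a timing that excludes compilation more systematically, the same comparison can be run with BenchmarkTools (not part of the original report, just an alternative way to measure):

using BenchmarkTools, ExponentialUtilities

for N in (1, 2, 100)
    A = randn(N, N)
    b = randn(N, 1)
    print("N = ", N, ": ")
    @btime phiv_timestep(1, $A, $b)   # interpolate locals so global lookups aren't benchmarked
end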

Environment (please complete the following information):

  • Output of using Pkg; Pkg.status()
[d4d017d3] ExponentialUtilities v1.25.0
  • Output of versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

Yes, I think a fallback to scalar operations on size 1 would be appropriate. It would be good to make it fast. I don't think anyone looked into that case. It's rather surprising to me that it's giving something so different. There's an allocating generic matmul, and that definitely doesn't need to exist in the size 1 case.
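
For reference, here is a minimal sketch of what such a scalar fallback could look like (a hypothetical helper, not the package's API), assuming phiv_timestep evaluates u(t) = phi_0(tA)b_0 + t*phi_1(tA)b_1 + ... + t^p*phi_p(tA)b_p with B = [b_0 b_1 ... b_p]:

# Hypothetical scalar fallback: for a scalar a, the phi functions can be
# evaluated directly via phi_0(z) = exp(z) and phi_k(z) = (phi_{k-1}(z) - 1/(k-1)!) / z.
function phiv_timestep_scalar(t, a::Number, b::AbstractVector)
    p = length(b) - 1
    z = t * a
    phis = zeros(typeof(float(z)), p + 1)
    phis[1] = exp(z)
    for k in 1:p
        # NOTE: this recurrence loses accuracy for |z| << 1; a careful
        # implementation would switch to a truncated Taylor series there.
        phis[k + 1] = (phis[k] - 1 / factorial(k - 1)) / z
    end
    return sum(t^i * phis[i + 1] * b[i + 1] for i in 0:p)
end

A size-1 dispatch could then forward phiv_timestep(t, A, B) to something like phiv_timestep_scalar(t, A[1, 1], vec(B)) whenever size(A, 1) == 1.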

There has to be some unforeseen effect in the generic phiv_timestep path when using size-1 matrices. It's likely stuck in a loop somewhere for a long time, since it does 18 million allocations totaling many gigabytes. Perhaps some stopping criterion somewhere was not designed with size-1 in mind, and struggles to be satisfied? Or maybe there is wrap-around somewhere.

Update: The problem seems to stem from the default calculation of tau. For size-1 matrices, the m parameter is set to 1 as well, which gives a tau on the order of 1e-7 on this line. With t_end = 1, that implies on the order of 10^7 substeps, which would be consistent with the ~18.85 M allocations observed above.

This tau calculation is derived from Equation 17 in this paper, but seems to have been incorrectly ported to the code. The paper uses the variable m_ave, which "is the average of the input and maximum allowed size of the Krylov subspace", while the code uses min(10, size(A, 1)). Something similar is going on in Expokit.jl, which uses min(30, size(A, 1)).

The calculation of tau seems very sensitive to this m value. For size 1, the computed tau is absolutely tiny, and for m around 2-4 it is still quite small; only after that does it recover to more sensible values (see the snippet below). However, I am not sure what "the maximum allowed size of the Krylov subspace" refers to exactly. @ChrisRackauckas do you have more insight into where this default value for tau comes from, and why both ExponentialUtilities.jl and Expokit.jl seem to use a different quantity for m than the paper suggests?
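
As a standalone illustration of that sensitivity (an approximation in the style of Expokit's expv, not the package's exact code): with an initial step size of the form tau = (1/||A||) * (tol * ((m+1)/e)^(m+1) * sqrt(2*pi*(m+1)) / (4*||b||*||A||))^(1/m), evaluating it for a few values of m shows how it collapses at small m.

# Illustration of how an Expokit-style initial tau depends on m.
# Anorm and beta stand in for the norms of A and b; the values are placeholders.
tau_init(m; tol = 1e-7, Anorm = 1.0, beta = 1.0) =
    (1 / Anorm) * ((tol * ((m + 1) / ℯ)^(m + 1) * sqrt(2π * (m + 1))) /
                   (4 * beta * Anorm))^(1 / m)

for m in (1, 2, 4, 10, 30)
    println("m = ", m, "  =>  tau ≈ ", tau_init(m))
end

With these placeholder norms and tolerance, m = 1 gives a tau on the order of 1e-8, m = 2 on the order of 1e-4, and m >= 10 gives a tau of order one or larger.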

We might've gotten it from Expokit. It's worth trying the paper's form instead.

What would you then take to be "the maximum allowed size of the Krylov subspace"?