SciML / ExponentialUtilities.jl

Fast and differentiable implementations of matrix exponentials, Krylov exponential matrix-vector multiplications ("expmv"), KIOPS, ExpoKit functions, and more. All your exponential needs in SciML form.

Home Page: https://docs.sciml.ai/ExponentialUtilities/stable/

`phiv_timestep` performance problems for scalar equations

ArneBouillon opened this issue · comments

Describe the bug 🐞

I consider this a bug since the performance difference is so severe, but I could move this to a feature request if desired.

In general, functions like phiv_timestep and expv_timestep are very fast for moderately sized matrices. However, I'm seeing an orders-of-magnitude degradation, both in runtime and in allocations, when the matrix has size 1.

It is possible for applications to circumvent this by manually implementing the functionality for scalar equations, but it would clearly be more convenient if the performance of these methods for scalar problems could be improved.

Expected behavior

Size-1 equations should be as fast as, or even faster than, larger ones.

Minimal Reproducible Example πŸ‘‡

using ExponentialUtilities
function time(N)
    A = randn(N, N)
    b = randn(N, 1)
    @time phiv_timestep(1, A, b);
end

time(1)   # Warm up
time(1)   # 3.109664 seconds (18.85 M allocations: 18.841 GiB, 7.80% gc time)
time(2)   # 0.003811 seconds (3.81 k allocations: 406.266 KiB)
time(100) # 0.002961 seconds (261 allocations: 543.359 KiB)
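
For a timing that excludes compilation more systematically, the same comparison can be run with BenchmarkTools (not part of the original report, just an alternative way to measure):

using BenchmarkTools, ExponentialUtilities

for N in (1, 2, 100)
    A = randn(N, N)
    b = randn(N, 1)
    print("N = ", N, ": ")
    @btime phiv_timestep(1, $A, $b)   # interpolate locals so global lookups aren't benchmarked
end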

Environment (please complete the following information):

  • Output of using Pkg; Pkg.status()
[d4d017d3] ExponentialUtilities v1.25.0
  • Output of versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

Yes, I think a fallback to scalar operations on size 1 would be appropriate. It would be good to make it fast. I don't think anyone looked into that case. It's rather surprising to me that it's giving something so different. There's an allocating generic matmul, and that definitely doesn't need to exist in the size 1 case.
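
For reference, here is a minimal sketch of what such a scalar fallback could look like (a hypothetical helper, not the package's API), assuming phiv_timestep evaluates u(t) = phi_0(tA)b_0 + t*phi_1(tA)b_1 + ... + t^p*phi_p(tA)b_p with B = [b_0 b_1 ... b_p]:

# Hypothetical scalar fallback: for a scalar a, the phi functions can be
# evaluated directly via phi_0(z) = exp(z) and phi_k(z) = (phi_{k-1}(z) - 1/(k-1)!) / z.
function phiv_timestep_scalar(t, a::Number, b::AbstractVector)
    p = length(b) - 1
    z = t * a
    phis = zeros(typeof(float(z)), p + 1)
    phis[1] = exp(z)
    for k in 1:p
        # NOTE: this recurrence loses accuracy for |z| << 1; a careful
        # implementation would switch to a truncated Taylor series there.
        phis[k + 1] = (phis[k] - 1 / factorial(k - 1)) / z
    end
    return sum(t^i * phis[i + 1] * b[i + 1] for i in 0:p)
end

A size-1 dispatch could then forward phiv_timestep(t, A, B) to something like phiv_timestep_scalar(t, A[1, 1], vec(B)) whenever size(A, 1) == 1.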

There has to be some unforeseen effect in the generic phiv_timestep path when using size-1 matrices. It's likely stuck in a loop somewhere for a long time, since it does 18 million allocations totaling many gigabytes. Perhaps some stopping criterion somewhere was not designed with size-1 in mind, and struggles to be satisfied? Or maybe there is wrap-around somewhere.

Update: The problem seems to stem from the default calculation of tau. For size-1 matrices, the m parameter is set to 1 as well, which gives a tau on the order of 1e-7 on this line. With t_end = 1, that implies on the order of 10^7 substeps, which would be consistent with the ~18.85 M allocations observed above.

This tau calculation is derived from Equation 17 in this paper, but seems to have been incorrectly ported to the code. The paper uses the variable m_ave, which "is the average of the input and maximum allowed size of the Krylov subspace", while the code uses min(10, size(A, 1)). Something similar is going on in Expokit.jl, which uses min(30, size(A, 1)).

The calculation of tau seems very sensitive to this m value. For size 1, the computed tau is absolutely tiny, and for m around 2-4 it is still quite small; only after that does it recover to more sensible values (see the snippet below). However, I am not sure what "the maximum allowed size of the Krylov subspace" refers to exactly. @ChrisRackauckas do you have more insight into where this default value for tau comes from, and why both ExponentialUtilities.jl and Expokit.jl seem to use a different quantity for m than the paper suggests?
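
As a standalone illustration of that sensitivity (an approximation in the style of Expokit's expv, not the package's exact code): with an initial step size of the form tau = (1/||A||) * (tol * ((m+1)/e)^(m+1) * sqrt(2*pi*(m+1)) / (4*||b||*||A||))^(1/m), evaluating it for a few values of m shows how it collapses at small m.

# Illustration of how an Expokit-style initial tau depends on m.
# Anorm and beta stand in for the norms of A and b; the values are placeholders.
tau_init(m; tol = 1e-7, Anorm = 1.0, beta = 1.0) =
    (1 / Anorm) * ((tol * ((m + 1) / ℯ)^(m + 1) * sqrt(2π * (m + 1))) /
                   (4 * beta * Anorm))^(1 / m)

for m in (1, 2, 4, 10, 30)
    println("m = ", m, "  =>  tau ≈ ", tau_init(m))
end

With these placeholder norms and tolerance, m = 1 gives a tau on the order of 1e-8, m = 2 on the order of 1e-4, and m >= 10 gives a tau of order one or larger.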

We might've gotten it from Expokit. It's worth trying the paper's form instead.

What would you then take to be "the maximum allowed size of the Krylov subspace"?