[CPU] cholespy very slow compared to scikit-sparse (factor ~15)

Question

EmJay276 opened this issue a year ago

Hi, it's really nice to have a sparse Cholesky solver that is compatible with Windows out of the box!

Could you please check that I'm using it correctly? The runtime is about 15 times longer than with scikit-sparse.

I am not using any TPU / GPU, just plain CPU with numpy / scipy.

I use a lower-triangular sparse matrix K_iso in CSC format (I also tested COO, with the same results) and a sparse load vector f_csc:

K_iso
<39624x39624 sparse array of type '<class 'numpy.float64'>'
	with 848667 stored elements in Compressed Sparse Column format>
f_csc
<39624x1 sparse array of type '<class 'numpy.float64'>'
	with 3033 stored elements in Compressed Sparse Column format>

The scikit-sparse run takes 0.82 s:

from timeit import default_timer
from sksparse.cholmod import cholesky

start_time = default_timer()
factor = cholesky(K_iso)
u_iso = factor.solve_A(f_csc)
print(f"Done ({default_timer() - start_time:.2f} s)")

# Done (0.82 s)

The cholespy run (double precision) takes 13.19 s, of which constructing CholeskySolverD takes almost all of the time (13.18 s):

from timeit import default_timer
import numpy as np
from cholespy import CholeskySolverD, MatrixType

x = np.empty(K_iso.shape[0])
f = f_csc.todense().squeeze()

start_time = default_timer()
solver = CholeskySolverD(K_iso.shape[0], K_iso.indptr, K_iso.indices, K_iso.data, MatrixType.CSC)
solver.solve(f, x)
print(f"Done ({default_timer() - start_time:.2f} s)")

# Done (13.19 s)

Both solvers give the same result (within the default np.allclose tolerances):

np.allclose(x, u_iso.todense().squeeze())

# True
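To see which phase dominates, the solver construction (where the factorization appears to happen, given the 13.18 s reported above) and the solve can be timed separately. This is a minimal sketch: `timed` is a hypothetical helper, not part of cholespy or scikit-sparse, and the commented usage assumes the `K_iso`, `f` and `x` variables from the snippets above.

```python
from timeit import default_timer


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = default_timer()
    result = fn(*args, **kwargs)
    return result, default_timer() - start


# Usage with the objects from this issue (commented out: requires cholespy
# and the K_iso, f and x variables defined earlier):
# solver, t_factor = timed(CholeskySolverD, K_iso.shape[0], K_iso.indptr,
#                          K_iso.indices, K_iso.data, MatrixType.CSC)
# _, t_solve = timed(solver.solve, f, x)
# print(f"factorization: {t_factor:.2f} s, solve: {t_solve:.2f} s")
```

The same helper can wrap `cholesky(K_iso)` and `factor.solve_A(f_csc)` so that both libraries are compared phase by phase rather than as a single total.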

Baptiste Nicolet · Answer 1 · Tue Mar 07 2023 17:49:03 GMT+0800 (China Standard Time)

Hi,

Could you please provide the matrix you use (or at least its dimensions) so I can try to reproduce this on my end?

Michael Jäger · Answer 2 · Tue Mar 07 2023 18:34:55 GMT+0800 (China Standard Time)

Hi,

The shape is 39624 x 39624 (it's a small FEM stiffness matrix). I also have larger matrices (>150k) for which scikit-sparse takes ~10-20 s to solve.

The matrices are attached in SciPy's npz format:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.load_npz.html

import scipy.sparse

K_iso = scipy.sparse.load_npz('K_iso.npz')
f_csc = scipy.sparse.load_npz('f_csc.npz')

matrices.zip

Michael Jäger · Answer 3 · Tue Mar 21 2023 16:53:35 GMT+0800 (China Standard Time)

@bathal1 could you reproduce the issue?

Baptiste Nicolet · Answer 4 · Tue Mar 21 2023 22:27:26 GMT+0800 (China Standard Time)

This is caused by the factorization type used by cholespy (simplicial). It seems that the matrix in your example benefits from using a supernodal factorization.

I am unsure what the proper fix would be to automatically pick the most efficient one. In the meantime, I suggest you clone the repo and comment out this line. At least in the example you provided, factorization is then much faster.

To build cholespy locally, clone the repo and then run pip install ./cholespy
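The simplicial-versus-supernodal explanation above can be checked independently of cholespy, because scikit-sparse's `cholesky` accepts a `mode` argument (`"auto"`, `"simplicial"` or `"supernodal"`). The sketch below times a factorization callable once per mode. `time_factorization_modes` is a hypothetical helper written for illustration, and the commented usage assumes `K_iso` is loaded as shown earlier in the thread.

```python
from timeit import default_timer


def time_factorization_modes(factorize, matrix, modes=("simplicial", "supernodal")):
    """Run factorize(matrix, mode=m) for each mode and return {mode: seconds}."""
    timings = {}
    for mode in modes:
        start = default_timer()
        factorize(matrix, mode=mode)  # the factor itself is discarded, only time is kept
        timings[mode] = default_timer() - start
    return timings


# Usage (commented out: requires scikit-sparse and the K_iso matrix):
# from sksparse.cholmod import cholesky
# for mode, seconds in time_factorization_modes(cholesky, K_iso).items():
#     print(f"{mode}: {seconds:.2f} s")
```

If the simplicial timing is much larger than the supernodal one for this matrix, that would support the diagnosis that the factorization type, rather than the way cholespy is called, accounts for the gap.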