pydata / sparse

Sparse multi-dimensional arrays for the PyData ecosystem

Home Page: https://sparse.pydata.org

Difference in sparse.einsum and np.einsum behaviour

kumgleb opened this issue

First of all thanks for einsum implementation in sparse!

I ran into a problem using it in my project. The problem is as follows:

  • I need to call einsum without a subscripts string, passing only operands and index lists. Here is a simple example in numpy:
import numpy as np

v1 = [[1, 2], [3, 4]]
v2 = [0, 1]

values = np.einsum(v1, v2)
print(values)
>>> array([[1, 2],
           [3, 4]])
  • If I try the same in sparse I get an error:
import numpy as np
import sparse

v1 = sparse.COO(np.array([[1, 2], [3, 4]]))
v2 = sparse.COO(np.array([0, 1]))

values = sparse.einsum(v1, v2)
values
>>> UFuncTypeError: ufunc 'equal' did not contain a loop with signature matching types (<class 'numpy.dtype[int64]'>, <class 'numpy.dtype[str_]'>) -> None
  • But if I provide subscripts it works fine:
v1 = sparse.COO(np.array([[1, 2], [3, 4]]))
v2 = sparse.COO(np.array([0, 1]))

values = sparse.einsum('ab->ab', v1, v2)
print(values.todense())
>>> array([[1, 2],
           [3, 4]])

So my question is: is it possible to avoid subscripts in sparse.einsum in the same way as in numpy.einsum?

sparse version 0.14.0
numpy version 1.23.1

I believe it isn't possible, as of now. Is it required for your use-case?

@hameerabbasi
Yes, I'm trying to use it for the sparse factor product in an MRF, and the most convenient way is to provide only operands and indices. Here is an example in numpy:
https://github.com/pgmpy/pgmpy/blob/d1131144af2a2a4b3ef341613170538c808194aa/pgmpy/factors/discrete/DiscreteFactor.py#L704

I am trying to implement the same thing, but with sparse matrices.

I think there are maybe two questions here:

  • implicit output: einsum('ab', x) - i.e. the output is not given but automatically computed, this is handled already
  • interleaved format: einsum(array_0, indices_0, array_1, indices_1, ...[, output_indices]), this is not supported yet, but is just a matter of parsing that could be lifted from e.g. opt_einsum.
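To make the distinction between the two call styles concrete, here is a minimal sketch using plain numpy, which accepts both. The arrays `x` and `y` are made-up illustration data, not taken from the issue.

```python
import numpy as np

# Illustrative operands (assumed for this sketch).
x = np.array([[1, 2], [3, 4]])
y = np.array([10, 20])

# String form with implicit output: there is no "->", so the output indices
# are the ones that appear exactly once, in alphabetical order ("a" here).
implicit = np.einsum("ab,b", x, y)

# String form with explicit output.
explicit = np.einsum("ab,b->a", x, y)

# Interleaved form: operands alternate with integer index lists,
# optionally followed by the output index list.
interleaved = np.einsum(x, [0, 1], y, [1], [0])

print(implicit, explicit, interleaved)
```

All three calls describe the same matrix-vector contraction; only the way the indices are spelled differs.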

Note that einsum('ab->ab', v1, v2) is not actually valid! The subscripts contain only one term, so there should be only one operand, and that last call should probably error, as it does with numpy...
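As a quick check of the numpy behaviour mentioned here, the sketch below passes two operands with a single-term subscripts string. It assumes numpy reports the mismatch as a ValueError, and the exact message text may differ between versions.

```python
import numpy as np

v1 = np.array([[1, 2], [3, 4]])
v2 = np.array([0, 1])

try:
    # One term in the subscripts string, but two operands.
    np.einsum("ab->ab", v1, v2)
    raised = False
except ValueError as exc:
    raised = True
    print("numpy rejected the call:", exc)
```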

Yeah, it seems that the convert_interleaved_input function from opt_einsum may help to solve the problem.
@jcmgray @hameerabbasi big thanks to you guys for a quick response!
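For readers who want to see what such parsing involves, here is a rough sketch of converting interleaved arguments into a subscripts string. The helper `interleaved_to_subscripts` is hypothetical and written for this illustration; it is not opt_einsum's implementation, and it handles only integer indices 0 to 25 with no Ellipsis support.

```python
import numpy as np


def interleaved_to_subscripts(*args):
    """Hypothetical helper: turn (op0, idx0, op1, idx1, ..., [out_idx])
    into a (subscripts, operands) pair for the string form of einsum."""
    args = list(args)
    # An odd number of arguments means a trailing output index list.
    output = args.pop() if len(args) % 2 == 1 else None
    operands = args[0::2]
    index_lists = args[1::2]

    def to_letters(indices):
        # Map integer index i to the i-th lowercase letter.
        return "".join(chr(ord("a") + i) for i in indices)

    subscripts = ",".join(to_letters(ix) for ix in index_lists)
    if output is not None:
        subscripts += "->" + to_letters(output)
    return subscripts, operands


# Illustrative operands (assumed for this sketch).
x = np.array([[1, 2], [3, 4]])
y = np.array([10, 20])

subs, ops = interleaved_to_subscripts(x, [0, 1], y, [1], [0])
result = np.einsum(subs, *ops)
print(subs, result)
```

The resulting string and operands could then presumably be handed to any einsum implementation that accepts the string form, including one for sparse arrays.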

By the way, you could use opt_einsum.contract as a drop-in replacement here. It should work with sparse, handle the interleaved input, and also optimize the contraction order (for 3+ terms).

Actually, since #579 your case works and the interleaved format is supported. However, you need to update sparse to the latest development version from GitHub.

@HadrienNU thanks for the reply. I already managed to do it with the parsing from opt_einsum, but using it directly from the library will be more convenient.

Ah sorry I missed this PR, thanks for adding the additional parsing!

Since the question has been answered and the required feature is in master, I'm going to close this issue.