Difference in sparse.einsum and np.einsum behaviour

Question

kumgleb opened this issue a year ago

First of all, thanks for the einsum implementation in sparse!

I ran into a problem using it in my project:

I need to call einsum without a subscripts string, passing the operand followed by its list of axis indices. Here is a simple example in numpy:

import numpy as np

v1 = [[1, 2], [3, 4]]
v2 = [0, 1]  # axis labels for v1, not a second operand

values = np.einsum(v1, v2)
values
>>> array([[1, 2],
           [3, 4]])

If I try the same in sparse I get an error:

v1 = sparse.COO(np.array([[1, 2], [3, 4]]))
v2 = sparse.COO(np.array([0, 1]))

values = sparse.einsum(v1, v2)
values
>>> UFuncTypeError: ufunc 'equal' did not contain a loop with signature matching types (<class 'numpy.dtype[int64]'>, <class 'numpy.dtype[str_]'>) -> None

But if I provide subscripts it works fine:

v1 = sparse.COO(np.array([[1, 2], [3, 4]]))
v2 = sparse.COO(np.array([0, 1]))

values = sparse.einsum('ab->ab', v1, v2)
print(values.todense())
>>> array([[1, 2],
           [3, 4]])

So my question is: is it possible to avoid subscripts in `sparse.einsum` in the same way as in `numpy.einsum`?

sparse version 0.14.0
numpy version 1.23.1
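
For reference, a minimal sketch of how the interleaved call in the numpy example relates to the subscripts-string form. It uses only `np.einsum`; the variable names are illustrative, and the mapping of integer labels to letters is shown just to make the correspondence visible.

```python
import numpy as np

v1 = np.array([[1, 2], [3, 4]])

# Interleaved (sublist) form: the operand is followed by a list of
# integer axis labels, one per dimension of the operand.
interleaved = np.einsum(v1, [0, 1])

# Corresponding subscripts-string form: label 0 -> 'a', label 1 -> 'b'.
explicit = np.einsum('ab->ab', v1)

# An output sublist may be appended to the interleaved form.
# Here it swaps the two axes, i.e. transposes the result.
transposed = np.einsum(v1, [0, 1], [1, 0])
```

Note that in this form `[0, 1]` is a plain Python list of labels, so it would not be wrapped in an array type.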

Hameer Abbasi · Answer 1 · Wed Mar 01 2023 13:12:42 GMT+0800 (China Standard Time)

Cc @HadrienNU @jcmgray

Hameer Abbasi · Answer 2 · Wed Mar 01 2023 13:13:28 GMT+0800 (China Standard Time)

I believe it isn't possible, as of now. Is it required for your use-case?

Gleb Kumichev · Answer 3 · Wed Mar 01 2023 13:31:23 GMT+0800 (China Standard Time)

@hameerabbasi
Yes, I'm trying to use it for the sparse factor product in a Markov random field (MRF), and the most convenient way is to provide only the operands and indices. Here is an example in numpy:
https://github.com/pgmpy/pgmpy/blob/d1131144af2a2a4b3ef341613170538c808194aa/pgmpy/factors/discrete/DiscreteFactor.py#L704

I am trying to implement the same thing, but with sparse matrices.

Johnnie Gray · Answer 4 · Wed Mar 01 2023 13:39:38 GMT+0800 (China Standard Time)

I think there are maybe two questions here:

1. Implicit output: `einsum('ab', x)`, i.e. the output is not given but is computed automatically. This is already handled.
2. Interleaved format: `einsum(array_0, indices_0, array_1, indices_1, ...[, output_indices])`. This is not supported yet, but it is just a matter of parsing that could be lifted from e.g. opt_einsum.

Note that `einsum('ab->ab', v1, v2)` is not actually valid! The subscripts contain only one term, so there should be only one operand, and that last call should probably raise an error, as it does with numpy...
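
A minimal sketch of the kind of parsing meant for the interleaved format, in pure Python. The function name `interleaved_to_subscripts` is hypothetical and this is not the opt_einsum or sparse implementation; it assumes integer labels in the range 0 to 25 and no ellipsis support.

```python
import string


def interleaved_to_subscripts(*args):
    """Turn (op0, idx0, op1, idx1, ..., [out_idx]) into (subscripts, operands).

    Hypothetical helper for illustration only.
    """
    args = list(args)
    # An odd number of arguments means the last one is the output index list.
    output = args.pop() if len(args) % 2 == 1 else None
    operands = args[0::2]
    index_lists = args[1::2]

    def to_letters(indices):
        # Map integer labels to lowercase letters: 0 -> 'a', 1 -> 'b', ...
        letters = []
        for i in indices:
            if not 0 <= i < 26:
                raise ValueError(f"index label {i} is outside the supported range 0..25")
            letters.append(string.ascii_lowercase[i])
        return "".join(letters)

    subscripts = ",".join(to_letters(ix) for ix in index_lists)
    if output is not None:
        # Explicit output: append it after '->'.
        subscripts += "->" + to_letters(output)
    return subscripts, operands


# Placeholder strings stand in for arrays; the helper never inspects operands.
subs, ops = interleaved_to_subscripts("x", [0, 1], "y", [1, 2], [0, 2])
subs_implicit, _ = interleaved_to_subscripts("x", [0, 1])
```

The resulting string and operand list could then be passed to an einsum that only accepts the subscripts form, e.g. `einsum(subs, *ops)`. When no output list is given, the string has no `->`, so the usual implicit-output rules of the einsum being called apply.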

Gleb Kumichev · Answer 5 · Wed Mar 01 2023 15:02:06 GMT+0800 (China Standard Time)

Yeah, it seems that the `convert_interleaved_input` function from opt_einsum may help to solve the problem.
@jcmgray @hameerabbasi big thanks to you guys for the quick response!

Johnnie Gray · Answer 6 · Wed Mar 01 2023 15:11:40 GMT+0800 (China Standard Time)

By the way, you could use `opt_einsum.contract` as a drop-in replacement here. It should work with sparse, handle the interleaved input, and also optimize the contraction order (for 3+ terms).

Hadrien · Answer 7 · Wed Mar 01 2023 18:50:16 GMT+0800 (China Standard Time)

Actually, since #579 your case works and the interleaved format is supported. However, you need to update sparse to the latest development version from GitHub.

Gleb Kumichev · Answer 8 · Wed Mar 01 2023 19:02:23 GMT+0800 (China Standard Time)

@HadrienNU thanks for the reply. I already managed to do it with the parsing from opt_einsum, but using it directly from the library will be more convenient.

Johnnie Gray · Answer 9 · Thu Mar 02 2023 10:05:38 GMT+0800 (China Standard Time)

> Actually, since #579 your case is working and the interleaved format is supported. However, you need to update sparse to the last development version from github.

Ah sorry I missed this PR, thanks for adding the additional parsing!

Hameer Abbasi · Answer 10 · Sat Mar 04 2023 21:02:33 GMT+0800 (China Standard Time)

Since the question has been answered and the required feature is in master, I'm going to close this issue.
