pydata / sparse

Sparse multi-dimensional arrays for the PyData ecosystem

Home Page: https://sparse.pydata.org

Avoid Unnecessary Transpose?

smldub opened this issue · comments

Proposed edit to sparse/_common.py to improve tensordot speed for _dot_ndarray_coo:
remove line 492:
b = b.T

change lines 1147 and 1148 to:
oidx2 = coords2[1, didx2]
out[oidx1, oidx2] += array1[oidx1, coords2[0, didx2]] * data2[didx2]

This avoids the creation of a new COO matrix during the transpose process, which can be time consuming.
Does this break any normal usage of _dot_ndarray_coo? So far I have had no issues.
Possibly this change could be added to other _dot* functions to avoid creating extraneous COO matrices?

Sorry, I don't know how to fork stuff quite yet, but I'll read up and try to fork if no one can think of a major issue with this change.
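
For readers following the indexing argument, here is a minimal sketch in plain NumPy of a dense-times-COO product that reads the sparse operand's coordinates directly instead of transposing it first. It is not the numba kernel from sparse/_common.py: the function name dot_ndarray_coo_sketch and the test matrices are made up for illustration, and it assumes coords2 is a (2, nnz) array of (row, column) indices with the matching values in data2.

```python
import numpy as np


def dot_ndarray_coo_sketch(array1, coords2, data2, out_shape):
    """Dense (m, k) times COO (k, n), with no transpose of the sparse operand.

    Hypothetical helper for illustration only; not part of pydata/sparse.
    """
    out = np.zeros(out_shape, dtype=np.result_type(array1, data2))
    rows2 = coords2[0]  # contracted index (row of the sparse operand)
    cols2 = coords2[1]  # output column index
    for didx2 in range(len(data2)):
        # Each stored value contributes to one output column.
        out[:, cols2[didx2]] += array1[:, rows2[didx2]] * data2[didx2]
    return out


rng = np.random.default_rng(0)
array1 = rng.random((3, 4))
dense2 = np.array(
    [[0.0, 2.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 3.0], [0.0, 4.0, 0.0]]
)
coords2 = np.array(np.nonzero(dense2))  # shape (2, nnz)
data2 = dense2[tuple(coords2)]          # values in the same order as coords2

result = dot_ndarray_coo_sketch(array1, coords2, data2, (3, 3))
print(np.allclose(result, array1 @ dense2))
```

The final line compares the sketch against an ordinary dense matrix product, which is a quick way to check any reindexing of this kind before running the full test suite.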

Hi! Thanks for the suggestion. Did you, for example, run the test suite yet? If you can't do that right now, I'll take this with me and do it over the weekend.

Also, please feel free to ask for help, either on this issue or over on Gitter: https://gitter.im/pydata/sparse

I believe that this change doesn't actually work :(
I still think the idea is good though (avoiding the transpose, because it calls the sparse.COO() function for what should just be a matter of swapping indices in the sum).
I'll run the test suite and follow up here when I figure it out.

Following up:
I ended up only removing the transpose for the _dot_ndarray_coo_type function.
Using the tensordot benchmark for 100 runs:
New: 0.09302s +- 0.001313s
Old: 0.186462s +- 0.004102s
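
The thread does not say how these timings were collected. As one possible way to get a "mean +- standard deviation over 100 runs" figure, here is a sketch using only the standard library and NumPy. The helper name time_runs, the array sizes, and the dense np.tensordot call are placeholders, not the benchmark used above.

```python
import statistics
import timeit

import numpy as np


def time_runs(fn, runs=100):
    """Call fn once per run and return (mean, sample stdev) in seconds."""
    samples = timeit.repeat(fn, number=1, repeat=runs)
    return statistics.mean(samples), statistics.stdev(samples)


rng = np.random.default_rng(0)
a = rng.random((50, 60))
b = rng.random((60, 40))

# Placeholder workload; swap in the tensordot call being compared.
mean_s, std_s = time_runs(lambda: np.tensordot(a, b, axes=1))
print(f"{mean_s:.6f}s +- {std_s:.6f}s")
```

Running the same helper on the old and new builds with identical inputs keeps the two numbers comparable.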

When I run the test suite, I get only one more error than with a clean build, and it says
FAILED sparse/_common.py::BLACK
I am a little confused as to what that means, but I am assuming it isn't dire because the clean build has 2 errors of the same type in it.

Hey! That one is a code style issue. Please install black with pip and inside the sparse directory, do black .

Issue dealt with in #537

Not so fast -- The PR must be merged first. 😉