Consider MatRepr for `__repr__` and `_repr_html_`

Question

alugowski opened this issue a year ago · comments

I've noticed that there appears to be no native way to print/display a sparse matrix.
The docs always convert to dense for display, and the native methods only emit some metadata.
Calling `todense()` can be troublesome when the matrix is large, and the dense result does not visualize sparsity.

I went ahead and added pydata/sparse to matrepr.

You can see it in action with pydata/sparse matrices (1D, 2D, 3D) in this Jupyter notebook.

For the Python REPL, there is a simple monkey-patch that replaces `__repr__()` and prints matrices in an interactive Python shell: `import matrepr.patch.sparse`

Is this something of interest to the pydata/sparse community?

Example of a random 2D COO matrix: (image not included)

Hameer Abbasi · Answer 1 · Tue Aug 29 2023 14:28:20 GMT+0800 (China Standard Time)

Would this require an additional dependency? If not, it's very welcome. If it does require one, I'd make it optional.

I'm definitely on board with the idea of a better repr though.

Adam Lugowski · Answer 2 · Tue Aug 29 2023 14:56:29 GMT+0800 (China Standard Time)

Apart from matrepr itself, the string formatter depends on tabulate. The HTML and LaTeX formatters have no additional dependencies.

Adam Lugowski · Answer 3 · Tue Aug 29 2023 15:05:21 GMT+0800 (China Standard Time)

For reference, this should be enough:

def _repr_html_(self):
    # Import lazily so matrepr is only needed when a repr is requested
    from matrepr import to_html
    from matrepr.adapters.sparse_driver import PyDataSparseDriver
    return to_html(PyDataSparseDriver.adapt(self), notebook=True)

def __repr__(self):
    from matrepr import to_str
    from matrepr.adapters.sparse_driver import PyDataSparseDriver
    # Enable terminal width detection
    return to_str(PyDataSparseDriver.adapt(self), width_str=0, max_cols=9999)

Hameer Abbasi · Answer 4 · Tue Aug 29 2023 16:47:03 GMT+0800 (China Standard Time)

Would you be willing to make a PR that reverts back to the old behaviour in the absence of matrepr?

Adam Lugowski · Answer 5 · Wed Aug 30 2023 03:36:17 GMT+0800 (China Standard Time)

Yes, I'll submit one soon.

Adam Lugowski · Answer 6 · Wed Aug 30 2023 12:37:26 GMT+0800 (China Standard Time)

See PR: #605
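
The fallback discussed above (use matrepr when it is installed, otherwise keep the old metadata-only repr) could look roughly like the sketch below. This is not the code from the PR. `repr_with_fallback` and `metadata_str` are hypothetical helpers invented for illustration, the metadata format is made up, and the `to_str` arguments are simply copied from the snippet earlier in this thread.

```python
def repr_with_fallback(obj, rich, plain):
    """Return rich(obj), or plain(obj) if rich needs a package that is missing."""
    try:
        return rich(obj)
    except ImportError:
        # Optional dependency not installed: revert to the old behaviour.
        return plain(obj)


def matrepr_str(arr):
    # Deferred imports keep matrepr an optional dependency.
    # Arguments mirror the snippet earlier in the thread (assumption).
    from matrepr import to_str
    from matrepr.adapters.sparse_driver import PyDataSparseDriver

    return to_str(PyDataSparseDriver.adapt(arr), width_str=0, max_cols=9999)


def metadata_str(arr):
    # Stand-in for the existing metadata-only repr (hypothetical format).
    return f"<{type(arr).__name__}: shape={arr.shape}, nnz={arr.nnz}>"
```

A class would then define `__repr__` as `return repr_with_fallback(self, matrepr_str, metadata_str)`. Catching only `ImportError` means a genuine bug in the rich formatter still surfaces instead of being silently hidden by the fallback.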

Adam Lugowski · Answer 7 · Thu Sep 07 2023 14:42:49 GMT+0800 (China Standard Time)

Here is one question: should empty cells remain empty, or display the array's fill_value?

matrepr now supports a fill_value argument so doing either is easy. The question is which is better. A fill value shows the semantics of the array but hides the sparsity.

Hameer Abbasi · Answer 8 · Thu Sep 07 2023 15:01:42 GMT+0800 (China Standard Time)

> Here is one question: should empty cells remain empty, or display the array's fill_value?
>
> matrepr now supports a fill_value argument so doing either is easy. The question is which is better. A fill value shows the semantics of the array but hides the sparsity.

If the fill value is indicated elsewhere I think it's better for them to be empty.

Adam Lugowski · Answer 9 · Fri Sep 08 2023 05:01:58 GMT+0800 (China Standard Time)

> If the fill value is indicated elsewhere I think it's better for them to be empty.

Sounds good, that's the behavior in the PR. The fill value is in the summary line.
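
To make the trade-off between empty cells and a visible fill value concrete, here is a small self-contained sketch. It is not matrepr's implementation: `render_grid` and its `fill` parameter are hypothetical names used only to illustrate the two display choices for a 2D array stored as a coordinate-to-value mapping.

```python
def render_grid(shape, entries, fill=None):
    """Render a 2D sparse layout as text.

    entries maps (row, col) -> value. Cells without an entry are left
    blank when fill is None, otherwise they display fill.
    """
    rows, cols = shape
    blank = "" if fill is None else str(fill)
    cells = [
        [str(entries.get((r, c), blank)) for c in range(cols)]
        for r in range(rows)
    ]
    # Pad every cell to the widest one so the columns line up.
    width = max(1, max((len(cell) for row in cells for cell in row), default=0))
    return "\n".join(
        "|" + "|".join(cell.rjust(width) for cell in row) + "|" for row in cells
    )


entries = {(0, 0): 1, (1, 2): 5}
print(render_grid((2, 3), entries))
# |1| | |
# | | |5|
print(render_grid((2, 3), entries, fill=0))
# |1|0|0|
# |0|0|5|
```

With blank cells the stored entries stand out at a glance. With the fill value shown, the output matches the dense semantics but the sparsity pattern is harder to see, which is why stating the fill value once in a summary line is a reasonable compromise.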
