`RecurrenceMatrix` does not take advantage of symmetry
heliosdrm opened this issue · comments
The data of RecurrenceMatrix
is of type SparseMatrixCSC{Bool,Int}
. This means that, although the constructor passes a symmetric matrix with only the entries of the upper triangle, the underlying data in the resulting RecurrenceMatrix
makes a new "regular" sparse matrix duplicating them on the lower triangle.
As it is in version 1.3, this could be changed without modifications in the API if the data field of RecurrenceMatrix
is changed to Symmetric{Float64,SparseMatrixCSC{Bool,Int64}}
. However, with the FAN threshold method (#97) recurrence matrices can also be non-symmetric, so:
- Right now it is better to keep it as it is, to allow the introduction of the FAN threshold method without changing the API.
- For the next major version we could consider converting
RecurrenceMatrix
(and maybe the other types?) into parametric types, to take advantage of potential symmetry.
For the next major version we could consider converting
RecurrenceMatrix
(and maybe the other types?) into parametric types, to take advantage of potential symmetry.
Changing the type parameterization of RecurrenceMatrix
is in no way breaking, we could do it as soon as there is a PR that implements it. RecurrenceMatrix
can be parameterized on the type of its data
field, and this would allow further optimizations, e.g. dispatching on whether the type is Symmetric{Float64,SparseMatrixCSC{Bool,Int64}}
or just SparseMatrixCSC{Bool,Int}
Good, I'll write another PR in parallel with #97, taking care of making them mergeable without too many conflicts (I hope).
I have yet another suggestion @hkraemer and @heliosdrm . Pr #97 is massive. Its git difference is very large, and difficult to follow. Perhaps we can clean things up by :
- Removing the genuinely new FAN-related code from #97.
- Merging changes.
- Opening a new pr with the actual FAN changes.
- In the same PR, add the type information additions which allow us to know whether a RM is symmetric based on its type.
I know this sounds like tedious work, but there is actually only one reason I suggest it: it will make mistakes/errors/bad design decisions impossible to hide.
No need to merge non-FAN related changes from #97. It took me a while to catch it, but the 13 first commits of that branch were already squashed-merged into 1.3.0. Only its five last commits have relevant changes, and they are limited to matrices.jl
and a few new tests.
@hkraemer : do you want to create a new PR using the code of that branch since da507d6 (the commit that matches 1.3.0)? If that's a problem, I could do it this week.
Ah yes, true, thanks for clarifying that!
yepp, I can do that.
After a second thought... RecurrenceMatrix
does take advantage of symmetry: the underlying sparse matrix contains the data points in both triangles (upper and lower, when only one of them is strictly necessary), but the only disadvantage of this is the amount of memory needed to store them.
On the other hand, some calculations (e.g. of vertical structures) would be far more expensive if the underlying sparse matrix only contained the upper or lower triangle, and those which can take advantage of symmetry (diagonal structures) already do it - only the upper triangle is used, thanks to the fact that RecurrenceMatrix
is assumed to contain always symmetric data (and JointRecurrenceMatrix
too).
There would be additional difficulties if we use triangular sparse matrices wrapped into the Symmetric
type to save storage: functions for sparse arrays like rowvals
, nzrange
, etc. cannot be applied to such types, and accessing the underlying matrix to work around that is very error-prone (e.g. if for some reason the underlying matrix is not empty in the "ignored" triangle).
All in all, I think that in the end it is not that advisable to change the definition of the types to allow Symmetric
data fields. It is better to leave it as it is, with SparseMatrixCSC{Bool,Int}
data in all cases.
There are still questions regarding how to deal with symmetry if the FAN method is added. But I'll write about them in the corresponding PR (#102).