arrange strata to minimize flow overlaps
corybrunson opened this issue · comments
For cases in which the strata have no intrinsic order, an option should be available to arrange the strata at each axis in a way that minimizes the number of flow overlaps, perhaps even using different orders at different axes when the strata are repeated (maybe this should be allowed or not according to an additional parameter). The `majors` example in the vignette is a good candidate for such an option.
A heuristic algorithm should suffice, and the concept is general enough that it might already be in use somewhere. I won't have time to write one for a while, and it might be worth writing in C and calling via `Rcpp`.
Working on this in the optimization branch, making use of the iterpc package. Some considerations:
- The reordering function should be exported and output a data frame different only in the orders of the levels of the `value` variable (which will be coerced to factor if not already).
- This function should then be callable by a new parameter of `stat_alluvium()` and `stat_flow()` only if the new ordering can then be passed to `stat_stratum()` (without re-running the optimization procedure). In this case, it should be overridden by an explicit argument to `decreasing` (to avoid accidental consumption of time).
- The reordering should be independent of any argument to `reverse`, which should apply after any reordering.
- The reordering function needn't concern itself with overlaps within pairs of strata at adjacent axes, although `stat_alluvium()` might introduce these.
- A parameter (currently `free.strata`) will allow the user to control whether sets of strata appearing at different axes must (`FALSE`) or needn't (`TRUE`) be consistently reordered.
- The user should be able to control which axes (a) may not be reordered, (b) may be reversed but nothing else (preserving adjacency), and (c) may be arbitrarily reordered.
- All these constraints can be satisfied in both the `exhaustive` and the `heuristic` optimizing functions by `next`ing permutation lists in which they are not met.
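The last bullet can be sketched as follows: enumerate candidate orders for every axis, skip any candidate that breaks that axis's constraint, and keep the combination with the fewest overlaps. This is a Python illustration under stated assumptions, not the code in the optimization branch. The mode names `"fixed"`, `"reverse"`, and `"free"` are hypothetical labels for cases (a), (b), and (c) above, `exhaustive_search` and `allowed` are made-up names, and Python's `continue` stands in for R's `next`.

```python
from itertools import combinations, permutations, product


def count_overlaps(flows, left_order, right_order):
    """Count pairs of flows whose strata appear in opposite order on two adjacent axes."""
    left_pos = {s: i for i, s in enumerate(left_order)}
    right_pos = {s: i for i, s in enumerate(right_order)}
    return sum(
        1
        for (a, b), (c, d) in combinations(flows, 2)
        if (left_pos[a] - left_pos[c]) * (right_pos[b] - right_pos[d]) < 0
    )


def allowed(candidate, original, mode):
    """Check one axis's candidate order against its (hypothetical) constraint mode."""
    if mode == "fixed":      # (a) may not be reordered
        return candidate == original
    if mode == "reverse":    # (b) may be reversed but nothing else
        return candidate in (original, original[::-1])
    return True              # (c) "free": any order is acceptable


def exhaustive_search(axes, flows_between, modes):
    """Try every allowed combination of per-axis orders and return the best one."""
    candidates = []
    for original, mode in zip(axes, modes):
        original = tuple(original)
        kept = []
        for perm in permutations(original):
            if not allowed(perm, original, mode):
                continue  # plays the role of `next` in an R loop
            kept.append(perm)
        candidates.append(kept)

    best, best_cost = None, None
    for combo in product(*candidates):
        cost = sum(
            count_overlaps(flows, combo[i], combo[i + 1])
            for i, flows in enumerate(flows_between)
        )
        if best_cost is None or cost < best_cost:
            best, best_cost = combo, cost
    return [list(order) for order in best], best_cost


# Hypothetical toy data: three axes, with flows between each adjacent pair.
axes = [["A", "B", "C"], ["X", "Y", "Z"], ["P", "Q"]]
flows_between = [
    [("A", "X"), ("A", "Y"), ("B", "X"), ("C", "Z"), ("C", "X")],
    [("X", "P"), ("Y", "Q"), ("Z", "P")],
]
modes = ["fixed", "free", "reverse"]

best_orders, best_cost = exhaustive_search(axes, flows_between, modes)
print(best_orders, best_cost)
```

The search space grows with the product of the factorials of the axis sizes, so this is only practical for a handful of strata per axis; filtering early, as here, at least keeps constrained axes from inflating it.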
The problem with reordering factor levels (and introducing them if necessary), and with exporting the function that does so, is that it impacts the behavior of any aesthetics or other features that depend on the factor levels, whether their order or their number. This is turning out to be messier than i'm comfortable with.
An alternative approach is to internally construct a `stratum.ordering` matrix analogous to `lode.ordering` in `StatAlluvium$compute_panel`, which would depend on some guidance parameter like `free.strata` and which the user could also provide directly. The problem here is that, unless the other layers can receive the argument provided to `stat_stratum()`, the user would have to provide the argument to each layer individually, at the risk of producing a wrong diagram, and one that might at a glance appear fine.
Failing this (implicitly passing arguments between layers), i'll develop this feature without the `free.strata` parameter, i.e. only subject to strata values taking the same vertical order at each axis. The most recent attempt to allow variations in stratum order is at 36bc937.
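The restricted version, in which every axis uses the same vertical order of stratum values, reduces to a search over a single permutation. A minimal Python sketch follows, assuming every axis carries the same set of values; `best_shared_order` and the toy data are hypothetical and are not taken from the optimization branch or from commit 36bc937.

```python
from itertools import combinations, permutations


def count_overlaps(flows, left_order, right_order):
    """Count pairs of flows whose strata appear in opposite order on two adjacent axes."""
    left_pos = {s: i for i, s in enumerate(left_order)}
    right_pos = {s: i for i, s in enumerate(right_order)}
    return sum(
        1
        for (a, b), (c, d) in combinations(flows, 2)
        if (left_pos[a] - left_pos[c]) * (right_pos[b] - right_pos[d]) < 0
    )


def best_shared_order(values, flows_between):
    """Find one order of the stratum values, applied at every axis, with the fewest overlaps."""
    best, best_cost = None, None
    for perm in permutations(values):
        cost = sum(count_overlaps(flows, perm, perm) for flows in flows_between)
        if best_cost is None or cost < best_cost:
            best, best_cost = list(perm), cost
    return best, best_cost


# Hypothetical toy data: three axes sharing the values Art, Bio, Chem.
values = ["Art", "Bio", "Chem"]
flows_between = [
    [("Art", "Art"), ("Art", "Chem"), ("Chem", "Art"), ("Bio", "Bio")],
    [("Chem", "Chem"), ("Art", "Chem"), ("Bio", "Bio")],
]

original_cost = sum(count_overlaps(f, values, values) for f in flows_between)
shared_order, shared_cost = best_shared_order(values, flows_between)
print(original_cost, shared_order, shared_cost)
```

In this example one overlap is unavoidable, because the `Art` to `Chem` and `Chem` to `Art` flows cross under any shared order. Reversing the whole order never changes the count, so the minimum is never unique; the sketch simply keeps the first one it finds.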
From your comments, I'm getting the feeling that this wasn't resolved?
I'm having issues with the ordering of the stratum and posted an example here. I probably just am missing something simple but any help would be appreciated.
This issue (#6) is (still, intermittently) being experimented upon in the optimization branch. I think it's unrelated to the issue you describe at the link. Indeed, strata defined using a factor variable should be arranged in order of the factor levels at any axis. I'll make this a separate issue and get to it ASAP. Thanks for raising it!
@mbojan points to a good reference at this issue, which i'm closing in order to reduce clutter.