Dependencies updates
quentinblampey opened this issue · comments
Hello,
In my opinion, all dependencies are very common and/or lightweight, except datashader, which requires dask/xarray/pillow among others. Since it is used in only one plot function, should we move it into an extra dependency? If someone runs scatter_density without datashader installed, we could log an error explaining how to install the extra.
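A minimal sketch of the check described above, assuming the extra is called "performance" as in the install hint proposed in this issue. The helper name require_optional is hypothetical and is not part of pytometry.

```python
import importlib.util


def require_optional(package: str, caller: str, extra: str = "performance") -> None:
    """Raise an informative ImportError if an optional package is missing.

    Hypothetical helper: the name and the default extra are assumptions.
    """
    # find_spec returns None when a top-level package cannot be found,
    # and it does so without importing the package
    if importlib.util.find_spec(package) is None:
        raise ImportError(
            f"{caller} requires {package}, which is not installed. "
            f"Install it with: pip install 'pytometry[{extra}]'"
        )
```

A plotting function such as scatter_density could call require_optional("datashader", "scatter_density") as its first statement, so the failure happens early and with a clear message.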
Alternatively, instead of throwing an error, we could also plot a subset of cells if datashader is not installed and add a warning like "Cells are subset. To show all cells, install datashader with 'pip install pytometry[performance]'".
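The fallback could look roughly like the sketch below. The function name maybe_subsample, the max_cells threshold, and the fixed seed are all assumptions made for illustration. Only the warning text comes from the proposal above.

```python
import importlib.util
import warnings

import numpy as np


def maybe_subsample(x, y, max_cells=100_000, seed=0, package="datashader"):
    """Return (x, y) unchanged if `package` is installed, else a random subset.

    Hypothetical helper: name, threshold and seed are assumptions.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    have_package = importlib.util.find_spec(package) is not None
    # Nothing to do if the optional package is present or the data is small
    if have_package or x.shape[0] <= max_cells:
        return x, y
    warnings.warn(
        "Cells are subset. To show all cells, install datashader with "
        "'pip install pytometry[performance]'",
        UserWarning,
        stacklevel=2,
    )
    # Draw indices without replacement so no cell is plotted twice
    rng = np.random.default_rng(seed)
    idx = rng.choice(x.shape[0], size=max_cells, replace=False)
    return x[idx], y[idx]
```

Fixing the seed keeps the plotted subset reproducible between calls. Whether that is desirable is a design choice this issue does not settle.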
But maybe you plan to implement other plot functions with datashader? In that case, I agree that it would be really preferable to keep it in the main dependencies. What do you think @mbuttner, @grst?
Also, what about moving nbproject to the dev dependencies? Is there a reason to have this in the main dependencies?
Thank you for your suggestions! I agree with the suggestion to move the datashader package into an extra dependency for the plotting library and only display subsets of cells with a corresponding warning message.
I am personally very keen on showing all cells wherever possible, which poses quite a challenge in flow cytometry and CyTOF. However, let me stress the importance of having a lightweight package first and extending the visualization functionality second. We can still discuss how to integrate the plotting functions at a later stage.
If scatter_density is the only function that requires datashader, I think we can follow @ivirshup's suggestion to get rid of it altogether and replace it with np.histogram2d (see scverse/governance#64 (comment)).
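To illustrate the np.histogram2d idea, here is a minimal sketch of a density grid that could stand in for a datashader-based scatter plot. The function name density_grid, the default bin count, and the log scaling are assumptions. This is not the implementation in pytometry or the one proposed in the linked comment.

```python
import numpy as np


def density_grid(x, y, bins=200):
    """Bin points on a 2D grid; return log-scaled counts and the plot extent.

    Hypothetical helper: name, bin count and log scaling are assumptions.
    """
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins)
    # histogram2d puts x along axis 0; transpose so that rows correspond
    # to y, which is the layout image-plotting functions expect
    img = np.log1p(counts.T)
    extent = (xedges[0], xedges[-1], yedges[0], yedges[-1])
    return img, extent
```

With matplotlib, the result can be drawn with plt.imshow(img, origin="lower", extent=extent, aspect="auto"). Every cell contributes to the counts, so no subsampling is needed, and the cost of drawing depends on the number of bins rather than the number of cells.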
UMAPs/embeddings with millions of cells are also slow with matplotlib and can benefit from datashader (with categorical data, it's not as trivial as using histogram2d). But for embeddings, in my experience, there's not a lot to gain from showing all cells vs. subsampling. Plus, any performance upgrades for sc.pl.umap should probably be solved on the scanpy side.
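To make the categorical remark concrete: one possible workaround, sketched here as an assumption and not as something proposed in this thread, is to compute one histogram per category on shared bin edges and colour each bin by its most frequent category. The function name dominant_category_grid is hypothetical. The approach discards all information about how categories mix within a bin, which is part of why this case is less trivial than a plain density plot.

```python
import numpy as np


def dominant_category_grid(x, y, codes, n_categories, bins=200):
    """Return a (bins, bins) grid holding the most frequent category per bin.

    Hypothetical helper. `codes` are integer category codes in
    [0, n_categories). Empty bins are set to NaN.
    """
    x, y, codes = np.asarray(x), np.asarray(y), np.asarray(codes)
    # Shared edges so that the per-category histograms line up bin by bin
    xedges = np.linspace(x.min(), x.max(), bins + 1)
    yedges = np.linspace(y.min(), y.max(), bins + 1)
    counts = np.stack([
        np.histogram2d(x[codes == k], y[codes == k], bins=[xedges, yedges])[0]
        for k in range(n_categories)
    ])
    # argmax over the category axis picks the majority category per bin
    dominant = counts.argmax(axis=0).astype(float)
    dominant[counts.sum(axis=0) == 0] = np.nan
    # Transpose so that rows correspond to y, as for an image
    return dominant.T
```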
About nbproject: It is currently used for testing the package's notebooks, so I suggest keeping it.
Thanks @grst, I'll try the np.histogram2d solution and do a PR if it looks promising.
Concerning nbproject, if it's used only for testing, then we can move it to the "test" dependencies, right?
Concerning nbproject, if it's used only for testing, then we can move it to the "test" dependencies, right?
I think there should be an additional group docs with all the packages required to run the tutorial, including nbproject.
Closed as completed.