GregorySchwartz / too-many-cells

Cluster single cells and analyze cell clade relationships with colorful visualizations.

Home Page: https://gregoryschwartz.github.io/too-many-cells/


Prune parameters setting question

DiracZhu1998 opened this issue

Dear @GregorySchwartz ,

I'm using TooManyCells on my dataset and it works great, but I have trouble understanding the prune parameters listed by "docker run gregoryschwartz/too-many-cells:2.0.0.0 make-tree -h". Could you explain these prune parameters in more detail?

Also, could you give me some advice on improving the plots attached below so that the "small leaves" are more biologically meaningful? For a non-model organism it is hard to define true cell subtypes, and the standard Seurat workflows feel arbitrary.

plot1 based on "--draw-collection "PieChart" --smart-cutoff 4 --draw-no-scale-nodes --draw-mark "MarkModularity" --draw-max-node-size 10" parameters
plot2 based on "--draw-collection "PieChart" --draw-no-scale-nodes --draw-mark "MarkModularity" --draw-max-node-size 10 --min-size 100" parameters.

[Image: plot1]
[Image: plot2]

Thanks a lot!

Thanks for your interest! The first plot isn't pruning anything because you did not specify a prune parameter. --smart-cutoff only decides statistically what the cutoff value should be; it does not choose which kind of pruning to apply, which is what options such as --min-size and --min-distance-search do. You can read about it in the paper: https://www.nature.com/articles/s41592-020-0748-5. The second plot is just pruning based on the minimum allowed size for a node.

In our own experience, using --smart-cutoff with --min-distance-search provides fantastic results, as it statistically searches for heterogeneous populations and collapses homogeneous populations.
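
As a rough sketch of that combination, the snippet below takes the drawing flags from plot1 and adds --min-distance-search next to --smart-cutoff, then prints the resulting command for review instead of running it. The data directory, the matrix and labels paths, and the mount point are hypothetical placeholders. The value 1 given to --min-distance-search is also an assumption: as I understand the help text, the number is a placeholder that the smart cutoff replaces, so please check it against make-tree -h.

```shell
# Hypothetical host directory holding the matrix and labels; adjust to your data.
data_dir="$PWD/data"

# Drawing flags copied from plot1 in the question.
draw_flags="--draw-collection PieChart --draw-no-scale-nodes --draw-mark MarkModularity --draw-max-node-size 10"

# --min-distance-search selects the kind of pruning;
# --smart-cutoff lets the tool derive the actual cutoff value.
# The "1" is assumed to be a placeholder (verify with make-tree -h).
prune_flags="--smart-cutoff 4 --min-distance-search 1"

# Paths inside the container (/data/...) are assumptions for illustration.
cmd="docker run --rm -v $data_dir:/data gregoryschwartz/too-many-cells:2.0.0.0 make-tree --matrix-path /data/matrix --labels-file /data/labels.csv --output /data/out $draw_flags $prune_flags"

# Dry run: print the command so it can be inspected before use.
echo "$cmd"
```

Once the printed command looks right for your setup, paste it into the terminal to run it. Keeping the prune flags in their own variable makes it easy to swap in --min-size 100 (as in plot2) and compare the two trees.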