biocore / empress

A fast and scalable phylogenetic tree viewer for microbiome data analysis

Support drawing variable-length stacked sample metadata barplots

fedarko opened this issue · comments

That is, rather than giving all tips' bars equal length and coloring them based on proportions, bars' lengths would be scaled by the total number of samples that contain a tip. So even if two tips are each only present in a single group of samples, there will still be a clear visual difference between their bars if one tip is present in 100 samples and the other is only present in 1 sample.

This would probably look similar to the bar plots in Fig. 2A of this paper, with the difference that those graphs use relative abundance, whereas this would just be based on sample presence counts. (We could also scale bars by actual abundance data, but that would require a lot of restructuring.)
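To make "sample presence count" concrete, here is a minimal sketch of the counting step. The `samplesToTips` object and the `countSamplesPerTip` function are hypothetical names made up for illustration; they are not Empress's actual data structures or API.

```javascript
// Hypothetical input: each sample ID maps to the tips observed in that sample.
const samplesToTips = {
    s1: ["tipA", "tipB"],
    s2: ["tipA"],
    s3: ["tipA", "tipC"],
};

// Count, for every tip, how many samples contain it.
// Only presence matters here; abundance is ignored.
function countSamplesPerTip(samplesToTips) {
    const counts = new Map();
    for (const tips of Object.values(samplesToTips)) {
        // A Set guards against a tip being listed twice for one sample.
        for (const tip of new Set(tips)) {
            counts.set(tip, (counts.get(tip) || 0) + 1);
        }
    }
    return counts;
}

const tipCounts = countSamplesPerTip(samplesToTips);
console.log(tipCounts);
```

With this toy input, `tipA` is counted in three samples while `tipB` and `tipC` are each counted in one, so `tipA` would get the longest bar.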

Brought up by @kwcantrell in #313 (comment).

Update: prototype of this is now ready in this branch. TODOs before submitting a PR:

  • Use different logic for scaling lengths (properly warn the user if min length > max length; gracefully handle the case where all features are present in the same number of samples, and tell the user how many samples that is; etc.)
  • Add tests for getFrequencyMap()'s new behavior
  • Fix the weird problem where the "Number of samples containing a tip" title goes outside of the borders in the exported SVG legend: this is #528. Maybe make a separate PR for that?
  • Abstract reused code for creating min/max length values in the barplot layers (currently duplicated between the fm and sm init functions)
  • Update README (?)
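The length-scaling TODO above could be sketched roughly as follows. This is not the code in the prototype branch: `scaleBarLengths` and its return shape are hypothetical, the sketch assumes a simple linear scaling between the minimum and maximum lengths, and it assumes the intended validation is that the minimum must not exceed the maximum. Giving every bar the maximum length when all counts are equal is also just one possible choice.

```javascript
// Hypothetical helper: map per-tip sample counts to bar lengths in
// [minLength, maxLength] using linear scaling.
function scaleBarLengths(counts, minLength, maxLength) {
    if (minLength > maxLength) {
        throw new Error("Minimum length must not exceed maximum length.");
    }
    const values = Array.from(counts.values());
    const lo = Math.min(...values);
    const hi = Math.max(...values);
    const allEqual = hi === lo;
    const lengths = new Map();
    for (const [tip, n] of counts) {
        // If every tip has the same count there is nothing to scale by,
        // so (as an assumption) every bar gets the maximum length.
        const frac = allEqual ? 1 : (n - lo) / (hi - lo);
        lengths.set(tip, minLength + frac * (maxLength - minLength));
    }
    // uniformCount lets the caller tell the user how many samples
    // every tip is present in; it is null when counts differ.
    return { lengths, uniformCount: allEqual ? hi : null };
}

const exampleCounts = new Map([["tipA", 101], ["tipB", 51], ["tipC", 1]]);
const scaled = scaleBarLengths(exampleCounts, 10, 110);
console.log(scaled.lengths);

const uniform = scaleBarLengths(new Map([["tipA", 7], ["tipB", 7]]), 10, 110);
console.log(uniform.uniformCount);
```

Returning `uniformCount` rather than alerting inside the helper keeps the scaling logic free of UI concerns, so the same function could be shared by both barplot layer types.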

Screenshot of this functionality:

image