Enable stacked barplots for feature metadata #506

Open
FranckLejzerowicz opened this issue Mar 25, 2021 · 3 comments

Comments

@FranckLejzerowicz

Hi,

I believe it is currently not possible to make stacked barplots for groups of feature metadata variables. EMPress only considers one feature metadata variable at a time, right? (e.g. one set of taxonomic levels, one set of differential values). However, some feature metadata variables might well be more insightful if presented stacked.

For example, if feature metadata is available on the amounts of, say, "biomolecule A" and "biomolecule B" produced by each microbe in a tree, one may want to plot the amounts of both biomolecules on the same stacked barplot. I believe the only way to achieve this would be to create a dummy sample metadata file (where features would remain the rows, while ["biomolecule A", "biomolecule B"] would be specially tailored columns). However, such a solution could be too hacky and, to my understanding, only one sample metadata file can be passed.

Could a solution for such barplots be to let the user select more than one category (using checkboxes instead of the dropdown)? In the background, EMPress would then build as many "sample metadata" tables as there are selections (i.e. barplots), which would allow plotting multiple sample metadata tables.
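
To make the proposal concrete, here is a minimal Python sketch of turning the selected feature metadata columns into per-tip stacks. It is only an illustration under assumptions: the function name `build_stacks`, the dict-based table layout, and the example values are all hypothetical and do not reflect EMPress's actual code or data structures.

```python
def build_stacks(feature_metadata, selected_columns):
    """Return {tip: [(column, value), ...]} for the selected numeric columns.

    feature_metadata is assumed to be {tip_name: {column_name: value}}.
    Missing or empty values are treated as 0 (an assumption of this sketch).
    """
    stacks = {}
    for tip, row in feature_metadata.items():
        segments = []
        for column in selected_columns:
            value = float(row.get(column) or 0)
            if value < 0:
                # Negative amounts cannot be drawn as a stacked segment.
                raise ValueError(f"negative value for {column!r} at tip {tip!r}")
            segments.append((column, value))
        stacks[tip] = segments
    return stacks


# Hypothetical feature metadata: one row per tip, one column per biomolecule.
feature_metadata = {
    "tip1": {"biomolecule A": 2.0, "biomolecule B": 6.0},
    "tip2": {"biomolecule A": 1.0, "biomolecule B": 1.0},
    "tip3": {"biomolecule A": None, "biomolecule B": 3.0},
}

stacks = build_stacks(feature_metadata, ["biomolecule A", "biomolecule B"])
for tip, segments in stacks.items():
    print(tip, segments)
```

Each selection of columns yields one such table, so several barplot layers would simply be several calls with different column lists.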

Also, note that the nice ability to scale bar heights by the feature metadata value (when continuous), which I believe is also currently limited to a single feature metadata variable, would benefit from being available for multiple variables, if these could be "interpreted" in EMPress as samples in a sample metadata table.

Sorry, I have not looked at the code to elaborate on this proposal; I will do so if time allows!

Thanks!

@fedarko
Collaborator

fedarko commented Mar 30, 2021

Thank you for the suggestion! I think I understand your idea here -- this would involve allowing users to select multiple "quantitative" feature metadata categories, and then just directly plotting those as proportions in a stacked barplot?

Yeah, EMPress currently doesn't support that -- however, I think rearranging the code to do this should be doable without a ton of effort, since we already have code to draw these kinds of barplots.

To clarify a few points:

  1. Do you think it would be best if these stacked barplots all have the same "length" across each tip (like how sample metadata barplots currently work), or would it be better for you if the stacked barplots' lengths vary based on the total sum of the categories? (Or would both possibilities be meaningful for this application?) I think we could support both strategies. (Some examples below.)

    [Screenshots: constant-length stacked barplots (Diet ring); varying-length stacked barplots]

    Screenshot references described in Barplots in the circular / rectangular layout #201.

  2. Do you anticipate there being lots of feature metadata categories to include at once in these barplots? (e.g. could there be, say, hundreds of biomolecule categories to include?) If so, we may want to explore ways of automatically creating these barplots based on a list of categories, to avoid making users manually click 100 checkboxes or something.

  3. Do you have an (ideally small) example dataset describing this? This would help with testing.
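
The two strategies in point 1 differ only in how each tip's values are scaled. The Python sketch below illustrates that difference; the function name `segment_lengths`, the `mode` values, and the example numbers are assumptions made for illustration, not EMPress's implementation.

```python
def segment_lengths(values, mode="constant", max_total=None):
    """Scale one tip's values into stacked segment lengths in [0, 1].

    mode="constant": segments sum to 1 for every tip (proportions).
    mode="varying":  segments sum to total / max_total, so tips with
                     larger totals get longer bars.
    """
    total = sum(values)
    if total == 0:
        return [0.0 for _ in values]  # nothing to draw for this tip
    if mode == "constant":
        scale = 1.0 / total
    elif mode == "varying":
        if not max_total:
            raise ValueError("max_total is required for mode='varying'")
        scale = 1.0 / max_total
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [v * scale for v in values]


# Hypothetical values for two stacked categories at three tips.
tips = {"tip1": [2.0, 6.0], "tip2": [1.0, 1.0], "tip3": [0.0, 3.0]}
max_total = max(sum(v) for v in tips.values())

constant = {t: segment_lengths(v, "constant") for t, v in tips.items()}
varying = {t: segment_lengths(v, "varying", max_total) for t, v in tips.items()}

print(constant["tip2"])  # [0.5, 0.5]
print(varying["tip2"])   # [0.125, 0.125]
```

Varying-length bars are only comparable across tips when the stacked categories share a scale, since the bar length is their sum.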

@FranckLejzerowicz
Author

Yes! A way to stack feature metadata variables into a made-up-on-the-fly sample metadata table could certainly do the trick for the purpose of one barplot.

Now for your questions, I think that:

  1. Both possibilities would be meaningful (if the feature metadata variables to stack are expressed on the same scale, then the total bar lengths would make sense).
  2. I have thought about this, and I guess checkboxes would do, as the number of variables to check for a barplot should be low. Indeed, a barplot with too many stacks would be pretty unclear, not to mention the color recycling issue. Now - IMO - this is already a problem with the current barplots: a lot of clicking (allowing a config file as in iTOL would actually be great, but that is another, significant "feature request").
  3. I can send you the fibers food tree: ca. 180 tips, and its feature metadata are just four columns. Where shall I send this?
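
The color recycling issue in point 2 can be illustrated with a short Python sketch: once there are more stacked categories than palette colors, later categories reuse earlier colors. The 4-color palette and the function names here are hypothetical and are not taken from EMPress.

```python
from itertools import cycle

# Hypothetical 4-color palette; EMPress's real palettes are not shown in this thread.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def assign_colors(categories, palette=PALETTE):
    """Map each category to a color, cycling through the palette when it runs out."""
    return dict(zip(categories, cycle(palette)))


def categories_with_reused_colors(categories, palette=PALETTE):
    """Return the categories that share a color with an earlier category."""
    return list(categories[len(palette):])


categories = ["A", "B", "C", "D", "E", "F"]
colors = assign_colors(categories)

print(colors["E"] == colors["A"])                 # True: "E" reuses the color of "A"
print(categories_with_reused_colors(categories))  # ['E', 'F']
```

A check like this could be used to warn the user when a stacked barplot has more categories than distinguishable colors.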

Thanks

@fedarko
Collaborator

fedarko commented Apr 1, 2021

Thanks for the explanation! This helps a lot.

> Now - IMO - this is already a problem with the current barplots: a lot of clicking (allowing a config file as in iTOL would actually be great, but that is another, significant "feature request").

I think this will be possible when #131 is addressed in the future -- once using config files to save/load app state is possible, it should be pretty straightforward for us to automatically generate barplot configurations (in cases where people have tons of fancy barplot configurations they'd like to set up).

> I can send you the fibers food tree: ca. 180 tips, and its feature metadata are just four columns. Where shall I send this?

I guess my UCSD email works fine. I probably won't have the free time to work on this for some time, but it would be really great to support this.
