How to better display higher taxonomic unknown levels? #422

antgonza · 2020-10-14T14:49:20Z

While discussing large trees and coloring by taxonomy, we realized that by default Q2 assigns "Unassigned" to sequences that didn't get an assignment, and that Empress by default fills missing levels with "Unspecified". This could become problematic when looking at higher levels, as it merges the assignments that were "Unassigned" in level 1 with those without a classification at that specific level.

The questions are:

1. What should we do about it? There were a couple of suggestions: (a) repeat the last available level in the missing ones, basically have "Unassigned" in all the levels, or "k__Bacteria" if that is the highest level available; and/or (b) do (a) but also add the level we are looking at to the label, e.g. "Unassigned [L2]" or "k__Bacteria [L2]" for the example above.
2. Should we do this by default or control it via a parameter?
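To make suggestions (a) and (b) concrete, here is a minimal Python sketch. It is only an illustration: `fill_missing_levels` and its `tag_level` parameter are hypothetical names, not part of Empress or QIIME 2, and tagging only the filled-in levels with `[L<n>]` is one possible reading of suggestion (b).

```python
from typing import List


def fill_missing_levels(levels: List[str], n_levels: int,
                        tag_level: bool = False) -> List[str]:
    """Pad a taxonomy lineage to n_levels entries (hypothetical helper).

    Missing or empty levels repeat the last available level (suggestion a).
    With tag_level=True, each filled-in entry also carries the 1-based
    level it was filled in at, e.g. "k__Bacteria [L2]" (suggestion b).
    """
    filled = []
    # Assumption: a lineage with no information at all counts as "Unassigned".
    last_known = "Unassigned"
    for i in range(n_levels):
        value = levels[i].strip() if i < len(levels) and levels[i] else ""
        if value:
            # A real assignment: keep it and remember it for later levels.
            last_known = value
            filled.append(value)
        else:
            label = f"{last_known} [L{i + 1}]" if tag_level else last_known
            filled.append(label)
    return filled


if __name__ == "__main__":
    print(fill_missing_levels(["Unassigned"], 3))
    print(fill_missing_levels(["k__Bacteria"], 3, tag_level=True))
```

With either variant, a feature that was "Unassigned" at level 1 and a feature that is only classified down to "k__Bacteria" no longer share the same placeholder at deeper levels, so they would not be merged when coloring.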

Closes biocore#473 and closes biocore#422. Avoids having to store this stuff in the QZV, which will save a lot of space. This does still involve storing the "expanded" taxonomy for a given level all at once in memory, though -- I do not think this will be horrendous, since it is not that much extra data all things considered.
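The idea of computing the "expanded" taxonomy for one level on demand, rather than storing it, could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual Empress code (which does this in the JS interface): the function name `values_with_ancestors`, the dictionary layout, and the `"; "` separator are all hypothetical.

```python
from typing import Dict, List


def values_with_ancestors(taxonomy: Dict[str, List[str]], level_index: int,
                          sep: str = "; ") -> Dict[str, str]:
    """Build, for one selected level, a value that includes all ancestors.

    taxonomy maps a feature ID to its per-level values, ordered from the
    highest rank down (assumed layout). Only the selected level is expanded,
    so the extra data lives in memory just for that level.
    """
    return {
        feature_id: sep.join(levels[: level_index + 1])
        for feature_id, levels in taxonomy.items()
    }


if __name__ == "__main__":
    tax = {
        "f1": ["Unassigned", "Unspecified"],
        "f2": ["k__Bacteria", "Unspecified"],
    }
    # Both features are "Unspecified" at the second level, but their
    # expanded values differ because the ancestor is part of the value.
    print(values_with_ancestors(tax, 1))
```

Because the ancestor is part of the value, two features that share the same placeholder at a level stay in separate groups whenever their higher-level assignments differ.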

… ancestor info in the JS interface) (#487)

* DOC: various match_inputs() comments/doc fixes
* ENH: Pass ordered tax col names to Empress JS #473
  See docs enclosed for discussion vs. bool flag approach
* ENH: Implement ancestor fm retrieval in JS
  Closes #473 and closes #422. Avoids having to store this stuff in the QZV, which will save a lot of space. This does still involve storing the "expanded" taxonomy for a given level all at once in memory, though -- I do not think this will be horrendous, since it is not that much extra data all things considered.
* STY: prettify
* MNT: abstract fm retrieval func generation
  still gotta test (manually then automatically) tho...
* TST: unbreak js tests
* TST: unbreak some of the python tests
  not done yet
* TST: unbreak all python tests
* TST: abstract python taxcol checking
* STY: fix flake8 long lines
* STY: more flake8 fixes
* TST: test #473 special fm stuff in JS
* MNT: pass length fm retrieval through new funcs
  for sake of future consistency. really shouldn't change anything, behavior- or performance-wise
* BUG: Change max width to be in vw units, not vh
  makes this slightly larger, i think
* PERF/DOC: Add note abt repeated work - fm barplots

antgonza added the question label Oct 14, 2020

fedarko mentioned this issue Dec 22, 2020

Project phylogeny up tree if provided #471

Open

fedarko mentioned this issue Jan 19, 2021

Respect ancestors in feature metadata coloring/propagation #473

Closed

fedarko mentioned this issue Feb 6, 2021

[DON'T MERGE THIS PLS] Include ancestor information in processed taxonomy feature metadata #482

Closed

fedarko mentioned this issue Feb 17, 2021

Use ancestor information in taxonomy feature metadata (only computing ancestor info in the JS interface) #487

Merged

kwcantrell closed this as completed in #487 Apr 8, 2021
