[Lens] [Discuss] How should users configure pipeline functions such as cumulative sum, derivative? #56696

Closed
wylieconlon opened this issue Feb 3, 2020 · 11 comments
Labels
discuss Feature:Lens Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@wylieconlon
Contributor

Elasticsearch offers a set of pipeline aggregations, which take aggregated data and perform another set of calculations on the response. Based on the requirements for our use in Lens, we might extend this concept with our own "pipeline-like" functions; good examples might be percentages or filter ratios.
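
For reference, a pipeline aggregation is declared next to the aggregation it reads from and points at it through `buckets_path`. Here is a minimal sketch of a search request body with a cumulative sum, written as a TypeScript object. The field names (`timestamp`, `bytes`) and the aggregation names (`per_day`, `total_bytes`, `running_total`) are assumptions for illustration, not taken from any real index.

```typescript
// Sketch of an Elasticsearch search body with a cumulative_sum pipeline
// aggregation. Field and aggregation names are hypothetical.
const requestBody = {
  size: 0, // only aggregation results are needed, no hits
  aggs: {
    per_day: {
      date_histogram: { field: "timestamp", calendar_interval: "day" },
      aggs: {
        // Regular metric aggregation, computed per bucket.
        total_bytes: { sum: { field: "bytes" } },
        // Pipeline aggregation: runs on the output of total_bytes.
        running_total: { cumulative_sum: { buckets_path: "total_bytes" } },
      },
    },
  },
};

console.log(JSON.stringify(requestBody, null, 2));
```

Both editing patterns listed below are user interfaces for producing this kind of reference from one aggregation to another.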

Kibana users see two interfaces that wrap this concept already. Should Lens use one of these patterns, or a third pattern?

The patterns are:

  1. To use a pipeline aggregation in a bar chart in Visualize, you configure a nested aggregation editor.

Screenshot 2020-02-03 17 21 24

  2. To use a pipeline aggregation in TSVB, you are presented with a list of references to metrics.

Screenshot 2020-02-03 17 22 09

Let's discuss the pros and cons of these options, and if neither of them is applicable, come up with a new pattern.

@wylieconlon added the discuss, Team:Visualizations, and Feature:Lens labels Feb 3, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-app (Team:KibanaApp)

@wylieconlon
Contributor Author

I think it's important to say that these are functions that are useful to run on pre-aggregated data, which is a different editing model than most other tools we're familiar with. There are three categories that we would likely support, which roughly match Elasticsearch functions:

  • Filter/Select/Project: Takes any length input, returns a smaller length based on the criteria. For example, filter input > 1 on [3, 1, 2] is [3, 2] or [3, null, 2] depending on how we handle gaps.
  • Reduce/Aggregate: Takes any length input and returns one output. For example, average of [3, 1, 2] is 2.
  • Window: Every value of the input is transformed, so the input and output are the same length. For example, cumulative sum of [3, 1, 2] is [3, 4, 6].
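
The three categories can be sketched in a few lines of TypeScript. This is only an illustration of the input and output shapes. The function names and the `keepGaps` flag are hypothetical, and `null` is assumed to be the marker for a gap.

```typescript
// A series where null marks a gap (an assumption for this sketch).
type Series = Array<number | null>;

// Filter: keeps values that pass the predicate. With keepGaps, rejected
// values become null so the output stays aligned with the input.
function filterSeries(
  input: number[],
  predicate: (value: number) => boolean,
  keepGaps: boolean
): Series {
  if (keepGaps) {
    return input.map((value) => (predicate(value) ? value : null));
  }
  return input.filter(predicate);
}

// Reduce: collapses the whole input into a single value.
function average(input: number[]): number | null {
  if (input.length === 0) return null;
  return input.reduce((sum, value) => sum + value, 0) / input.length;
}

// Window: produces exactly one output per input value.
function cumulativeSum(input: number[]): number[] {
  let running = 0;
  return input.map((value) => (running += value));
}

console.log(filterSeries([3, 1, 2], (v) => v > 1, false)); // [3, 2]
console.log(filterSeries([3, 1, 2], (v) => v > 1, true)); // [3, null, 2]
console.log(average([3, 1, 2])); // 2
console.log(cumulativeSum([3, 1, 2])); // [3, 4, 6]
```

Only the filter changes the length of the series, which is why gap handling matters for that category and not for the other two.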

What kind of visualization do we want users to make using these broad categories?

  • Filters: Because filters create gaps, the visualization needs to support discontinuous data. Good visualizations:
    • Scatter chart
    • Table
  • Reduce: Depends on how many levels are summarized.
    • Big number
    • If there is another level, we can display a bar chart with summary information horizontally
  • Window: All visualizations can support this, because the length is the same

@wylieconlon
Contributor Author

@AlonaNadler Are there specific visualizations you expect users to build using pipeline functions?

@cchaos
Contributor

cchaos commented Feb 10, 2020

From a UI perspective, @AlonaNadler and I have decided that these aggregations (under the Quick Functions umbrella) should only entail one level of nesting. Therefore, the only difference between them and a normal aggregation (sum, avg, etc.) is the addition of a Function field, which contains sum, avg, etc.

Cumulative sum

Anything more complicated than this would fall under the "Builder" tab which is for another discussion.

@wylieconlon
Contributor Author

@cchaos You have mocked up an example for a "Window/Map" type function, but I don't think you've addressed the other two types: Filter and Reduce functions.

@cchaos
Contributor

cchaos commented Feb 10, 2020

What aggregations (that exist in Kibana) are those related to? I need more examples of how you create them, not the logic behind them. Examples from TSVB or Visualize help the most.

@wylieconlon
Contributor Author

Sure, examples follow.

Filters:

Reduce:

  • Bucket script
    • Exposed directly in TSVB
    • Filter Ratio in TSVB uses this internally
    • Math in TSVB uses this
  • Avg Bucket
    • Called "overall average" in TSVB
  • Bucket stats
    • Exposed as "overall std. deviation" in TSVB
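
To make the Filter Ratio case concrete, here is a rough sketch of the per-bucket calculation such a function performs: the number of documents matching a filter divided by the total number of documents in the bucket. This is not TSVB's implementation. The `Bucket` shape and the `filterRatio` name are hypothetical, and returning `null` for empty buckets is an assumption.

```typescript
// Hypothetical bucket shape: document counts per bucket.
interface Bucket {
  key: string;
  matching: number; // documents that pass the filter
  total: number; // all documents in the bucket
}

// Computes matching / total for every bucket. Empty buckets yield null
// instead of dividing by zero (an assumption about gap handling).
function filterRatio(
  buckets: Bucket[]
): Array<{ key: string; ratio: number | null }> {
  return buckets.map(({ key, matching, total }) => ({
    key,
    ratio: total === 0 ? null : matching / total,
  }));
}

console.log(
  filterRatio([
    { key: "2020-02-01", matching: 1, total: 4 },
    { key: "2020-02-02", matching: 0, total: 0 },
  ])
);
```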

@AlonaNadler

If possible @wylieconlon, to start let's focus on 4 types of pipeline aggregation first:

  • Cumulative sum

  • Derivatives - default to interval derivatives

  • Moving average - default to simple moving average

  • Percent difference (% X over X) - default to interval derivatives

  • Bucket selector (correct me if I'm wrong) is like "having" in SQL. It is an extremely powerful capability. I think it would be limiting to offer it as a separate aggregation. Theoretically, it can be applied to any calculation done in Lens, which is why it needs a different approach.

*All of the so-called "overall" aggregations need a different approach as well, since users want to calculate an overall value on top of a chart. I'm not sure just listing them as functions is the right way to go, even though they are pipeline aggregations. The ideal solution to me is to add an overall metric alongside the chart when it is an XY chart, or, in a donut, to show for example the overall average in the center.
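
The window functions prioritized above can be sketched as plain TypeScript functions over one value per interval (cumulative sum follows the same one-output-per-input pattern). This is a sketch under stated assumptions, not how Lens or Elasticsearch implements them. The function names are hypothetical, the moving average uses a trailing window that includes the current value, and positions without enough history return `null`.

```typescript
type Point = number | null;

// Interval derivative: difference to the previous value.
// The first value has no predecessor, so it yields null.
function derivative(values: number[]): Point[] {
  let previous: number | null = null;
  return values.map((value) => {
    const result = previous === null ? null : value - previous;
    previous = value;
    return result;
  });
}

// Simple moving average over a trailing window that includes the
// current value (an assumption; other window definitions exist).
function movingAverage(values: number[], windowSize: number): Point[] {
  if (windowSize < 1) throw new Error("windowSize must be at least 1");
  return values.map((_, i) => {
    if (i + 1 < windowSize) return null; // not enough history yet
    const window = values.slice(i + 1 - windowSize, i + 1);
    return window.reduce((sum, v) => sum + v, 0) / window.length;
  });
}

// Percent difference: the interval derivative relative to the previous
// value, in percent. Undefined (null) when the previous value is 0.
function percentDifference(values: number[]): Point[] {
  let previous: number | null = null;
  return values.map((value) => {
    const result =
      previous === null || previous === 0
        ? null
        : ((value - previous) / previous) * 100;
    previous = value;
    return result;
  });
}

console.log(derivative([3, 1, 2])); // [null, -2, 1]
console.log(movingAverage([3, 1, 2], 2)); // [null, 2, 1.5]
console.log(percentDifference([4, 5, 10])); // [null, 25, 100]
```

Each function returns an array of the same length as its input, so any chart that can draw the original series can draw the result, as long as it tolerates the leading `null` values.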

@wylieconlon
Contributor Author

@AlonaNadler Thanks for the priorities! I agree that these window functions are the easiest ones to start with. Even if we start with the window functions, we can't begin implementing this without considering the other two types, because those other two types are already supported by TSVB and Visualize.

I'm also hoping that looking closely at what TSVB supports will give us ideas on how to improve Lens. For example, TSVB is trying to make complicated functions user-friendly by exposing "Filter Ratio", "Math", "Bucket Std. Deviation", and others. Are the functions exposed in TSVB done well? How would that work in Lens?

@cchaos
Contributor

cchaos commented Feb 12, 2020

Here are some mocks for the aggregations that @AlonaNadler mentioned:

Cumulative sum

Cumulative sum

Derivative (Percent difference would behave the same)

Derivative

Moving average

Moving average

@wylieconlon
Contributor Author

I never got an answer to the question I was originally asking, and since there hasn't been any more follow-up on this, I'm going to close it without the answer. I will try another set of techniques to get the feedback that is needed.


4 participants