Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New type of filter to get all the values for a field within the current context #17571

Closed
wants to merge 2 commits into from
Closed

New type of filter to get all the values for a field within the current context #17571

wants to merge 2 commits into from

Conversation

alepuccetti
Copy link

Hi everyone,

My colleagues and I are working on a new filter type. We called it values because it gets all the values of a field given the current filter/time context.
I want to get the idea out so if someone of the Kibana dev team like this idea and has some time to look into I will appreciate the feedback we could work together to make it better code and design wise.
@nreese @spalger @Bargs @lukasolson: We talked about this, I would appreciate your take on this.

I am opening this PR against 6.2 instead of master for simplicity.

Relevant issue: #16702

Why we did it?

After we drilled down to in the dataset we often want to see what other data the entities matching the drill down filters have produced overall or with other filters applied. At the moment this could be achieved by creating a filter is one of with all the values we want to select but what if the list of values is 10k? You cannot manually add 10k element it is not usable. Our first workaround was to export all the values and create a DSL filtered query to select this values. But we wanted to be able to this within Kibana UI. Hence, this PR.

Description

  • We added a filter type called values that have a button to fetch all the values of the selected field considering the current filters/time context.
  • The fetch button actually fires an aggregation query (like the suggestion API) with the current context. so it returns the values of the field only on matching the current filters/time context.

Goal

Improve the data exploration of Kibana giving a quick and user-friendly way to do exploration like "if this than that"

@elasticmachine
Copy link
Contributor

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@spalger
Copy link
Contributor

spalger commented Apr 9, 2018

I know most of us are pretty busy right now (@Bargs and @lukasolson especially) but I'll take a look at this in the next couple days, thanks for submitting it!

@spalger
Copy link
Contributor

spalger commented Apr 9, 2018

Just to be clear, we'll need to get this ported over to master before merging, but should be able to review most of it in its current state.

@Bargs Bargs removed their request for review April 9, 2018 21:31
@alepuccetti
Copy link
Author

@spalger Thank you.

Just to be clear, we'll need to get this ported over to master before merging, but should be able to review most of it in its current state.

I totally agree I opened this PR to get a feedback about the design. I am planning to rebase on top of master as soon as 6.3 gets out. As on now, I am building my custom Kibana form source with my changes.

This filter type gets all the values for one field
matching the current state of time picker and filters on the
filter bar, then it makes a filter to select all of them.
@alepuccetti
Copy link
Author

I did some tidying up of the commits to make your life easier in the review

@alepuccetti
Copy link
Author

Actually, I also have a branch rebased on master but I did not have the time to test it.
https://github.com/huq-industries/kibana/tree/alepuccetti/select-all-values-master

@alepuccetti
Copy link
Author

Also now it is hardcoded to a 10k limit for the value, I would like to make this configurable maybe with an advanced setting or with an argument in the filter pop up.

}

function getBody(field, { query }) {
const aggregationSize = 10000;
Copy link
Contributor

@spalger spalger Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also now it is hardcoded to a 10k limit for the value, I would like to make this configurable maybe with an advanced setting or with an argument in the filter pop up.

Yeah, I think it would be idea to support this as a parameter and default to something smaller, like 100.

Copy link
Contributor

@spalger spalger Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think the the name Top Terms or Top Values would be more appropriate for this filter, since it's always going to be limited by the size (even if we set it to something huge like 10k).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it would be idea to support this as a parameter and default to something smaller, like 100.

What about having the default value in an advanced setting? It is easier to have a custom default value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me, the default can be read from the route handler with:

const defaultSize = await request.getUiSettingsService().get('...')

Copy link
Contributor

@spalger spalger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments

server.route({
path: '/api/kibana/values/{index}',
method: ['POST'],
handler: async function (req, reply) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please define validation rules for the supported params and payload parameters. I think it should look something like:

server.route({
  path: '/api/kibana/values/{index}',
  method: ['POST'],
  config: {
    validate: {
      params: Joi.object().keys({
        index: Joi.string().required()
      }).default(),
      payload: Joi.object().keys({
        field: Joi.string().required(),
        query: Joi.object().required(),
      }).required()
    },
    async handler(req, reply) {
      // ...
    }
  }
})

PS, the query parameter supports a lot of stuff. @elastic/kibana-operations @elastic/kibana-platform does this concern any of you?

method: ['POST'],
handler: async function (req, reply) {
const { index } = req.params;
const { field } = req.payload;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd merge this and the next line


const { callWithRequest } = server.plugins.elasticsearch.getCluster('data');
const body = getBody(field, query);
let fieldList = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This collects the list of values, not a list of fields, right?

const currentField = buckets[i].key;
if (typeof (currentField) === 'object') {
// its an array
fieldList = fieldList.concat(currentField);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had no idea that bucket keys could be arrays.

Also, you could merge the handling of array/non-array bucket keys and avoid going from Array -> Set -> Array if you just use one from the beginning:

const response = await callWithRequest(req, 'search', { index, body });

const values = new Set();
for (const bucket of response.aggregations.aggResults) {
  for (const value of [].concat(bucket.key)) {
    values.add(value)
  }
}
reply(Array.from(values))

}
}
const uniqueValues = Array.from(new Set(fieldList));
reply(uniqueValues);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON array responses can have weird security implications, I suggest responding with an object that has a values property instead.

@@ -68,6 +69,10 @@ export function requestFetchParamsToBody(
index = indexList;
}


$rootScope.filterState = { query: body.query };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is called for each request on the dashboard, which means it gets the specific filters for each panel. We need to find a better way to communicate this information to the values filter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't want to introduce more reliance on rootScope in courier. I've been trying hard to de-angularize it. Is there perhaps another way we can achieve the same result? (I ask this without actually looking at the code beyond spying this one tidbit).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the intention behind the discussion in #17571 (comment)


const postParams = {
field: field.name,
query: $rootScope.filterState
Copy link
Contributor

@spalger spalger Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than writing the filter state to $rootScope, maybe we can use a temporary SearchSource to determine the current filters? @elastic/kibana-discovery? Might work something like this:

import chrome from 'ui/chrome';
import { FilterBarQueryFilterProvider } from 'ui/filter_bar/query_filter';

export function filterParamsValuesController($http, $scope, courier, Private, Promise) {
  const baseUrl = chrome.addBasePath('/api/kibana/values');
  const queryFilter = Private(FilterBarQueryFilterProvider);

  function getRootSearchSource(searchSource) {
    const parent = searchSource.getParent();
    return parent ? getRootSearchSource(parent) : searchSource;
  }

  this.getAllFieldValues = function (field) {
    field = $scope.field;

    if (!field || !field.aggregatable || field.type !== 'string') {
      return Promise.resolve([]);
    }

    const searchSource = new courier.SearchSource();
    return searchSource
      .set('index', field.indexPattern)
      .set('filter', queryFilter.getFilters())
      // this is necessary because apps like dashboard modify the "root search source" in
      // the courier to ensure that all search sources inherit from the dashboard saved
      // object. Since not every app does this we override this by inheriting from the true
      // root by walking up until we find a searchSource that has no parents.
      .inherits(getRootSearchSource(searchSource))
      .getSearchRequestBody()
      .then(({ query }) => (
        $http.post(`${baseUrl}/${field.indexPattern.title}`, {
          field: field.name,
          query
        })
      ))
      .then(response => {
        $scope.$emit('value_event', response.data);
      });
  };

}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, these are just ideas, not tested or sure it works well in all situations :)

@stacey-gammon do you know why we don't set the search source in apps like Discover to be the root search source?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no, but I don't even like that dashboard does it. It's created some strange bugs. See #17123 and #16641 - we are workin towards getting rid of the search source inheritance all together.
Instead we want dashboard to pass filters, time range and query explicitly down to it's embeddables.

I haven't looked really at any of this PR but I'd be worried that any reliance on searchSource inheritance will break in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably just pull in the queryFilter and timeFilter services, I'm mostly just worried about not doing things the same way they are done by the app itself that might cause the values retrieved by this filter to not match the values being rendered by the vis/discover/dashboard they are supposed to match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't had much time to look into this either, but just want to second Stacey here. We will remove search source inheritance rather sooner than later (hopefully within one of the next minor versions). I really think we shouldn't build anything new that relies on that anymore. And of course also second, not introducing any new Angular code here, since we are really hardly trying to get rid of Angular.


const baseUrl = chrome.addBasePath('/api/kibana/values');

export function filterParamsValuesController($http, $scope, $rootScope) {
Copy link
Contributor

@spalger spalger Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to pull in the Promise service here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we are trying to remove reliance on the angular promise service, and the filter editor should be moving to react eventually. I don't actually think we want to use the angular Promise class here. #13855

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is using a mix of angular promises from $http I think it should also return angular promises when it returns early. This is also an angular controller, so I think angular promises are appropriate.

return {
size: 0,
timeout: '60s',
aggs: {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more efficient way to get this values back? Terms aggregation are pretty expensive especially when you set up a very high aggregation size. In my use case, I am planning to have this number 50k.

@alepuccetti alepuccetti deleted the alepuccetti/select-all-values-6.2 branch April 16, 2018 10:42
@alepuccetti
Copy link
Author

I am going to wait for the change about de-angularization of this part before doing more changes

@timroes
Copy link
Contributor

timroes commented Apr 18, 2018

Thanks for your understanding and giving us the time to clean this up properly beforehand and many many thanks for your contribution ❤️

@alepuccetti
Copy link
Author

Also, I will rebase against master and update the PR.

@timroes better doing things once and the right way 😉

@alepuccetti
Copy link
Author

@timroes sorry I closed it by mistake can you reopen it?

@timroes
Copy link
Contributor

timroes commented Apr 18, 2018

Sorry, you deleted the branch, that's why GitHub doesn't allow reopening it. Just open a new one and refer to this PR number, so we still have the link to this discussion.

@alepuccetti
Copy link
Author

@timroes right, I was doing some cleanup. I cleaned up the wrong branch. I will open a new one against master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants