Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Cloud Security] Removed errors thrown at resources table #152944

Merged

Conversation

JordanSh
Copy link
Contributor

@JordanSh JordanSh commented Mar 8, 2023

Resolves #151016

Summary

Removed error throwing in case of missing buckets. Now returns undefined instead.
undefined values are being handled on the FE side.
Additionally, I removed the filter option from the belongs_to column, this field is generated on runtime and is not mapped on the data view. causing filtering to be a bit more complex. we can add that into 8.8 perhaps, I don't think it was a demand for 8.7 and I rather we not push it as a feature to this version.

@JordanSh JordanSh added release_note:skip Skip the PR/issue when compiling release notes Team:Cloud Security Cloud Security team related v8.7.0 labels Mar 8, 2023
@JordanSh JordanSh self-assigned this Mar 8, 2023
@JordanSh JordanSh marked this pull request as ready for review March 8, 2023 17:54
@JordanSh JordanSh requested a review from a team as a code owner March 8, 2023 17:54
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-cloud-security-posture (Team:Cloud Security)

@JordanSh JordanSh requested a review from opauloh March 8, 2023 17:55
@Omolola-Akinleye Omolola-Akinleye self-requested a review March 8, 2023 20:27
@@ -160,8 +194,10 @@ export const useFindingsByResource = (options: UseFindingsByResourceOptions) =>
)
throw new Error('expected buckets to be an array');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also remove the throw new Error from lines 195 and 189 as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question, and this is a big subject overall. can we ever throw errors?
So I think some fields we can manage without errors, the fields that we no longer throw errors over are meta-data fields, which in the worst case scenario, won't be displayed in the table and instead, we will just have an empty cell.

But the errors you mentioned are for the aggregations over the actual resources. This group by resource feature, and the table in general, cannot work or display anything without aggregated resource ids. so if this process failed because we don't resource ids aggs, we don't have anything to display.
Personally, I think that in this case, it's better to throw an error than display a completely empty table for example. both are "useless" but the errors at least give us and users the option to know what's going on.

Please let me know if you think otherwise, or have a different experience regarding this issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, throwing an error on the client side due to something missing in an object property can be a bad user experience. When users encounter an error message, they may feel frustrated or confused, and this can negatively impact their experience with the application.

In this particular code, the errors thrown indicate that it expected buckets to be an array, and this is not due to any user action or input.

In the video, we have here the two approaches we're discussing, on the first table view, we are handling missing data as empty inputs, and when grouping by resource we are throwing an error:

Screen.Recording.2023-03-09.at.3.55.54.PM.mov

For this case, we ended up concluding that Cloudbeat should not generate a finding pointing to an empty resource id, and that was fixed, but my point here is that it can always be argued that these errors should be handled in a more graceful manner that does not disrupt the user's experience.

For example, if the view throwing an error is meant to be a group by resource.id table, it shouldn't work when there's no resource.id, that means we could prevent the error state by adding a filter to require resource.id: exists from the ES query in the first place.

"This group by resource feature, and the table in general, cannot work or display anything without aggregated resource ids."

As you can see in the video, some of the resources were missing, and some were not, but by throwing an error, we broke the ability for the user to check other findings that weren't in the error state, in my opinion, we should avoid throwing errors that break the entire page when there might be a potential for showing partial data, I believe showing partial data can be considered a more graceful manner of handling errors that do not negatively impact the user's experience with the application.

But I would like to open that discussion to know what are other opinions about that - cc @tinnytintin10 / @kfirpeled / @tehilashn

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should avoid throwing errors that break the entire page when there might be a potential for showing partial data

I totally agree with that approach
Is that still the case here after this PR fix? From what I saw, in the latest changes, it seems that only when one of aggregations, aggregations.resources & aggregations.count are missing, we will throw that error. However, is it even possible? If so, how do you suggest to fix that? because I'm guessing we don't have any results to present in this case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@opauloh is the example video you posted before or after this pr? this pr meant to fix this problem and indeed we do not throw errors when we can display at least some of the data after this fix.

before, we had some resource ids that were missing some data, and we threw instead of displaying what we can.
but you are suggesting not to throw even when we don't have any aggregations at all. This means none of the findings have a resource id. If that's the case I don't think we can display a resource table since we can't aggregate over any of them.

this is an extreme scenario, i think that we should throw an error to indicate something has indeed gone wrong. might even be more frustrating to the user if we didn't threw and he wouldn't understand why the grouping isn't working.

i don't think errors are some taboo we should never use. they exist for a reason and are used hundreds of times in kibana. we definitely want to be careful with them, and avoid them when we can.

do you think it's better to return an empty array here instead of throwing? let me know, i think we should prioritize merging this as early as we can for now, so we can iterate over it in the next BC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better when we can avoid the error state by adding a filter on the ES query that filters out the bad data that causes the error

As the example below, the UI throws the error, knowing that an empty resource.id is causing the error, I can filter out empty resource id and get the partial results:

Screen.Recording.2023-03-16.at.1.31.54.PM.mov

But our error message isn't descriptive enough for the user to know that there's bad data (expected buckets to be an array don't tell much), So the suggestion is to add that filter right in the getFindingsByResourceAggQuery function.

{
    "bool": {
        "filter": [
            {
                "bool": {
                    "filter": [
                        {
                            "exists": { "field": "resource.id" }
                        }
                    ],
                    "must_not": [
                        {
                            "match_phrase": { "resource.id": "" }
                        }
                    ]
                }
            }
        ]
    }
}

And we can also keep throwing the error, but it would be nicer if the message were more helpful for the user, like:

Expected buckets to be an array. This might be caused by invalid data in the Findings index

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@opauloh @JordanSh do we have a followup ticket to better handle edge cases like this in the future?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@opauloh @JordanSh is this still relevant?

Copy link
Contributor

@opauloh opauloh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled the branch locally and grouping by resource doesn't throw the error anymore 🚀

image

@JordanSh JordanSh enabled auto-merge (squash) March 16, 2023 23:42
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
cloudSecurityPosture 149.4KB 149.0KB -386.0B
Unknown metric groups

ESLint disabled line counts

id before after diff
securitySolution 433 436 +3

Total ESLint disabled count

id before after diff
securitySolution 513 516 +3

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @JordanSh

@JordanSh JordanSh merged commit 06c9924 into elastic:main Mar 17, 2023
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Mar 17, 2023
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.7

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Mar 17, 2023
) (#153252)

# Backport

This will backport the following commits from `main` to `8.7`:
- [[Cloud Security] Removed errors thrown at resources table
(#152944)](#152944)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT
[{"author":{"name":"Jordan","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-03-17T03:35:39Z","message":"[Cloud
Security] Removed errors thrown at resources table
(#152944)","sha":"06c9924a158b84096f87fffa498e4f44254c65d0","branchLabelMapping":{"^v8.8.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:Cloud
Security","v8.7.0","v8.8.0"],"number":152944,"url":"https://github.com/elastic/kibana/pull/152944","mergeCommit":{"message":"[Cloud
Security] Removed errors thrown at resources table
(#152944)","sha":"06c9924a158b84096f87fffa498e4f44254c65d0"}},"sourceBranch":"main","suggestedTargetBranches":["8.7"],"targetPullRequestStates":[{"branch":"8.7","label":"v8.7.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.8.0","labelRegex":"^v8.8.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/152944","number":152944,"mergeCommit":{"message":"[Cloud
Security] Removed errors thrown at resources table
(#152944)","sha":"06c9924a158b84096f87fffa498e4f44254c65d0"}}]}]
BACKPORT-->

Co-authored-by: Jordan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:skip Skip the PR/issue when compiling release notes Team:Cloud Security Cloud Security team related v8.7.0 v8.8.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Cloud Security] Group by findings table - handle missing buckets (prevent crash)
6 participants