-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Add feature importance summary charts #78238
Conversation
Pinging @elastic/ml-ui (:ml) |
…error handling with getToastNotificationService.
@elasticmachine merge upstream |
(model) => model.metadata?.analytics_config?.id === jobId | ||
); | ||
if ( | ||
Array.isArray(inferenceModel?.metadata?.total_feature_importance) === true && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - I don't think you really need the === true
after the isArray
check as the isArray
method is pretty clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reasons TS needs the explicit check here to make the second part of the if statement valid. Not sure why but I will leave it for now if that's okay.
Great addition! Code LGTM overall - just want to give it a good local test. Will do first thing but feel free to merge if you get the other green checks before then. |
...lytics_exploration/components/total_feature_importane_summary/feature_importance_summary.tsx
Outdated
Show resolved
Hide resolved
...lytics_exploration/components/total_feature_importane_summary/feature_importance_summary.tsx
Outdated
Show resolved
Hide resolved
...lytics_exploration/components/total_feature_importane_summary/feature_importance_summary.tsx
Outdated
Show resolved
Hide resolved
...lytics_exploration/components/total_feature_importane_summary/feature_importance_summary.tsx
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just left a couple of extra comments.
...ytics_exploration/components/total_feature_importance_summary/feature_importance_summary.tsx
Outdated
Show resolved
Hide resolved
...ytics_exploration/components/total_feature_importance_summary/feature_importance_summary.tsx
Show resolved
Hide resolved
<span> | ||
<FormattedMessage | ||
id="xpack.ml.dataframe.analytics.exploration.featureImportanceSummaryTitle" | ||
defaultMessage="Feature importance summary" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change this heading to Overall feature importance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here 6a3c0d8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's "total_feature_importance" in the API should it match as "Total feature importance" in the title here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ance-summary # Conflicts: # x-pack/plugins/ml/public/application/data_frame_analytics/pages/analytics_exploration/components/classification_exploration/classification_exploration.tsx # x-pack/plugins/ml/public/application/data_frame_analytics/pages/analytics_exploration/components/exploration_page_wrapper/exploration_page_wrapper.tsx # x-pack/plugins/ml/public/application/data_frame_analytics/pages/analytics_exploration/components/regression_exploration/regression_exploration.tsx
Great work overall! I noticed one thing with the multi-class chart: I seems that one isn't sorted properly on the overall value of all classes, you can also see it in the screenshot of this PRs description. |
summary: ClassificationTotalFeatureImportance | RegressionTotalFeatureImportance | ||
): summary is RegressionTotalFeatureImportance { | ||
return (summary as RegressionTotalFeatureImportance).importance !== undefined; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have feature_importance.ts
in the same directory, can we move all additions to this file to feature_importances.ts
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here e18bc41
analysisType === ANALYSIS_CONFIG_TYPE.CLASSIFICATION || | ||
analysisType === ANALYSIS_CONFIG_TYPE.REGRESSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In plugins/ml/common/util/analytics_utils.ts
we have isClassificationAnalysis
and isRegressionAnalysis
which we could use here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated it to using the util functions here e18bc41
</EuiFlexGroup> | ||
</div> | ||
<Chart | ||
className="story-chart" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this className
was copied from the examples and is not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching that. Updated here e18bc41
You're right @walterra. I made some modifications here da48b1a to sort the class by the overall sum mean_magnitude. It should look like this now: I also noticed the bars aren't always stacked from largest to smallest from left to right. This seems to be a chart issue so I'm going to raise an issue with the team to look into that. Thanks again for catching that. |
Great update! I think the chart is fine now as is, the order of colors should always be the same for each stacked bar. |
} | ||
|
||
// sort from largest importance at top to smallest importance at bottom | ||
sortedData = sortedData.sort((a, b) => b.meanImportance - a.meanImportance); | ||
// sortedData = sortedData.sort((a, b) => b.meanImportance - a.meanImportance); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guess this can be cleaned up, remove the commented code and move the text comment above where the sorting is now happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I removed it here b73325b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest changes LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest changes LGTM ⚡
} | ||
|
||
// only show legend if it's a multiclass | ||
const _showLegend = classificationType === 'multiclass_classification'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For classification jobs where the numbers of classes is higher than the number of features, I am seeing some truncation of the legend labels. Can we fix this with styling on our side, or does it need a fix on the chart?
I also see this with a job on the mushroom data set with this config:
{
"id": "mushroom_cap_color_classification2",
"description": "10 top feature importance values",
"source": {
"index": [
"mushroom"
],
"query": {
"match_all": {}
}
},
"dest": {
"index": "mushroom_cap_color_classification2",
"results_field": "ml"
},
"analysis": {
"classification": {
"dependent_variable": "cap-color",
"num_top_feature_importance_values": 10,
"class_assignment_objective": "maximize_minimum_recall",
"num_top_classes": 2,
"prediction_field_name": "cap-color_prediction",
"training_percent": 8,
"randomize_seed": 2195820625972847400
}
},
"analyzed_fields": {
"includes": [
"bruises",
"cap-color",
"cap-shape",
"cap-surface",
"edibility",
"gill-attachment",
"gill-color",
"gill-size",
"gill-spacing",
"habitat",
"odor",
"population",
"ring-number",
"ring-type",
"spore-print-color",
"stalk-color-above-ring",
"stalk-color-below-ring",
"stalk-root",
"stalk-shape",
"stalk-surface-above-ring",
"stalk-surface-below-ring",
"veil-color",
"veil-type"
],
"excludes": []
},
"model_memory_limit": "41mb",
"create_time": 1601285932889,
"version": "8.0.0",
"allow_lazy_start": false,
"max_num_threads": 1
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated it to give some buffer spacing here 17cfd1d
Unfortunately it's not dynamic so if the label texts are short, there will be some space on the right and vice versa it will still be truncated if the texts are long. Let me know if you're good with the change! Or if there's anything we can improve as far the legend goes, we can do it in a follow up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…icer [ML] Update legend for multi class total feature importance to look nicer
@elasticmachine merge upstream |
💚 Build SucceededMetrics [docs]@kbn/optimizer bundle module count
@kbn/ui-shared-deps asset size
async chunks size
History
To update your PR or re-run it, just comment with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested latest changes and LGTM
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]> Co-authored-by: Elastic Machine <[email protected]>
Summary
Part of #76039. This PR adds the feature importance summary charts to the
Binary classification
Multi-class classification
Regression
Checklist
Delete any items that are not applicable to this PR.