Add Student's t-test aggregation support #53692
Comments
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
Just to throw out another option, we could also have At first I was against having
Agreed, I think this would make adoption and usage a lot more difficult, even though it's technically possible (and probably less code to maintain). Relying on the user to set it up correctly sounds fragile and error-prone. I'd be ++ on a metric, and maybe later we can determine whether we want to add a pipeline equivalent (e.g. it could be useful for comparing bucket values rather than raw docs).
After experimenting with the parser a little and talking to @polyfractal, we have decided to modify the syntax a bit and start with a paired t-test implementation. The request will look like this:
The unpaired t-test will be implemented in a follow-up PR and will look like this:
Adds t_test metric aggregation that can perform paired and unpaired two-sample t-tests. In this PR support for filters in unpaired is still missing. It will be added in a follow-up PR. Relates to elastic#53692
Update version in the t-test agg usage stats serialization after backport to 7.8.0 Relates to elastic#53692
Adds support for filters to T-Test aggregation. The filters can be used to select populations based on some criteria and use values from the same or different fields. Closes elastic#53692
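For illustration only, here is a rough sketch of what request bodies for such an aggregation could look like, written as Python dictionaries. The `a`/`b`/`type` layout and the per-population `filter` reflect my understanding of how the `t_test` aggregation was eventually documented, not necessarily the exact syntax discussed in this thread. The helper `t_test_request`, the aggregation name, and the field names (`startup_time_before`, `startup_time_after`, `group`) are hypothetical.

```python
import json


def t_test_request(agg_name, a, b, test_type):
    """Build a search body holding a single t_test metric aggregation.

    Hypothetical helper: the a/b/type layout is an assumption about the
    aggregation's request syntax, not something confirmed in this thread.
    """
    return {
        "size": 0,  # only the aggregation result is wanted, not the hits
        "aggs": {agg_name: {"t_test": {"a": a, "b": b, "type": test_type}}},
    }


# Paired test: two different fields of the same documents.
paired = t_test_request(
    "startup_time_ttest",
    {"field": "startup_time_before"},
    {"field": "startup_time_after"},
    "paired",
)

# Unpaired test: the same field, with populations selected by filters.
unpaired = t_test_request(
    "startup_time_ttest",
    {"field": "startup_time_before", "filter": {"term": {"group": "A"}}},
    {"field": "startup_time_before", "filter": {"term": {"group": "B"}}},
    "heteroscedastic",
)

print(json.dumps(unpaired, indent=2))
```

The filters may also select populations from different fields; only the `field` entries would change.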
I would like to discuss adding a multivalued metrics aggregation that will apply unpaired and paired two-sample t-tests to two samples selected based on filters or fields or a combination of both.
So, unpaired t-test might look like this:
The paired t-test might look something like this:
We can also add support for scripts.
The type of the test can be specified by the user, with defaults based on the presence or absence of filters. We can support a `type` parameter that can be specified as `paired` (default, and only supported if filters are not present), `homoscedastic` (equal variance), or `heteroscedastic` (unequal variance, default if filters are present).
The output will be a typical metrics aggregation with t and p values.
Alternatively, we can implement this as a pipeline aggregation. That would simplify the implementation, but it might make usage a bit more difficult and could complicate Kibana adoption. We can also consider implementing it as both a pipeline and a metric aggregation, similar to stats.
cc: @jtibshirani, @polyfractal