-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fork tdigest library #95903
Comments
Pinging @elastic/es-analytics-geo (Team:Analytics) |
Pinging @elastic/es-delivery (Team:Delivery) |
@martijnvg @wchaparro what's the priority on this. Is the upgrade to 3.3 blocking something that is targeting a particular Elasticsearch release? |
@mark-vieira We're not blocked by other items. The forking has a high priority for us and we like to get this in with the 8.9 release. We plan to fork it as a library under the top-level The integration with big arrays is currently not a blocker. We do like to adopt big arrays in the forked library, but that will be done at a later point of time and has a lower priority than the forking itself. |
The asserts were misfiring in this test: ``` REPRODUCE WITH: ./gradlew ':server:test' --tests "org.elasticsearch.search.aggregations.metrics.InternalMedianAbsoluteDeviationTests.testReduceRandom" -Dtests.seed=AA1D81AD056870F0 -Dtests.locale=en-CA -Dtests.timezone=US/Eastern -Druntime.java=20 org.elasticsearch.search.aggregations.metrics.InternalMedianAbsoluteDeviationTests > testReduceRandom FAILED java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([AA1D81AD056870F0:6A202DBC2335E9A6]:0) at org.elasticsearch.tdigest.MergingDigest.merge(MergingDigest.java:316) at org.elasticsearch.tdigest.MergingDigest.mergeNewValues(MergingDigest.java:298) at org.elasticsearch.tdigest.MergingDigest.mergeNewValues(MergingDigest.java:288) at org.elasticsearch.tdigest.MergingDigest.quantile(MergingDigest.java:485) at org.elasticsearch.tdigest.HybridDigest.quantile(HybridDigest.java:141) at org.elasticsearch.search.aggregations.metrics.TDigestState.quantile(TDigestState.java:247) at org.elasticsearch.search.aggregations.metrics.InternalMedianAbsoluteDeviation.computeMedianAbsoluteDeviation(InternalMedianAbsoluteDeviation.java:38) at org.elasticsearch.search.aggregations.metrics.InternalMedianAbsoluteDeviation.<init>(InternalMedianAbsoluteDeviation.java:48) at org.elasticsearch.search.aggregations.metrics.InternalMedianAbsoluteDeviationTests.createTestInstance(InternalMedianAbsoluteDeviationTests.java:33) at org.elasticsearch.search.aggregations.metrics.InternalMedianAbsoluteDeviationTests.createTestInstance(InternalMedianAbsoluteDeviationTests.java:23) ``` They should have been removed earlier, it's possible for the first and the last centroid to have weight > 1. Related to #95903
This was replaced by optional serialization in version V_8_500_018 (elastic#96943), breaking backwards compatibility for serverless. Related to elastic#95903
Setting `experiment_hint` to `high_accuracy` leads to using a slower but more accurate implementation for TDigest. Add YAML tests to cover for settings this param properly and for parsing errors. Related to #95903
This is now complete. |
We plan to for the tdigest library.
There are two main reasons behind this choice:
The immediate goal of this issue is to fork the library and then at a later stage enhance to forked library to make use of the big arrays infrastructure.
Currently t-digest version 3.2 is used. The current version is 3.3 We have been locked to the 3.2 version because of at least one breaking change (how p50 is computed). The plan is to fork from the latest commit and change the forked library such that the results it produces are similar to that of version 3.2.
The text was updated successfully, but these errors were encountered: