Optimizations to nearest neighbor #35
Change this from O(P::DIMENSIONS^2) multiplications to O(P::DIMENSIONS). Note that the results are not bit-for-bit identical to the previous ones because of floating-point rounding.
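For readers unfamiliar with the trick, here is a minimal sketch of how a min-max squared distance can go from quadratic to linear work in the number of dimensions. It is not the code from this PR: the function names, the use of plain `[f64; D]` arrays instead of the `Point` trait, and the helper `near_far_sq` are all assumptions made for illustration. The idea is to sum the squared distances to the farther face once, then swap in the nearer face one axis at a time.

```rust
/// Hypothetical helper: per-axis squared distance from `p` to the nearer
/// and to the farther face of the box `[lower, upper]`.
fn near_far_sq<const D: usize>(
    lower: &[f64; D],
    upper: &[f64; D],
    p: &[f64; D],
) -> ([f64; D], [f64; D]) {
    let mut near = [0.0; D];
    let mut far = [0.0; D];
    for i in 0..D {
        let a = (p[i] - lower[i]).powi(2);
        let b = (p[i] - upper[i]).powi(2);
        near[i] = a.min(b);
        far[i] = a.max(b);
    }
    (near, far)
}

/// Quadratic variant: for every axis `k`, re-sum all D terms.
fn min_max_dist_2_quadratic<const D: usize>(
    lower: &[f64; D],
    upper: &[f64; D],
    p: &[f64; D],
) -> f64 {
    let (near, far) = near_far_sq(lower, upper, p);
    let mut best = f64::INFINITY;
    for k in 0..D {
        let mut sum = 0.0;
        for i in 0..D {
            sum += if i == k { near[i] } else { far[i] };
        }
        best = best.min(sum);
    }
    best
}

/// Linear variant: sum the "far" terms once, then replace one term per axis.
/// The subtraction changes the rounding, so results can differ in the last
/// bits from the quadratic variant.
fn min_max_dist_2_linear<const D: usize>(
    lower: &[f64; D],
    upper: &[f64; D],
    p: &[f64; D],
) -> f64 {
    let (near, far) = near_far_sq(lower, upper, p);
    let total_far: f64 = far.iter().sum();
    let mut best = f64::INFINITY;
    for k in 0..D {
        best = best.min(total_far - far[k] + near[k]);
    }
    best
}
```

For the box `[0, 0, 0]` to `[1, 2, 3]` and the query point `[0, 0, 0]`, both variants return `5.0`. For inputs that are not exactly representable, the two can disagree in the least significant bits.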
Use a stack-based binary heap by default that only allocates on the heap once its stack capacity has been exhausted. This yields up to a 1.3x speedup for repeated lookups in large trees.
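A rough sketch of the "inline first, spill later" idea follows. It is not the heap implemented in this PR: the type name `SmallHeap`, the `Copy + Default` bounds (used here only to avoid `unsafe`), and the fallback to `std::collections::BinaryHeap` are assumptions chosen to keep the example short. It is a max-heap, like the standard library's.

```rust
use std::collections::BinaryHeap;

/// Hypothetical max-heap that keeps up to `N` items in an inline array and
/// moves everything into a heap-allocated `BinaryHeap` only on overflow.
pub struct SmallHeap<T, const N: usize> {
    inline: [T; N],
    len: usize, // number of live items in `inline`
    spilled: Option<BinaryHeap<T>>,
}

impl<T: Ord + Copy + Default, const N: usize> SmallHeap<T, N> {
    pub fn new() -> Self {
        SmallHeap { inline: [T::default(); N], len: 0, spilled: None }
    }

    pub fn is_spilled(&self) -> bool {
        self.spilled.is_some()
    }

    pub fn push(&mut self, value: T) {
        if let Some(heap) = &mut self.spilled {
            heap.push(value);
            return;
        }
        if self.len < N {
            // Insert at the end and sift up to restore the heap property.
            let mut i = self.len;
            self.inline[i] = value;
            self.len += 1;
            while i > 0 {
                let parent = (i - 1) / 2;
                if self.inline[i] <= self.inline[parent] {
                    break;
                }
                self.inline.swap(i, parent);
                i = parent;
            }
        } else {
            // Inline storage is full: first (and only) heap allocation.
            let mut heap = BinaryHeap::from(self.inline[..self.len].to_vec());
            heap.push(value);
            self.len = 0;
            self.spilled = Some(heap);
        }
    }

    pub fn pop(&mut self) -> Option<T> {
        if let Some(heap) = &mut self.spilled {
            return heap.pop();
        }
        if self.len == 0 {
            return None;
        }
        // Move the maximum to the end, shrink, then sift the new root down.
        self.len -= 1;
        self.inline.swap(0, self.len);
        let top = self.inline[self.len];
        let mut i = 0;
        loop {
            let (left, right) = (2 * i + 1, 2 * i + 2);
            let mut largest = i;
            if left < self.len && self.inline[left] > self.inline[largest] {
                largest = left;
            }
            if right < self.len && self.inline[right] > self.inline[largest] {
                largest = right;
            }
            if largest == i {
                break;
            }
            self.inline.swap(i, largest);
            i = largest;
        }
        Some(top)
    }
}
```

With `SmallHeap<i32, 4>`, the first four pushes stay in the inline array and the fifth triggers the spill. Queries whose candidate set stays small never touch the allocator, which is where a speedup for repeated lookups would be expected to come from.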
Codecov Report
@@ Coverage Diff @@
## master #35 +/- ##
==========================================
- Coverage 98.22% 97.57% -0.65%
==========================================
Files 18 17 -1
Lines 1237 1239 +2
==========================================
- Hits 1215 1209 -6
- Misses 22 30 +8
Thanks a lot for your work on this! I like all the changes; the performance benefits speak for themselves. I wish there were a "SmallHeap" package so that the heap would not have to be implemented "on the fly", but we'll do what we have to do! Thanks a lot for upstreaming this!
PR georust#35 changed the structure of floating point operations in this, which leads to precision issues and inconsistency with distance_2. This fixes the structure so that it is identical with results of the previous implementation of min_max_dist_2.
PR georust#35 changed the structure of floating point operations in this, which leads to precision issues and inconsistency with distance_2. This makes the order of floating point operations in min_max_dist_2 identical to length_2, in particular by adding squares in the same dimension order. This resolves discrepancies in distances caused by the non-associativity of floating point operations.
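To illustrate why the dimension order matters, here is a small sketch showing that summing the same squares in a different order can give a different `f64` result. The function `sum_squares_in_order` is hypothetical and exists only for this demonstration; it is not part of the library.

```rust
/// Hypothetical helper: sum the squares of `v`, visiting components in the
/// order given by `order`. Mathematically the order is irrelevant, but each
/// floating point addition rounds, so different orders can round differently.
fn sum_squares_in_order(v: &[f64], order: &[usize]) -> f64 {
    let mut sum = 0.0;
    for &i in order {
        sum += v[i] * v[i];
    }
    sum
}
```

With `v = [1e8, 1.0, 1.0]`, the order `[0, 1, 2]` adds `1.0` to `1e16` twice, and each addition is rounded away, while the order `[2, 1, 0]` first forms `2.0`, which survives when `1e16` is added. Two distance functions that sum in different orders can therefore disagree on which of two candidates is nearer, even though both are correct to within rounding.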
40: Aabb: fix min_max_dist_2 consistency with distance_2 r=urschrei a=aschampion PR #35 changed the structure of floating point operations in this, which leads to precision issues and inconsistency with distance_2. This fixes the structure so that it is identical with results of the previous implementation of min_max_dist_2. Performance difference is negligible. These floating point differences compounded to significant changes in our data, so this is worth making consistent. This is also necessary for correctness of optimizations in a following PR. Co-authored-by: Andrew Champion <[email protected]>
Thanks for the great library! A few optimizations for our use case of querying hundreds of thousands of nodes in large (10-100K node) trees: