Standardized to usize #57

valarauca · 2024-01-24T21:18:49Z

Remove instances of i64 & isize when indexing, standardized on usize

I agree to follow the project's code of conduct.
I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

Per #56 (comment)

Attempting to reduce the scope of the PR.

This only addresses moving indexing variables to usize.
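
For readers unfamiliar with the motivation, here is a minimal sketch of the kind of change this describes. The function names and data are hypothetical and not taken from the crate; the point is only that Rust slices are indexed by `usize`, so a signed counter needs a cast at every use.

```rust
/// Sums the first `n` entries with a signed counter, as code ported from
/// C-style sources often does. Every index needs an `as usize` cast.
/// (Hypothetical helper, for illustration only.)
fn sum_signed(values: &[f64], n: i64) -> f64 {
    let mut total = 0.0;
    let mut i: i64 = 0;
    while i < n {
        total += values[i as usize];
        i += 1;
    }
    total
}

/// Same sum with a `usize` counter: the index type matches what slices
/// expect, so no casts are needed.
fn sum_unsigned(values: &[f64], n: usize) -> f64 {
    let mut total = 0.0;
    for i in 0..n {
        total += values[i];
    }
    total
}
```

Both functions return the same result for a non-negative `n` that is within bounds; the `usize` version simply removes the conversions.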


michaelkirk · 2024-01-25T01:09:54Z

Could you run cargo fmt and fix the new clippy errors?

valarauca · 2024-01-25T01:30:54Z

Squashing locally broke because of the upstream commits, idk how to fix that.

Clippy demanded I do a lot of things outside the scope of the original change.

michaelkirk · 2024-01-25T01:32:29Z

src/geomath.rs

    const COEFF: [f64; 18] = [
        -1.0, 6.0, -16.0, 32.0, -9.0, 64.0, -128.0, 2048.0, 9.0, -16.0, 768.0, 3.0, -5.0, 512.0,
        -7.0, 1280.0, -7.0, 2048.0,
    ];
    let eps2 = sq(eps);
    let mut d = eps;
    let mut o = 0;
-    for l in 1..=geodesic_order {
+    for (l, v) in c.iter_mut().enumerate().take(geodesic_order + 1).skip(1) {


I'm not sure we can really consider this line an improvement in aggregate, but who am I to argue with a paperclip? 😆
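
To make the two loop shapes in the diff easier to compare, here is a simplified sketch. The function names and the loop body (`l * 0.5`) are placeholders, not the real coefficient computation in `src/geomath.rs`, and the lint involved is assumed to be clippy's `needless_range_loop`.

```rust
/// Index-based form, as on the removed line: writes slots 1..=order and
/// panics if `c` has fewer than `order + 1` elements.
fn fill_indexed(c: &mut [f64], order: usize) {
    for l in 1..=order {
        c[l] = l as f64 * 0.5; // placeholder for the real computation
    }
}

/// Iterator form, as on the added line: visits the same slots when `c`
/// is long enough, but silently stops early if `c` is too short.
fn fill_iter(c: &mut [f64], order: usize) {
    for (l, v) in c.iter_mut().enumerate().take(order + 1).skip(1) {
        *v = l as f64 * 0.5; // same placeholder computation
    }
}
```

With a slice of at least `order + 1` elements the two forms write identical values. They differ only when the slice is too short: the indexed loop panics, while the iterator chain truncates.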

michaelkirk · 2024-01-25T01:38:22Z

Hmmm, I'm actually seeing a perf regression with this PR:

$ cargo bench --bench="*" --  --baseline=main-2024-01-24
direct (c wrapper)/default
                        time:   [23.918 µs 23.956 µs 24.004 µs]
                        change: [+0.2632% +0.4314% +0.5961%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

direct (rust impl)/default
                        time:   [30.104 µs 30.139 µs 30.180 µs]
                        change: [+0.2469% +0.4175% +0.5814%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe

inverse (c wrapper)/default
                        time:   [44.610 µs 44.664 µs 44.723 µs]
                        change: [-2.4267% -0.9378% +0.0711%] (p = 0.16 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  5 (5.00%) high mild
  1 (1.00%) high severe

inverse (rust impl)/default
                        time:   [78.354 µs 78.452 µs 78.564 µs]
                        change: [+5.8279% +6.0227% +6.2028%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

If I revert the clippy/format commit (877afcc) I see both an improvement and a regression:

$ cargo bench --bench="*" --  --baseline=main-2024-01-24

direct (c wrapper)/default
                        time:   [23.948 µs 24.010 µs 24.088 µs]
                        change: [+0.1347% +0.3540% +0.5856%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

direct (rust impl)/default
                        time:   [29.044 µs 29.085 µs 29.126 µs]
                        change: [-3.2355% -3.0412% -2.8588%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

inverse (c wrapper)/default
                        time:   [44.673 µs 44.745 µs 44.820 µs]
                        change: [-2.3927% -0.9049% +0.1063%] (p = 0.19 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe

inverse (rust impl)/default
                        time:   [77.100 µs 77.238 µs 77.404 µs]
                        change: [+4.1364% +4.3271% +4.5071%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  6 (6.00%) high mild

Any idea why?

valarauca · 2024-01-25T05:30:55Z

> If I revert the clippy/format commit (877afcc) I see both an improvement and a regression:
> Any idea why?

My best guess is the change is within the noise threshold. I commonly see +/-3µs changes without a code change locally (even with n=1000).

I've taken steps to try to reduce this noise (closing applications, changing CPU throttling/power options), but it remains a frustratingly common result.

michaelkirk · 2024-01-25T17:05:09Z

Noise is a real thing, but the consistency with which I can reproduce the relative behavior leads me to believe this is not only noise in the benchmark.

d8bbc65 - (HEAD) Standardized to usize (20 hours ago)

direct: -3% improvement vs main
inverse: +4% regression vs main

877afcc - (HEAD) Make cargo fmt & cargo clippy happy (16 hours ago)

no (or very small) change in direct vs main
+5% regression in inverse vs main

Are you able to reproduce this?

michaelkirk · 2024-01-25T17:32:21Z

The regressions from the clippy/fmt commit seem to be from changing the loop iteration.

e.g. this branch https://github.com/georust/geographiclib-rs/tree/mkirk/usize (as of 20a2b33) has the same perf characteristics as before your clippy changes.

That is:

direct: -3% improvement vs main
inverse: +4% regression vs main

michaelkirk · 2024-01-25T19:28:23Z

Thanks! I included this in #58 (but reverted part of the clippy changes).

valarauca · 2024-01-25T20:58:00Z

> Are you able to reproduce this?

Sort of...

Looking at the ASM output (link).

The main difference I see is that the normal index-based loop has a pretty simple structure:

BELOW_18 -> POLY_SUM (inner loop) -> BELOW_18 OR  RET

While the clippy version does

 IDK -> BELOW_18 -> POLY_SUM (inner loop) -> IDK | RETURN_OR_PANIC

This makes the clippy version extremely branch-heavy, as it seems to frequently branch off, check a bunch of stuff, and then return to the inner loop.
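
One possible middle ground, sketched here purely as an assumption and not something proposed or measured in this thread, is to slice the target range once and iterate over that. The function name and the placeholder body are hypothetical; whether this produces tighter assembly than the `enumerate().take().skip()` chain would need checking against the benchmarks above.

```rust
/// Slice-then-iterate form: the range check happens once, at the slice
/// expression, and panics if `c` has fewer than `order + 1` elements
/// (matching the indexed loop's behavior for `order >= 1`).
fn fill_sliced(c: &mut [f64], order: usize) {
    for (i, v) in c[1..=order].iter_mut().enumerate() {
        let l = i + 1; // recover the original 1-based index
        *v = l as f64 * 0.5; // placeholder for the real computation
    }
}
```

This keeps the iterator style that clippy prefers while leaving slot 0 untouched, as in the original loop.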

michaelkirk · 2024-01-26T00:50:11Z

Interesting! That might account for it.

valarauca and others added 2 commits January 24, 2024 13:16

Standardized to usize

d8bbc65

Remove instances of i64 & isize when indexing, standardized on usize

run ci on forks

5a28dc2

valarauca added 2 commits January 24, 2024 17:24

Make cargo fmt & cargo clippy happy

877afcc

Merge branch 'main' of https://github.com/valarauca/geographiclib-rs

94b8f1e

michaelkirk approved these changes Jan 25, 2024


michaelkirk mentioned this pull request Jan 25, 2024

Mkirk/usize #58

Merged

1 task

michaelkirk closed this Jan 25, 2024

michaelkirk mentioned this pull request Jan 25, 2024

favor type usize for variables that are primarily an array index #12

Closed
