Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display of integers without raw pointers #135265

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

pascaldekloe
Copy link

The benchmarks as is measure formatting speed of literals. The first commit black_box-es input to simulate runtime speed instead.

The second commit replaces unsafe pointer optimizations with plain array indices. The performance is equivalent on Apple M1. Needs peer review on Intel.

Happy to do the 128-bit version too if such change is welcome.

@rustbot
Copy link
Collaborator

rustbot commented Jan 8, 2025

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Noratrieb (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 8, 2025
@pascaldekloe pascaldekloe changed the title Fmt int speed Display of integers without unsafe pointer Jan 8, 2025
@tgross35
Copy link
Contributor

tgross35 commented Jan 8, 2025

r? tgross35 since I'm doing some work in this area anyway

You should cd src/etc/test-float-parse and run cargo +stage0 run --release, those tests are technically for parsing but do invoke the print routines quite a bit. I'll take a closer look at the changes soon.

@rustbot rustbot assigned tgross35 and unassigned Noratrieb Jan 8, 2025
@pascaldekloe
Copy link
Author

No problem if this one gets rejected in favour of anything else @tgross35. I was just pleased to see the compiler doing this well.

@tgross35
Copy link
Contributor

tgross35 commented Jan 8, 2025

You should cd src/etc/test-float-parse and run cargo +stage0 run --release, those tests are technically for parsing but do invoke the print routines quite a bit. I'll take a closer look at the changes soon.

Actually ignore this, I was thinking this was in flt2dec and not integers.

No problem if this one gets rejected in favour of anything else @tgross35. I was just pleased to see the compiler doing this well.

No conflict :) Replacing unsafe code is always nice if there are no drawbacks.

@tgross35
Copy link
Contributor

tgross35 commented Jan 8, 2025

It looks like there may be some small regressions here, the new code doesn't elide a couple calls to panic_bounds_check https://rust.godbolt.org/z/hjqhqsnbb. You could probably play around on godbolt a bit to figure out where this is coming from, a call to hint::assert_unchecked in the right place would likely make this disappear.

Soon it will be possible to write_copy_of_slice on the MaybeUninit slices, I don't expect this to do anything for optimization but it would allow keeping the "write two bytes at a time" feature from the original #129259.

@tgross35 tgross35 changed the title Display of integers without unsafe pointer Display of integers without raw pointer Jan 10, 2025
@tgross35 tgross35 changed the title Display of integers without raw pointer Display of integers without raw pointers Jan 10, 2025
@pascaldekloe
Copy link
Author

What a gem that Compiler Explorer is @tgross35. I think the assembly looks good now.

B.t.w., I tried bit shifting entries from the lookup table into a 64-bit register and write per 8 digits at once. The compiler "optimizes" such intermediate register away, and it goes for 16-bit copies instead. Maybe those small writes are actually faster. I don't know. 😂

@pascaldekloe
Copy link
Author

I found a way without compiler hinting which is even faster than the unsafe code as is. It requires buffers of an even byte-size. Needs a little more work for all integer sizes.

https://github.com/pascaldekloe/b10/blob/e7d9813c1f74dc9137216f1b271303fb056e90ad/src/lib.rs#L1723

@tgross35
Copy link
Contributor

What a gem that Compiler Explorer is @tgross35.

I am by no means a perf expert but I still can't imagine chasing down optimizations without it :)

I found a way without compiler hinting which is even faster than the unsafe code as is. It requires buffers of an even byte-size. Needs a little more work for all integer sizes.

https://github.com/pascaldekloe/b10/blob/e7d9813c1f74dc9137216f1b271303fb056e90ad/src/lib.rs#L1723

Are you planning to update the PR with something like this? At a surface level the patch seems pretty reasonable but I didn't look too deep yet, I'll hold off on reviewing if you have something faster coming.

@pascaldekloe
Copy link
Author

Yes, will give it a try @tgross35. Same surprise here as you're having. 🙂

@tgross35
Copy link
Contributor

@rustbot author

(just comment @rustbot review to update the labels when this is ready for a look)

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 13, 2025
@pascaldekloe
Copy link
Author

The speed gain is because (1) the loop decision no longer depends on the digit calculation, (2) it produces fewer instructions, and (3) it no longer needs a check for special case zero, which has a "significant" leading zero.

Unfortunately the results are less optimal for the two benchmark cases in the core library: either zero or maximum digits in a loop. Apparently the branch prediction benefit does not outweight the per-4-digit tweak in such scenario.

So now we have the explicit assertions as hint assertions too. Can you have a look @tgross35? @rustbot review

https://rust.godbolt.org/z/h97MacKcc

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 13, 2025
@rust-log-analyzer

This comment has been minimized.

@rustbot rustbot added has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 14, 2025
@rustbot

This comment has been minimized.

@rustbot rustbot removed has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 14, 2025
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer
Copy link
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
     Running tests/lib.rs (obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-26031fed2a160b90)

running 1761 tests

thread 'net::socket_addr::ipv4_socket_addr_to_string' panicked at library/core/src/panicking.rs:218:5:
unsafe precondition(s) violated: hint::assert_unchecked must never be called when the condition is false
thread caused non-unwinding panic. aborting.
error: test failed, to rerun pass `-p core --test coretests`
Caused by:
Caused by:
  process didn't exit successfully: `/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-26031fed2a160b90 --quiet -Z unstable-options --format json` (signal: 6, SIGABRT: process abort signal)
  local time: Tue Jan 21 17:26:34 UTC 2025
  network time: Tue, 21 Jan 2025 17:26:35 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

@pascaldekloe
Copy link
Author

Don't have Windows to figure out what's failing there. Leaving as is. 😐

@ChrisDenton
Copy link
Member

That doesn't look like a Windows error. It looks like it's detecting a bug in one of your uses of assert_unchecked.

thread 'net::socket_addr::ipv4_socket_addr_to_string' panicked at library/core/src/panicking.rs:218:5:
unsafe precondition(s) violated: hint::assert_unchecked must never be called when the condition is false

@bors
Copy link
Contributor

bors commented Jan 27, 2025

☔ The latest upstream changes (presumably #135937) made this pull request unmergeable. Please resolve the merge conflicts.

@pascaldekloe
Copy link
Author

That doesn't look like a Windows error. It looks like it's detecting a bug in one of your uses of assert_unchecked.

Understood. I did check all of the assertions and there's no way. Can't locally reproduce. Tests pass fine.

@tgross35
Copy link
Contributor

Maybe change assert_unchecked! to assert! temporarily? It may not be showing up locally due to a difference in config, and in any case assert! would at least give a better output message / backtrace.

// Four digits need a 16-bit $unsigned or wider.
#[allow(overflowing_literals)]
#[allow(unused_comparisons)]
while remain > 999 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks suspicious. You've allowed overflowing_literals so 999_u8 == 231.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants