-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid string validation in rustc_serialize, check a marker byte instead #91407
Conversation
59968fa
to
5bd72a0
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 5bd72a00734a32ba84c4e23330deed42eb5891fa with merge ae738380b413f7adb5c0254e2c1dc1430fb4e8dd... |
☀️ Try build successful - checks-actions |
Queued ae738380b413f7adb5c0254e2c1dc1430fb4e8dd with parent 207c80f, future comparison URL. |
Finished benchmarking commit (ae738380b413f7adb5c0254e2c1dc1430fb4e8dd): comparison url. Summary: This change led to large relevant mixed results 🤷 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never |
5bd72a0
to
20daf6d
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 20daf6d16f9431fde26c0587836867402aa71017 with merge cc72756a7ee0284c9fa1dd21416a46ecd8f46455... |
☀️ Try build successful - checks-actions |
Queued cc72756a7ee0284c9fa1dd21416a46ecd8f46455 with parent 772d51f, future comparison URL. |
Finished benchmarking commit (cc72756a7ee0284c9fa1dd21416a46ecd8f46455): comparison url. Summary: This change led to very large relevant mixed results 🤷 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never |
r? rust-lang/compiler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a great find -- thanks, @the8472!
I left a couple of comments below.
since the serialization format isn't self-describing we need a way to detect when encoder and decoder don't match up. but that doesn't have to be utf8 validation for strings, which does cost a few % of performance. Instead we can use a marker byte at the end to be reasonably sure that we're dealing with a string and it wasn't overwritten in some way.
20daf6d
to
c640f31
Compare
Very nice The perf run shows a single regression in @rustbot label: +perf-regression-triaged |
📌 Commit c640f31 has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (477fd70): comparison url. Summary: This change led to large relevant improvements 🎉 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression |
rust-lang#91407 changed the serialization format which leads to ICEs for nightly users such as rust-lang#91663 and linked issue. Bumping the metadata version should lead to the cached files being discarded instead.
…Mark-Simulacrum Bump rmeta version to fix rustc_serialize ICE rust-lang#91407 changed the serialization format which leads to ICEs for nightly users such as rust-lang#91663 and linked issues. The issue can be solved by running `cargo clean`. But bumping the metadata version should lead to the cached files being discarded, avoiding the issue entirely.
Since the serialization format isn't self-describing we need a way to detect when encoder and decoder don't match up. But for strings it doesn't have to be utf8 validation, which currently does cost a few percent of performance.
Instead we can use a marker byte at the end to be reasonably sure that we're dealing with a string and it wasn't overwritten in some way.