Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adjust ConstValue::Slice to work for arbitrary slice types #115870

Merged
merged 1 commit into from
Sep 20, 2023

Conversation

RalfJung
Copy link
Member

valtrees have already been assuming that this works; this PR makes it a reality. Also further restrict ConstValue::Slice to what it is actually used for; this even shrinks ConstValue from 32 to 24 bytes which is a nice win. :)

The alternative to this approach is to make ConstValue::Slice work really only for &str/&[u8] literals, and never return it in op_to_const. That would make op_to_const very clean. We could then even remove the meta field; the length would always be data.inner().len(). We could almost just use a Symbol instead of a ConstAllocation, but we have to support byte strings and there doesn't seem to be an interned representation of them (or rather, ConstAllocation is their interned representation). In this world, valtrees of slice reference types would then become noticeably more expensive to turn into a ConstValue -- but does that matter? Specifically for &str/&[u8] we could still use the optimized representation if we wanted.

If byte strings were already interned somewhere I'd gravitate towards the alternative, but the way things stand, we need a ConstAllocation case anyway to support byte strings, and then we might as well support arbitrary slices. (Or we say that byte strings don't get an optimized representation at all. Such a performance cliff between str and byte strings is probably unexpected, though due to the lack of interning for byte strings I think there might already be a performance cliff there.)

@rustbot
Copy link
Collaborator

rustbot commented Sep 15, 2023

r? @davidtwco

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 15, 2023
@rustbot
Copy link
Collaborator

rustbot commented Sep 15, 2023

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

This PR changes Stable MIR

cc @oli-obk, @celinval, @spastorino

The Miri subtree was changed

cc @rust-lang/miri

@RalfJung
Copy link
Member Author

r? @oli-obk

@rustbot rustbot assigned oli-obk and unassigned davidtwco Sep 15, 2023
.iconst(fx.pointer_type, i64::try_from(end.checked_sub(start).unwrap()).unwrap());
let ptr = pointer_for_allocation(fx, alloc_id).get_addr(fx);
// FIXME: the `try_from` here can actually fail, e.g. for very long ZST slices.
let len = fx.bcx.ins().iconst(fx.pointer_type, i64::try_from(meta).unwrap());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bjorn3 how does one turn an arbitrary "target usize" into a clif value? Going via i64 could overflow...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung
Copy link
Member Author

I wonder if the size reduction gives us a measurable speedup.
@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 15, 2023
@bors
Copy link
Contributor

bors commented Sep 15, 2023

⌛ Trying commit e989ec5 with merge 06e5097...

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 15, 2023
adjust constValue::Slice to work for arbitrary slice types

valtrees have already been assuming that this works; this PR makes it a reality. Also further restrict `ConstValue::Slice` to what it is actually used for; this even shrinks `ConstValue` from 32 to 24 bytes which is a nice win. :)

The alternative to this approach is to make `ConstValue::Slice` work really only for `&str`/`&[u8]` literals, and never return it in `op_to_const`. That would make `op_to_const` very clean. We could then even remove the `meta` field; the length would always be `data.inner().len()`. We could *almost* just use a `Symbol` instead of a `ConstAllocation`, but we have to support byte strings and there doesn't seem to be an interned representation of them (or rather, `ConstAllocation` *is* their interned representation). In this world, valtrees of slice reference types would then become noticeably more expensive to turn into a `ConstValue` -- but does that matter? Specifically for `&str`/`&[u8]` we could still use the optimized representation if we wanted.

If byte strings were already interned somewhere I'd gravitate towards the alternative, but the way things stand, we need a `ConstAllocation` case anyway to support byte strings, and then we might as well support arbitrary slices. (Or we say that byte strings don't get an optimized representation at all. Such a performance cliff between `str` and byte strings is probably unexpected, though due to the lack of interning for byte strings I think there might already be a performance cliff there.)
@RalfJung RalfJung changed the title adjust constValue::Slice to work for arbitrary slice types adjust ConstValue::Slice to work for arbitrary slice types Sep 15, 2023
@bors
Copy link
Contributor

bors commented Sep 15, 2023

☀️ Try build successful - checks-actions
Build commit: 06e5097 (06e50979de2aa96e49ce7185493e9dd6314ba866)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (06e5097): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.0% [-1.0%, -1.0%] 2
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.0% [-2.0%, -2.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -2.0% [-2.0%, -2.0%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.0% [-0.0%, -0.0%] 17
Improvements ✅
(secondary)
-0.0% [-0.0%, -0.0%] 6
All ❌✅ (primary) -0.0% [-0.0%, -0.0%] 17

Bootstrap: 633.484s -> 630.971s (-0.40%)
Artifact size: 318.18 MiB -> 318.13 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 15, 2023
data: ConstAllocation<'tcx>,
/// The metadata field of the reference.
/// This is a "target usize", so we use `u64` as in the interpreter.
meta: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we even need this anymore? It can be computed from the ConstAllocation's size and the element type from the constant's type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variant is only interesting for MIR building in check builds. For codegen builds we'll end-up creating the AllocId anyway.
Meawhile, we still have an asymmetry between the variants of ConstValue and the variants we have in interpreter and codegen (Uninit, Scalar, ScalarPair, Indirect).

What about:

The "initial" representation by MIR building would be a ConstantKind::ByteSlice(&[u8]), and would get normalized to a ScalarPair for/before codegen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be computed from the ConstAllocation's size and the element type from the constant's type

Not for ZST slices...

Copy link
Member Author

@RalfJung RalfJung Sep 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "initial" representation by MIR building would be a ConstantKind::ByteSlice(&[u8]), and would get normalized to a ScalarPair for/before codegen.

I would expect that to have the same perf impact as normalizing to a ScalarPair during MIR building (or even worse if it's done post-mono). I tried that (by putting an AllocId into ConstValue::Slice), it didn't end well. So I think this is not going to be fast enough.

Copy link
Member Author

@RalfJung RalfJung Sep 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meawhile, we still have an asymmetry between the variants of ConstValue and the variants we have in interpreter and codegen (Uninit, Scalar, ScalarPair, Indirect).

Yeah that is totally deliberate. We used to have ConstValue::ScalarPair and moved away from it. I think moving back towards ScalarPair would be a mistake. We have no reason to permit an optimized representation of &dyn Trait or (i8, bool) or whatever, as fully evaluated constants.

You are pre-supposing that we want to have symmetry between ConstValue and OpTy, but that's not the case; those types serve very different roles with different trade-offs.

I think long-term we want to remove ConstValue entirely. See #115877.

@rustbot
Copy link
Collaborator

rustbot commented Sep 18, 2023

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

This PR changes Stable MIR

cc @oli-obk, @celinval, @spastorino

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

The Miri subtree was changed

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Sep 19, 2023

☔ The latest upstream changes (presumably #115865) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 20, 2023

r=me modulo the cranelift question

@RalfJung
Copy link
Member Author

The cranelift question is pre-existing though, the old code already did i64::try_from(end_minus_start).unwrap(). So this PR doesn't make things any worse, and I think we can leave the FIXME for @bjorn3 to resolve one day.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 20, 2023

@bors r+

@bors
Copy link
Contributor

bors commented Sep 20, 2023

📌 Commit ea22adb has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 20, 2023
@bors
Copy link
Contributor

bors commented Sep 20, 2023

⌛ Testing commit ea22adb with merge 9da3e81...

@bors
Copy link
Contributor

bors commented Sep 20, 2023

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 9da3e81 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 20, 2023
@bors bors merged commit 9da3e81 into rust-lang:master Sep 20, 2023
@rustbot rustbot added this to the 1.74.0 milestone Sep 20, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (9da3e81): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.4% [-0.5%, -0.2%] 4
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.6% [2.6%, 2.6%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-5.1% [-5.1%, -5.1%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.2% [-5.1%, 2.6%] 2

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.0% [-0.0%, -0.0%] 18
Improvements ✅
(secondary)
-0.0% [-0.0%, -0.0%] 6
All ❌✅ (primary) -0.0% [-0.0%, -0.0%] 18

Bootstrap: 632.239s -> 633.382s (0.18%)
Artifact size: 317.84 MiB -> 317.82 MiB (-0.01%)

@RalfJung RalfJung deleted the const-value-slice branch September 21, 2023 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants