-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adjust ConstValue::Slice to work for arbitrary slice types #115870
Conversation
r? @davidtwco (rustbot has picked a reviewer for you, use r? to override) |
Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri This PR changes Stable MIR cc @oli-obk, @celinval, @spastorino The Miri subtree was changed cc @rust-lang/miri |
r? @oli-obk |
.iconst(fx.pointer_type, i64::try_from(end.checked_sub(start).unwrap()).unwrap()); | ||
let ptr = pointer_for_allocation(fx, alloc_id).get_addr(fx); | ||
// FIXME: the `try_from` here can actually fail, e.g. for very long ZST slices. | ||
let len = fx.bcx.ins().iconst(fx.pointer_type, i64::try_from(meta).unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bjorn3 how does one turn an arbitrary "target usize" into a clif value? Going via i64
could overflow...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this in bjorn3/rustc_codegen_cranelift@cb55ce1
I wonder if the size reduction gives us a measurable speedup. |
This comment has been minimized.
This comment has been minimized.
adjust constValue::Slice to work for arbitrary slice types valtrees have already been assuming that this works; this PR makes it a reality. Also further restrict `ConstValue::Slice` to what it is actually used for; this even shrinks `ConstValue` from 32 to 24 bytes which is a nice win. :) The alternative to this approach is to make `ConstValue::Slice` work really only for `&str`/`&[u8]` literals, and never return it in `op_to_const`. That would make `op_to_const` very clean. We could then even remove the `meta` field; the length would always be `data.inner().len()`. We could *almost* just use a `Symbol` instead of a `ConstAllocation`, but we have to support byte strings and there doesn't seem to be an interned representation of them (or rather, `ConstAllocation` *is* their interned representation). In this world, valtrees of slice reference types would then become noticeably more expensive to turn into a `ConstValue` -- but does that matter? Specifically for `&str`/`&[u8]` we could still use the optimized representation if we wanted. If byte strings were already interned somewhere I'd gravitate towards the alternative, but the way things stand, we need a `ConstAllocation` case anyway to support byte strings, and then we might as well support arbitrary slices. (Or we say that byte strings don't get an optimized representation at all. Such a performance cliff between `str` and byte strings is probably unexpected, though due to the lack of interning for byte strings I think there might already be a performance cliff there.)
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (06e5097): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 633.484s -> 630.971s (-0.40%) |
data: ConstAllocation<'tcx>, | ||
/// The metadata field of the reference. | ||
/// This is a "target usize", so we use `u64` as in the interpreter. | ||
meta: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we even need this anymore? It can be computed from the ConstAllocation
's size and the element type from the constant's type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variant is only interesting for MIR building in check builds. For codegen builds we'll end-up creating the AllocId anyway.
Meawhile, we still have an asymmetry between the variants of ConstValue
and the variants we have in interpreter and codegen (Uninit, Scalar, ScalarPair, Indirect).
What about:
- making this a variant in
mir::ConstantKind
instead? aConstantKind::ByteSlice(&'tcx [u8])
? - introducing a
ScalarPair
variant inConstValue
? Example in WIP Introduce ConstValue::ScalarPair #115915 for the latter.
The "initial" representation by MIR building would be a ConstantKind::ByteSlice(&[u8])
, and would get normalized to a ScalarPair
for/before codegen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be computed from the ConstAllocation's size and the element type from the constant's type
Not for ZST slices...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "initial" representation by MIR building would be a ConstantKind::ByteSlice(&[u8]), and would get normalized to a ScalarPair for/before codegen.
I would expect that to have the same perf impact as normalizing to a ScalarPair during MIR building (or even worse if it's done post-mono). I tried that (by putting an AllocId
into ConstValue::Slice
), it didn't end well. So I think this is not going to be fast enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meawhile, we still have an asymmetry between the variants of ConstValue and the variants we have in interpreter and codegen (Uninit, Scalar, ScalarPair, Indirect).
Yeah that is totally deliberate. We used to have ConstValue::ScalarPair
and moved away from it. I think moving back towards ScalarPair would be a mistake. We have no reason to permit an optimized representation of &dyn Trait
or (i8, bool)
or whatever, as fully evaluated constants.
You are pre-supposing that we want to have symmetry between ConstValue
and OpTy
, but that's not the case; those types serve very different roles with different trade-offs.
I think long-term we want to remove ConstValue
entirely. See #115877.
Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri This PR changes Stable MIR cc @oli-obk, @celinval, @spastorino Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri The Miri subtree was changed cc @rust-lang/miri Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 |
This comment has been minimized.
This comment has been minimized.
6e21222
to
995233f
Compare
☔ The latest upstream changes (presumably #115865) made this pull request unmergeable. Please resolve the merge conflicts. |
995233f
to
ea22adb
Compare
r=me modulo the cranelift question |
The cranelift question is pre-existing though, the old code already did |
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (9da3e81): comparison URL. Overall result: ✅ improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 632.239s -> 633.382s (0.18%) |
valtrees have already been assuming that this works; this PR makes it a reality. Also further restrict
ConstValue::Slice
to what it is actually used for; this even shrinksConstValue
from 32 to 24 bytes which is a nice win. :)The alternative to this approach is to make
ConstValue::Slice
work really only for&str
/&[u8]
literals, and never return it inop_to_const
. That would makeop_to_const
very clean. We could then even remove themeta
field; the length would always bedata.inner().len()
. We could almost just use aSymbol
instead of aConstAllocation
, but we have to support byte strings and there doesn't seem to be an interned representation of them (or rather,ConstAllocation
is their interned representation). In this world, valtrees of slice reference types would then become noticeably more expensive to turn into aConstValue
-- but does that matter? Specifically for&str
/&[u8]
we could still use the optimized representation if we wanted.If byte strings were already interned somewhere I'd gravitate towards the alternative, but the way things stand, we need a
ConstAllocation
case anyway to support byte strings, and then we might as well support arbitrary slices. (Or we say that byte strings don't get an optimized representation at all. Such a performance cliff betweenstr
and byte strings is probably unexpected, though due to the lack of interning for byte strings I think there might already be a performance cliff there.)