Immovable types #1858
As I mentioned in #1853 (comment), this might be enough to have types that zero themselves on drop correctly, which should improve doing cryptography in Rust. In particular, this improves using stack variables in cryptography, so crates can more easily support Right now, you can zero a reference as done in https://crates.io/crates/clear_on_drop but you cannot zero anything that moves. I have some train-wreck branches of that crate that demonstrate this, like https://github.com/burdges/clear_on_drop/blob/owned_fail/src/owned.rs That said, there are reasons for moving things, so maybe wants a built-in trait that goes beyond just being a marker, like
All types implement I'd think both
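A minimal sketch of the zero-on-drop limitation described in the comment above. The `Secret` type and `peek_and_drop` function are hypothetical names for illustration and are not taken from `clear_on_drop`. The `Drop` impl zeroes the bytes at the value's final location only; any copy left behind by an earlier move is never seen by `Drop`.

```rust
use std::ptr;

// Hypothetical secret holder; zeroes its bytes when dropped.
struct Secret([u8; 32]);

impl Drop for Secret {
    fn drop(&mut self) {
        for byte in self.0.iter_mut() {
            // Volatile write so the zeroing is less likely to be optimised away.
            unsafe { ptr::write_volatile(byte, 0) };
        }
    }
}

// Taking `Secret` by value is a move: the bytes may be copied into this
// function's frame, and only that final copy is zeroed when `s` is dropped.
fn peek_and_drop(s: Secret) -> u8 {
    s.0[0]
}

fn main() {
    let s = Secret([0xAA; 32]);
    // After this call the caller's old stack slot may still hold the key
    // bytes, because a move does not run any user code on the source.
    let first = peek_and_drop(s);
    assert_eq!(first, 0xAA);
}
```

With an immovable type, the value would stay at one address after it is first borrowed, so zeroing that one location on drop would be sufficient.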
@burdges that looks like a move constructor. Not having move constructors is one of rust's selling points.
I see, so anything that zeros really seamlessly should be baked deeper into the compiler. There are maybe still parallel limitations in LLVM for both immovable types and self zeroing types though.
👍 This seems like it represents my original idea pretty well, thank you! The only thing I might want to add is that one way you might think about it is in terms of "observing the address" - which requires a borrow. Before that first borrow, nothing knows the address, therefore moves can't break third-party accesses. This also leads to self/existential-borrows as being an excellent way to encode these types (sadly it'd have to be phantom for types like mutexes where a library or the OS is observing the address). But the RFC in its current form is great for generators that borrow across a Well, you'd want cc @tomaka
One small potential problem is that code like this will not accept `fn foo<T: SomeTrait>(t: &T) {}` A lot of code like this exists in the wild, and I guess it will need to be changed for EDIT: To be fair the code above already doesn't accept
Would
You can still move unsized values with unsafe code so I wouldn't bet on being able to do that.
If accepted, this RFC resolves this two year old issue. Good discussion in there that is relevant to this RFC.
Making
text/0000-immovable-types.md
Outdated
[motivation]: #motivation

Interacting with C/C++ code often require data that cannot change its location in memory. To work around this we allocate such data on the heap. For example the standard library `Mutex` type allocates a platform specific mutex on the heap. This prevents the use of `Mutex` in global variables. If we add immovable types, we can have an alternative immovable mutex type `StaticMutex` which we could store in global variables. If the lifetime of the mutex is limited to a lexical scope, we could also have a `StaticMutex` in the stack frame and avoid the allocation.
As explained in #1858 (comment), focusing on mutexes might not be relevant anymore, and I'm not aware of other specific C APIs that need this sort of thing - usually you use an opaque pointer and don't even know the size of the allocation.
Parking lots require allocations. It is not unreasonable to have a mutex which does not allocate.
I may drop the note on interacting with C/C++. I haven't come up with any cases either.
`parking_lot` only allocates memory once at startup. Creating, destroying, locking and unlocking a mutex does not allocate memory.
It may be useful if for some reason you need to bind a C++ class type by value, as C++ objects generally don't expect to be silently moved. In particular, this applies if you want to subclass a C++ class from Rust (in which case the class has to be embedded at the beginning of the struct).
One example from boost is `offset_ptr`, which stores the pointer as an offset from `this`. The idea is to be able to put containers and other data structures in shared memory and have it work even if the memory gets mapped to a different address.
text/0000-immovable-types.md
Outdated
- Trait objects are not `Move` by default

A new marker struct `ImmovableData` is also introduced in `core::marker`. This struct does not implement `Move` and allows users to make composite immovable types. `PhantomData` should be extended to accept `?Move` types, but `PhantomData` itself should always implement `Move`.
The `PhantomData` rule might not be the best choice, although all use cases of `PhantomData` I can think of involve indirection, so they wouldn't hide any capability. I suppose if we need "`!Move` if `T: !Move`" we can do:
struct MoveIffMove<T: ?Sized+?Move>(ImmovableData, PhantomData<T>);
unsafe impl<T: ?Sized+Move> Move for MoveIffMove<T> {}
Can we have such implementation for built-in traits?
The reason for this rule is to allow `PhantomData<T>` in `Box<T: ?Move>` (which is needed for dropck), while keeping `Box` movable.
Sure, OIBITs have some precedent here. And if we can do it at all then `Box` could just do that.
@eddyb why is `Move` an `unsafe impl`?
@ubsan You could cause unsafety by implementing `Move` for a type that contains a `!Move` type. Moving the outer type would cause the inner type to be moved.
@cramertj oh, you're right. thanks :)
E.g.
struct MoveWhatever<T: ?Move>(T);
impl<T: ?Move> Move for MoveWhatever<T> {}
@eddyb We could put `PhantomData` inside `MobileCell` to ensure it won't cause `Box` to be immovable.
I don't think I entirely understand how this is supposed to work e.g. for StaticMutex... the RFC uses the example
@RalfJung I presume the mutex only gets initialised on first use, at which point the address is fixed. edit: That doesn't really work for a mutex though... Maybe you would need to explicitly initialise it via a method taking
For pthreads, we'd initialize it with `PTHREAD_MUTEX_INITIALIZER`, which is just plain data and doesn't need any code to run.
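The "plain data, no code to run" point can be sketched in Rust with a `const fn` constructor. This is only an illustration: the `StaticMutex` below is a hypothetical try-lock flag built on `AtomicBool`, not the RFC's type and not a pthread mutex. Because the initial state is a constant, the value can live in a `static`, whose address never changes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for an address-stable mutex: a single atomic flag.
pub struct StaticMutex {
    locked: AtomicBool,
}

impl StaticMutex {
    // `const fn` means the initial state is plain data computed at compile time,
    // analogous to a static initializer constant.
    pub const fn new() -> Self {
        StaticMutex { locked: AtomicBool::new(false) }
    }

    // Returns true if the lock was acquired.
    pub fn try_lock(&self) -> bool {
        self.locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

// A `static` has a fixed address for the whole program, so nothing can move it.
static LOCK: StaticMutex = StaticMutex::new();

fn main() {
    assert!(LOCK.try_lock());
    assert!(!LOCK.try_lock()); // already held
    LOCK.unlock();
    assert!(LOCK.try_lock());
    LOCK.unlock();
}
```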
So, why not make EDIT: Durr, I didn't notice the comment from 10 days ago saying the exact same thing.
I don't think that makes sense. Even if unsafe code can do something with references to specific unsized types that's logically equivalent to moving, that doesn't mean the unsized types themselves need to be
(Sorry to spam, but if I edit then it won't show up in email notifications...) If However, making traits support immovable types by default would be quite powerful. What if instead of having a separate That would be an abuse of terminology, but think about it. Depending on
@comex There is code that makes the assumption that all
Huh, I didn't know there was a generic way to get the size of values of unsized types. That definitely kills my idea then. (I'm not sure what you mean about
@comex You'd be able to take DSTs from one place, and put them onto the stack in another place.
The semantics of what types impl
@withoutboats An
With #1909 in the works, I'm pretty sure
text/0000-immovable-types.md
Outdated
Changing these associated types will be insta-stable. You would be unable to write stable code which would conflict with this proposal. `?Move` bounds would also show up in documentation, although we would be able to filter those out if desired.

# How We Teach This
[how-we-teach-this]: #how-we-teach-this
Documentation RFC pedantry hat on here. 🤓
This section could stand to be expanded a bit. Specifically, if accepted, this RFC would need to include updates:
- to the standard library where the types are implemented
- to the Reference, under 9. Special Traits
- (probably) to the book in the FFI section
I also think the hand-wave that "the concept is likely familiar to users of C, C++, and C FFIs" is insufficient. This is something which people writing FFIs for the first time will need a place to learn, and we can't assume that everyone writing an FFI has prior C or C++ experience. There are many, and an increasing number of, Rust users who are coming to Rust wanting to do FFI for e.g. Python or Ruby and who haven't done it before in other contexts.
That said, discussions of the immovability in the C, C++, and FFI contexts might be really helpful prior art—it could possibly be cited, and it should certainly be considered as useful background for whoever writes the docs if this is accepted!
To add to that, despite my experience with C++, I never really had a clear concept of "immovable type" until the need for them arose in the recent Rust coroutine discussions. The closest concept I think I had when doing just C++ was the rather muddy "things that can be invalidated if this other thing moves around, like iterators and their containers, so don't move that other thing until you're done with the things that can be invalidated".
I've read the RFC, and I do like the idea that a
I'm not clear how this would work - borrows of More generally, even if Just to throw out an alternative design, let's say moves fall into three categories - parameters, return values and assignment (not sure if there are more?). By forbidding all three (similar to (please let me know if anything above seems misinformed)
How does bindgen wrap constructors? If it provides at least the option to run them directly on some already-allocated memory then matching that would probably be the closest we can get.
As a name alternative to
That version of
extern "C++" {
    class NonMoveType;
    fn NonMoveType() -> NonMoveType;
}
fn example() -> Box<NonMoveType> {
    let n = NonMoveType();
    // this moves `n` into a box.
    box n
}
If you want things to work, you'll need to use
I updated the RFC to better align with my PR.
- The `Immovable` type is never `Move`
- Trait objects are `Move` if they have an explicit `Move` bound
- Struct, enums and tuples are `Move` if all their elements are `Move`
- Existential types (`impl Trait`) are `Move` if their underlying type are `Move`
I think that this needs to be explicit. If you have:
struct Bar;
impl Foo for Bar {}
fn baz() -> impl Foo {
Bar
}
And then change `baz()` to:
impl Foo for Immovable {}
fn baz() -> impl Foo {
Immovable
}
Code that assumes that `baz()` returns a movable value will break.
In order to return an existential + immovable struct, you should have to specify it as being `?Move`:
struct Bar;
impl Foo for Bar {}
impl Foo for Immovable {}
fn baz() -> impl Foo + ?Move {
Immovable
}
(I haven't used existential types, I don't know if `impl Foo + Baz` works).
The other solution would be to have all existential types be `?Move` by default, which wouldn't work.
Actually, this example/use case isn't straightforward because returning / passing an argument is a move. But it could interact with placement-new.
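The breakage concern above has an analogue in Rust as it exists today: auto traits such as `Send` leak through `impl Trait` return types. The sketch below uses `Send` as a stand-in, on the assumption that the proposed `Move` would behave like an auto trait in this respect; `requires_send` is a hypothetical helper.

```rust
trait Foo {
    fn id(&self) -> u32;
}

struct Bar;

impl Foo for Bar {
    fn id(&self) -> u32 {
        1
    }
}

// The signature only promises `Foo`, yet callers can still rely on `Send`,
// because auto traits of the hidden type leak through `impl Trait`.
fn baz() -> impl Foo {
    Bar
}

// Hypothetical helper that demands an auto trait not named in `baz`'s signature.
fn requires_send<T: Send>(t: T) -> T {
    t
}

fn main() {
    // Compiles only because the hidden type `Bar` happens to be `Send`.
    // If `baz` later returned a type holding an `Rc`, this call would stop
    // compiling even though `baz`'s signature did not change.
    let value = requires_send(baz());
    assert_eq!(value.id(), 1);
}
```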
A new marker struct `Immovable` is also introduced in `core::marker`. This struct does not implement `Move` and allows users to make composite immovable types.

You can freely move values which are known to implement `Move` after they are borrowed, however you cannot move types which aren't known to implement `Move` after they have been borrowed. Once we borrow an immovable type, we'd know its address and code should be able to rely on the address not changing. This is sound since the only way to observe the address of a value is to borrow it. Before the first borrow nothing can observe the address and the value can be moved around.
I'm not sure how useful this is - the main motivation for this RFC is self-referencing generators (or self-referencing structs in general, I assume), and this doesn't apply to them.
This is useful for interacting with libraries in other languages that store pointers to Rust values that would be invalidated by those values being moved.
@nbaraz This is the most important part of this RFC. This is the part which enables immovable types to be moved. This is very useful for generators. In particular you can return generators from functions, since returning is a move.
Any self-referential structure would need to be immediately considered frozen, as it borrows itself from the beginning of its own existence, so the "until it is borrowed" clause would come into effect immediately.
This section does concern me for its implications in `rental`, however. It's not at all clear to me what exact conditions a value is to be considered "observed". If a function accepts a `?Move` type by value, we could presume that it hasn't been observed yet, otherwise passing it would have been illegal in the first place, but what about immovable types behind a `Box` or `Arc`? Are they considered implicitly frozen when accepted as arguments or when returned from a function?
> Any self-referential structure would need to be immediately considered frozen, as it borrows itself from the beginning of its own existence, so the "until it is borrowed" clause would come into effect immediately.
This is not possible in Rust though (even with this RFC). You can only borrow values after they have been constructed.
> It's not at all clear to me what exact conditions a value is to be considered "observed".

One way to determine this is to ask if you can get a pointer to it in safe Rust. Since `Box` implements `Deref` we can call `Deref::deref` to obtain a pointer to the inner value, thus the inner value must be counted as observed once it's placed in a `Box`. It is also possible to have some other `MovableBox` type, which does not allow access to the inner value, except by moving it out again.
In general, we can allow immovable types in a movable container if we either disallow all methods of accessing the address of the contained immovable types or prevent the type from actually moving once it's inside.
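The second option above, a movable container that keeps its contents in place, is how `Box` already behaves. A small sketch (the function name `heap_addr_survives_move` is made up for illustration): moving the `Box` moves only the pointer, so the address observed through `Deref` stays valid.

```rust
// Returns true if the heap address of the boxed value is the same
// before and after the `Box` itself is moved.
fn heap_addr_survives_move() -> bool {
    let boxed = Box::new(5u64);
    // Observing the address requires a borrow, here through `Deref`.
    let before: *const u64 = &*boxed;

    // This moves the `Box` (a pointer), not the heap allocation it owns.
    let moved = boxed;
    let after: *const u64 = &*moved;

    std::ptr::eq(before, after)
}

fn main() {
    assert!(heap_addr_survives_move());
}
```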
> This is not possible in Rust though (even with this RFC). You can only borrow values after they have been constructed.

Right, I mean in a theoretical future where Rust can express a self-referential type, it would need to be aware that it's in an immediate state of borrowed-ness.

> Since `Box` implements `Deref` we can call `Deref::deref` to obtain a pointer to the inner value, thus the inner value must be counted as observed once it's placed in a `Box`.

Excellent, that's what I was hoping for. To be able to add support for immovables to rental, I need to ensure that anything I return is already considered observed and frozen, since Rust itself is unaware of the self-referentiality and might otherwise allow the type to move when it shouldn't.
Still poses some challenges with how to allow the user to select which container the value should be constructed into, but ATCs are likely the best solution for that.
Interacting with C/C++ code may require data that cannot change its location in memory. To work around this we allocate such data on the heap. For example the standard library `Mutex` type allocates a platform specific mutex on the heap. This prevents the use of `Mutex` in global variables. If we add immovable types, we can have an alternative immovable mutex type `StaticMutex` which we could store in global variables. If the lifetime of the mutex is limited to a lexical scope, we could also have a `StaticMutex` in the stack frame and avoid the allocation.

The key motivation for this proposal is to allow generators to have "stack frames" which do not move in memory. The ability to take references to local variables rely on those variable being static in memory. If a generator is moved, the local variables contained inside also move, which invalidates references to them. So references to local variables stored inside the generator cannot be allowed.
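The invalidation described above can be sketched without generators, assuming a hypothetical `Frame` struct that stores a raw pointer to one of its own fields (a stand-in for a generator holding a reference to its own local). The pointer is only compared, never dereferenced, so the example stays free of undefined behaviour.

```rust
// Hypothetical stand-in for a generator "stack frame" that points into itself.
struct Frame {
    local: u32,
    ptr_to_local: *const u32,
}

// True while the stored pointer still refers to this frame's own `local`.
fn pointer_is_valid(frame: &Frame) -> bool {
    std::ptr::eq(frame.ptr_to_local, &frame.local)
}

fn make_frame_on_heap() -> Box<Frame> {
    let mut frame = Frame { local: 7, ptr_to_local: std::ptr::null() };
    frame.ptr_to_local = &frame.local;
    // At its original location the self-pointer is correct.
    assert!(pointer_is_valid(&frame));
    // Moving the frame to the heap copies the bytes, including the old address.
    Box::new(frame)
}

fn main() {
    let moved = make_frame_on_heap();
    // The stored pointer still names the old stack slot, not the new location.
    assert!(!pointer_is_valid(&moved));
    assert_eq!(moved.local, 7);
}
```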
Another motivation could be types whose destructors must run - If it is on the stack and can't move from there, can safe code prevent its destructor from running?
This would also require a way to mark types as only place-able on the stack, but immovability is still a requirement.
You may still place your "unleakable" type inside a wrapper type which may not run destructors and cause it to leak.
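One existing wrapper of this kind is `std::mem::ManuallyDrop`, which lives entirely on the stack and still suppresses the destructor. A minimal sketch, with `Guard` and `count_drops` as hypothetical names:

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

// Hypothetical "must run its destructor" type; counts how often `drop` runs.
struct Guard<'a>(&'a Cell<u32>);

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns how many times the destructor ran for one guard.
fn count_drops(wrap_in_manually_drop: bool) -> u32 {
    let drops = Cell::new(0);
    if wrap_in_manually_drop {
        // The wrapper is a plain stack value, yet the guard's destructor is skipped.
        let _leaked = ManuallyDrop::new(Guard(&drops));
    } else {
        let _dropped = Guard(&drops);
    }
    drops.get()
}

fn main() {
    assert_eq!(count_drops(false), 1);
    assert_eq!(count_drops(true), 0);
}
```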
I explained it more clearly here.
The basic idea is adding a way to mark types as immovable + only place-able on the stack. This way they cannot be placed inside a wrapper to be leaked. If the wrapper is completely on the stack (no heap pointer), the unleakable type is not leaked (since `!Move` propagates from members to containing structs). Since it cannot be placed in a heap pointer, `Rc`s and such can't be used to leak them.
I don't know if such a marker can be created/enforced.
## Immovable types contained in movable types

To allow immovable types to be contained in movable types, we introduce a `core::cell::MovableCell` wrapper which itself implements `Move`. It works similarly to `Cell` in that it disallows references to the value inside.
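A rough sketch of what "disallows references to the value inside" could look like. The RFC text quoted here does not show `MovableCell`'s API, so the methods below (`new`, `replace`, `into_inner`) are assumptions made for illustration. The point is only that no method hands out `&T` or `&mut T`, so the inner value's address is never observed while it sits in the cell.

```rust
// Hypothetical API sketch; not the RFC's definition of `MovableCell`.
pub struct MovableCell<T> {
    value: T,
}

impl<T> MovableCell<T> {
    pub fn new(value: T) -> Self {
        MovableCell { value }
    }

    // Swaps the content by value; no reference to the inner value escapes.
    pub fn replace(&mut self, value: T) -> T {
        std::mem::replace(&mut self.value, value)
    }

    // The only way to reach the inner value is to move it back out.
    pub fn into_inner(self) -> T {
        self.value
    }
}

fn main() {
    let mut cell = MovableCell::new(1u32);
    assert_eq!(cell.replace(2), 1);

    // Moving the cell is fine: nothing could have recorded the inner address.
    let moved = cell;
    assert_eq!(moved.into_inner(), 2);
}
```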
I can't think of a scenario where this is actually safe - If a type assumes that it doesn't move in memory because something depends on its address, moving it will always break something.
I think that it makes sense for immovable types to only live on the stack or behind pointers, which are always `Move`, and for instances to be created with placement-new.
Maybe add an `ExplicitMove` trait, which is implemented for every struct whose members implement `ExplicitMove` (by default not implemented for immovable types)? I think that this can be addressed in another RFC.
- Struct, enums and tuples are `Move` if all their elements are `Move`
- Existential types (`impl Trait`) are `Move` if their underlying type are `Move`
- `[T]` and `[T; n]` are `Move` if `T` is `Move`
- `str` is `Move`
Should probably add "borrowed and mutable references are always `Move`" for completeness.
JoinGuard, anyone?
What is the status on this? I've seen there is an internals thread talking about it, but, from what I can tell, we aren't getting any closer to some solution here. I just ran into this issue, and while my problems could be mitigated by macros, it would have been great to have language support for this.
Given where things are headed in this space, I'm going to propose to postpone this RFC, in favor of library-based solutions for the time being. @rfcbot fcp postpone
Team member @aturon has proposed to postpone this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
I expect/hope to fully replace As I mentioned in my reddit comment, you can't use The only language extension I can think of is a new Then again, we could also just make the library types special to get all of these niceties!
Oh that’s too bad this gets postponed. It seems to be a very exciting idea! <3
@phaazon See #1858 (comment) - we may be able to come up with an improvement over
🔔 This is now entering its final comment period, as per the review above. 🔔
The final comment period is now complete.
Closing since FCP with a motion to postpone is now complete.
This RFC introduces a design for immovable types based on this idea by @eddyb.
Given that a large motivation of this RFC is to support immovable generators, I suggest that we avoid closing this until we either define a suitable design for immovable generators or find a better approach to immovable types.
Rendered