Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unsafe: ScopedKey allows for Sync-ification of non-Sync data #25894

Closed
Manishearth opened this issue May 30, 2015 · 15 comments · Fixed by #25952
Closed

unsafe: ScopedKey allows for Sync-ification of non-Sync data #25894

Manishearth opened this issue May 30, 2015 · 15 comments · Fixed by #25952
Labels
P-medium Medium priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Manishearth
Copy link
Member

There is an unsafe impl<T> Sync for KeyInner<T> { } in the scoped TLS module, which allows for sharing of data that wasn't meant to be shared.

An unqualified unsafe impl<T> Sync for Foo { } should only be used when Foo provides some sort of threadsafe locking mechanism. KeyInner and ScopedKey do not.

This lets us, for example, clone an Rc or RefCell between threads, in safe code:

scoped_thread_local!(static RC: Rc<u8>);
fn main() {
 RC.set(&Rc::new(1), ||{
     let y = &RC;
     let guard = scoped(|| {
        println!("Started 1");
        y.with(|slot|{
            for i in 1..10 {
                println!("Cloned in 1");
                slot.clone();
                sleep_ms(10);
            }
        });
        println!("Ended 1");
     });
     println!("Started 0");
     y.with(|slot| {
            for i in 1..10 {
                println!("Cloned in 0");
                slot.clone();
                sleep_ms(10);
            }
     });
     println!("Ended 0");
     guard.join();
 });
}

playpen

Here, we have Rc clones from different threads being interleaved, which can cause unsafety if there's a race on the refcount. We could also cause unsafety with a RefCell, or any other non-Sync type. ScopedKey basically lets us share &T across threads even if T: !Sync. I put the sleeps in there because somehow without them the routines are run serially.

Note that this is not a problem with the scoped() API. I'm using scoped() here because to exploit &T: Send, T:!Sync one needs to be able to share across threads, and Arc has an additional Send bound which ScopedKey doesn't satisfy. Any usable scoped API will probably have to make the assumption that &T: Send iff T:Sync. Nor is this a problem with Rc, since it can be done with RefCell, too.

Solution: Make it unsafe impl<T: Sync> Sync for KeyInner<T> { }

... I think. I'm not familiar with this API, and I'm not sure why it's even Sync in the first place.

@Manishearth
Copy link
Member Author

@Gankra Gankra added I-nominated P-high High priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 30, 2015
Manishearth added a commit to Manishearth/rust that referenced this issue May 30, 2015
Manishearth added a commit to Manishearth/rust that referenced this issue May 30, 2015
@huonw
Copy link
Member

huonw commented May 30, 2015

cc @alexcrichton

This seems concerning: this is designed to be thread-local, so that e.g. storing a Cell works (which would be safe, if it were guaranteed to stay on one thread). However static ...: T currently requires T: Sync unconditionally, hence this overly-accepting impl.

@Manishearth
Copy link
Member Author

Qualifying the impl with T: Sync breaks scoped TLS for Cell (see the associated PR); because ScopedKey becomes Sync no longer and thus can't be stuffed inside a static.

@eddyb
Copy link
Member

eddyb commented May 30, 2015

I believe the only way #[thread_local] statics can be safe is by requiring the absence of Sync.
That may sound odd, but consider the following:

  • T: Sync => &T: Send
  • sending thread-local references across threads is memory unsafe even with no interior mutability
  • a #[thread_local] static with no interior mutability is almost entirely useless, certainly not a common case
  • non-Sync containers offering interior mutabilities are cheaper and preferred to Sync ones, where they can be used

The main issue is actually within the libstd TLS implementation: LocalKey is a wrapper for getters so it's unaffected by it, but ScopedKey may live in a #[thread_local] static or a global one, depending on whether the platform natively supports TLS or not.

Send-ing a &'static ScopedKey across threads is actually safe on the architectures which do not have native TLS support.
However, allowing it on some architectures and not on others is undesirable.

Should ScopedKey use an accessor model like LocalKey?
Then the actual borrow of a #[thread_local] static is an implementation detail and happens on the thread where the key is used, not where it was itself borrowed.

@pythonesque
Copy link
Contributor

"I believe the only way #[thread_local] statics can be safe is by requiring the absence of Sync."

This is wrong, I think. For example, one might use thread local data for intrusive linked lists of hazard pointers. They would be thread local but might still be Sync (one can guarantee existence with scoped threads or in certain circumstances where threads are bound to cores). That said you can obviously come up with other representations. In any case, I've mostly come around to believing that we need a separate thread_local lifetime (not just for this, but for things like ensuring that catch_panic can't catch an error in a thread that changes thread local data, violating exception safety).

@Kimundi
Copy link
Member

Kimundi commented May 30, 2015

Based on what @eddyb wrote, it seems like the TLS interface requires a static that only allows non-Sync data and is still accessible like one.

One idea I had of how to enable this in todays Rust is to make scoped_thread_local!() generate a unit like struct instead of a static. That struct would be !Sync and have a Deref impl:

// Today:
RC.set(...); // autoref of static "RC" lvalue because:

scoped_thread_local!(static RC: Rc<u8>);
// = expands to =>
static RC: ScopedKey<Rc<u8>> = ...;

// With this change:
RC.set(...); // autoref of local temporay "RC" rvalue because

scoped_thread_local!(static RC: Rc<u8>);
// = expands to =>
struct RC;
impl Deref<Target=Rc<u8>> for RC { 
    static mut ... // can have a !Sync in a static mut
    ... 
}
impl !Sync for RC {}

According to @eddyb, this could also help with LocalKey, but I'm not familiar enough with the TLS API in general to really reason about this.

@eddyb
Copy link
Member

eddyb commented May 30, 2015

@Kimundi I was expecting the Deref impl to target an intermediary type, not the contents themselves.
ScopedKey and LocalKey have their own APIs to deal with.

@pythonesque You would still be about to use the TLS API provided by libstd with thread-safe data, it's just the low-level #[thread_local] which would have the requirement.

Ideas of thread-local lifetimes predate the removal of 'static from Send, and I'm not sure how relevant they are anymore.
I'm also conflating thread-safe data access and thread lifetime... but how long does a thread live?
What would &'thread_local T unify with?

Actually, there's an... easy solution for this. Somehow we were blinded by language semantics, but their details are fluid.
It is trivial to make borrows of #[thread_local] behave like borrows of an arbitrary rvalue, allowing the resulting reference to be passed down, but not returned.

That would ensure that the the thread is just as alive for such borrows as it is for stack ones.
ScopedKey would still require a facade like LocalKey has, because it may not be Sync when native TLS is supported.
But both their APIs are compatible with such a change.

@pythonesque
Copy link
Contributor

Wouldn't that limit the usability of thread locals in some cases, though? e.g., wouldn't that make it difficult to have a thread local point to another thread local? Maybe I just have to think about it.

(BTW, I'm aware we're just talking about the low-level API :)).

@eddyb
Copy link
Member

eddyb commented May 30, 2015

@pythonesque in a static, borrows are &'static, I didn't even think of that.
First off, a global static shouldn't be allowed to borrow a #[thread_local] static.
And in general, having borrows in a #[thread_local] is going to be difficult to support, except for non-Sync data, because they would be 'static right now.

We could disallow Sync borrows or we could give some kind of scope to those borrows.

Which makes me think, as a compromise between a thread_local lifetime and 'static, there could be a thread_local pseudo-(rvalue)-scope.
Combined with the ability to leave out 'static in globals' type (or not having a type at all), the scope should bubble up into users of those statics.

They would behave as if they were borrowed before everything else in the function where they're used, but still not 'static (and they can't be returned from functions).

@alexcrichton
Copy link
Member

There's a few intertwined issues here at play, so I just want to quickly summarize the way I see things:

I'm not sure I follow why we might require the types in a #[thread_local] are not Sync, I haven't quite followed that. Due the cross-platform requirements of std, however, I believe the best solution here is to use the same strategy as LocalKey with an indirect accessor to ensure the semantics are the same across platforms that have #[thread_local] and those that don't (where a normal static is used).

@alexcrichton
Copy link
Member

triage: P-medium

This is an unstable API so it's not super-high priority to fix, but we should defnitely fix!

@rust-highfive rust-highfive added P-medium Medium priority and removed P-high High priority I-nominated labels Jun 1, 2015
@eddyb
Copy link
Member

eddyb commented Jun 1, 2015

@alexcrichton requiring that a #[thread_local] is not Sync is an easy way to ensure that a borrow of it, while 'static, cannot escape the current thread.

I've changed my mind and prefer an rvalue borrow semantic for #[thread_local], but it still is the case that LocalKey and ScopedKey use non-Sync #[thread_local] statics (ignoring the workarounds for #18001).

If #[thread_local] was supported everywhere, having rvalue borrow semantics wouldn't hurt the std TLS API (aside from stability issues with LocalKey), as the access patterns are already through anonymous lifetimes (references passed to closures).

Maybe a combined strategy, where borrows of #[thread_local] statics are 'static iff the data is not Sync (and rvalue-like otherwise), might work better and would require fewer changes to the std TLS implementation.

@alexcrichton
Copy link
Member

@alexcrichton requiring that a #[thread_local] is not Sync is an easy way to ensure that a borrow of it, while 'static, cannot escape the current thread.

Right, but returning a borrow with a 'static lifetime for a #[thread_local] global is unsafe for other reasons (the lifetime of the thread is not 'static).

@eddyb
Copy link
Member

eddyb commented Jun 1, 2015

@alexcrichton how can it be abused without sending it to a different thread?

@alexcrichton
Copy link
Member

how can it be abused without sending it to a different thread?

The actual values themselves do not have a 'static lifetime, so it can be a source of unsafety to advertise otherwise. For example you could stick a &'static T into a TLS value with a destructor, but when the destructor runs that pointer could be invalid.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Jun 1, 2015
Currently the compiler has no knowledge of `#[thread_local]` which forces users
to take on two burdens of unsafety:

* The lifetime of the borrow of a `#[thread_local]` static is **not** `'static`
* Types in `static`s are required to be `Sync`

The thread-local modules mostly curb these facets of unsafety by only allowing
very limited scopes of borrows as well as allowing all types to be stored in a
thread-local key (regardless of whether they are `Sync`) through an `unsafe
impl`.

Unfortunately these measures have the consequence of being able to take the
address of the key itself and send it to another thread, allowing the same key
to be accessed from two different threads. This is clearly unsafe, and this
commit fixes this problem with the same trick used by `LocalKey`, which is to
have an indirect function call to find the address of the *current thread's*
thread local. This way the address of thread local keys can safely be sent among
threads as their lifetime truly is `'static`.

This commit will reduce the performance of cross-crate scoped thread locals as
it now requires an indirect function call, but this can likely be overcome in a
future commit.

Closes rust-lang#25894
alexcrichton added a commit to alexcrichton/rust that referenced this issue Jun 1, 2015
Currently the compiler has no knowledge of `#[thread_local]` which forces users
to take on two burdens of unsafety:

* The lifetime of the borrow of a `#[thread_local]` static is **not** `'static`
* Types in `static`s are required to be `Sync`

The thread-local modules mostly curb these facets of unsafety by only allowing
very limited scopes of borrows as well as allowing all types to be stored in a
thread-local key (regardless of whether they are `Sync`) through an `unsafe
impl`.

Unfortunately these measures have the consequence of being able to take the
address of the key itself and send it to another thread, allowing the same key
to be accessed from two different threads. This is clearly unsafe, and this
commit fixes this problem with the same trick used by `LocalKey`, which is to
have an indirect function call to find the address of the *current thread's*
thread local. This way the address of thread local keys can safely be sent among
threads as their lifetime truly is `'static`.

This commit will reduce the performance of cross-crate scoped thread locals as
it now requires an indirect function call, but this can likely be overcome in a
future commit.

Closes rust-lang#25894
bors added a commit that referenced this issue Jun 16, 2015
Currently the compiler has no knowledge of `#[thread_local]` which forces users
to take on two burdens of unsafety:

* The lifetime of the borrow of a `#[thread_local]` static is **not** `'static`
* Types in `static`s are required to be `Sync`

The thread-local modules mostly curb these facets of unsafety by only allowing
very limited scopes of borrows as well as allowing all types to be stored in a
thread-local key (regardless of whether they are `Sync`) through an `unsafe
impl`.

Unfortunately these measures have the consequence of being able to take the
address of the key itself and send it to another thread, allowing the same key
to be accessed from two different threads. This is clearly unsafe, and this
commit fixes this problem with the same trick used by `LocalKey`, which is to
have an indirect function call to find the address of the *current thread's*
thread local. This way the address of thread local keys can safely be sent among
threads as their lifetime truly is `'static`.

This commit will reduce the performance of cross-crate scoped thread locals as
it now requires an indirect function call, but this can likely be overcome in a
future commit.

Closes #25894
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P-medium Medium priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants