docs: extend documentation on Sync and thread-safety (#4695)
* guide: extend documentation on `Sync` and thread-safety

* Update guide/src/class/thread-safety.md

Co-authored-by: Alex Gaynor <[email protected]>

* Apply suggestions from code review

Co-authored-by: Bruno Kolenbrander <[email protected]>
Co-authored-by: Nathan Goldbaum <[email protected]>

* threadsafe -> thread-safe

* datastructure -> data structure

* fill out missing sections

* remove dead paragraph

* fix guide build

---------

Co-authored-by: Alex Gaynor <[email protected]>
Co-authored-by: Bruno Kolenbrander <[email protected]>
Co-authored-by: Nathan Goldbaum <[email protected]>
4 people authored Nov 15, 2024
1 parent 71100db commit e7ec730
Showing 10 changed files with 283 additions and 153 deletions.
2 changes: 1 addition & 1 deletion guide/pyclass-parameters.md
@@ -22,7 +22,7 @@
| `str` | Implements `__str__` using the `Display` implementation of the underlying Rust datatype or by passing an optional format string `str="<format string>"`. *Note: The optional format string is only allowed for structs. `name` and `rename_all` are incompatible with the optional format string. Additional details can be found in the discussion on this [PR](https://github.com/PyO3/pyo3/pull/4233).* |
| `subclass` | Allows other Python classes and `#[pyclass]` to inherit from this class. Enums cannot be subclassed. |
| <span style="white-space: pre">`text_signature = "(arg1, arg2, ...)"`</span> | Sets the text signature for the Python class' `__new__` method. |
| `unsendable` | Required if your struct is not [`Send`][params-3]. Rather than using `unsendable`, consider implementing your struct in a threadsafe way by e.g. substituting [`Rc`][params-4] with [`Arc`][params-5]. By using `unsendable`, your class will panic when accessed by another thread. Also note the Python's GC is multi-threaded and while unsendable classes will not be traversed on foreign threads to avoid UB, this can lead to memory leaks. |
| `unsendable` | Required if your struct is not [`Send`][params-3]. Rather than using `unsendable`, consider implementing your struct in a thread-safe way by e.g. substituting [`Rc`][params-4] with [`Arc`][params-5]. By using `unsendable`, your class will panic when accessed by another thread. Also note that Python's GC is multi-threaded, and while unsendable classes will not be traversed on foreign threads to avoid UB, this can lead to memory leaks. |
| `weakref` | Allows this class to be [weakly referenceable][params-6]. |

All of these parameters can either be passed directly on the `#[pyclass(...)]` annotation, or as one or
1 change: 1 addition & 0 deletions guide/src/SUMMARY.md
@@ -15,6 +15,7 @@
- [Basic object customization](class/object.md)
- [Emulating numeric types](class/numeric.md)
- [Emulating callable objects](class/call.md)
- [Thread safety](class/thread-safety.md)
- [Calling Python from Rust](python-from-rust.md)
- [Python object types](types.md)
- [Python exceptions](exception.md)
10 changes: 7 additions & 3 deletions guide/src/class.md
@@ -73,7 +73,7 @@ The above example generates implementations for [`PyTypeInfo`] and [`PyClass`] f

### Restrictions

To integrate Rust types with Python, PyO3 needs to place some restrictions on the types which can be annotated with `#[pyclass]`. In particular, they must have no lifetime parameters, no generic parameters, and must implement `Send`. The reason for each of these is explained below.
To integrate Rust types with Python, PyO3 needs to place some restrictions on the types which can be annotated with `#[pyclass]`. In particular, they must have no lifetime parameters, no generic parameters, and must be thread-safe. The reason for each of these is explained below.

#### No lifetime parameters

@@ -119,9 +119,13 @@ create_interface!(IntClass, i64);
create_interface!(FloatClass, String);
```

#### Must be Send
#### Must be thread-safe

Because Python objects are freely shared between threads by the Python interpreter, there is no guarantee which thread will eventually drop the object. Therefore all types annotated with `#[pyclass]` must implement `Send` (unless annotated with [`#[pyclass(unsendable)]`](#customizing-the-class)).
Python objects are freely shared between threads by the Python interpreter. This means that:
- Python objects may be created and destroyed by different Python threads; therefore `#[pyclass]` objects must be `Send`.
- Python objects may be accessed by multiple Python threads simultaneously; therefore `#[pyclass]` objects must be `Sync`.

For now, don't worry about these requirements; simple classes will already be thread-safe. There is a [detailed discussion on thread-safety](./class/thread-safety.md) later in the guide.

## Constructor

108 changes: 108 additions & 0 deletions guide/src/class/thread-safety.md
@@ -0,0 +1,108 @@
# `#[pyclass]` thread safety

Python objects are freely shared between threads by the Python interpreter. This means that:
- there is no control over which thread might eventually drop the `#[pyclass]` object, meaning `Send` is required.
- multiple threads can potentially be reading the `#[pyclass]` data simultaneously, meaning `Sync` is required.

This section of the guide discusses various data structures which can be used to make types satisfy these requirements.
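
Whether a type meets the `Send` and `Sync` requirements above can be checked at compile time with plain Rust, independent of PyO3. The following is a minimal sketch using only the standard library; `assert_send_sync`, `Point`, and `SharedLog` are hypothetical names introduced for illustration.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical helper: a call only compiles if `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}

// A struct made of plain data is automatically `Send + Sync`.
#[allow(dead_code)]
struct Point {
    x: i32,
    y: i32,
}

// State shared through `Arc<Mutex<..>>` is also `Send + Sync`.
#[allow(dead_code)]
struct SharedLog {
    lines: Arc<Mutex<Vec<String>>>,
}

fn check_types() -> bool {
    assert_send_sync::<Point>();
    assert_send_sync::<SharedLog>();
    // The next line would fail to compile, because `Rc` is neither
    // `Send` nor `Sync`:
    // assert_send_sync::<std::rc::Rc<i32>>();
    true
}
```

Because the check happens during compilation, a type that stops being thread-safe after a refactor is caught before any Python code runs.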

In special cases where it is known that your Python application is never going to use threads (this is rare!), you can opt out of these thread-safety requirements with [`#[pyclass(unsendable)]`](../class.md#customizing-the-class), at the cost of any access to the Rust data from another thread becoming a runtime error. This is only for very specific use cases; it is almost always better to make proper thread-safe types.

## Making `#[pyclass]` types thread-safe

The general challenge with thread-safety is to make sure that two threads cannot produce a data race, i.e. unsynchronized accesses to the same data at the same time where at least one access is a write. A data race produces an unpredictable result and is forbidden by Rust.

By default, `#[pyclass]` employs an ["interior mutability" pattern](../class.md#bound-and-interior-mutability) to allow for either multiple `&T` references or a single exclusive `&mut T` reference to access the data. This allows for simple `#[pyclass]` types to be thread-safe automatically, at the cost of runtime checking for concurrent access. Errors will be raised if the usage overlaps.

For example, the below simple class is thread-safe:

```rust
# use pyo3::prelude::*;

#[pyclass]
struct MyClass {
    x: i32,
    y: i32,
}

#[pymethods]
impl MyClass {
    fn get_x(&self) -> i32 {
        self.x
    }

    fn set_y(&mut self, value: i32) {
        self.y = value;
    }
}
```

In the above example, if calls to `get_x` and `set_y` overlap (from two different threads) then at least one of those threads will experience a runtime error indicating that the data was "already borrowed".
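
The same borrow-tracking rule can be observed without Python by using the standard library's `RefCell`. This is only a single-threaded stand-in for illustration: it is an assumption that the analogy is close enough to be useful, and it is not how PyO3 implements its checks. `MyClassData` and `overlapping_borrows` are hypothetical names.

```rust
use std::cell::RefCell;

struct MyClassData {
    x: i32,
    y: i32,
}

// Returns (exclusive borrow succeeded while a shared borrow was active,
//          exclusive borrow succeeded after the shared borrow ended).
fn overlapping_borrows() -> (bool, bool) {
    let cell = RefCell::new(MyClassData { x: 1, y: 2 });

    // A shared borrow is active, like a running `get_x(&self)` call.
    let reader = cell.borrow();
    // An exclusive borrow is refused while the shared borrow is alive,
    // like an overlapping `set_y(&mut self, ..)` call.
    let write_while_reading = cell.try_borrow_mut().is_ok();
    let _x = reader.x;
    drop(reader);

    // Once the shared borrow has ended, the exclusive borrow succeeds.
    let write_after_reading = match cell.try_borrow_mut() {
        Ok(mut data) => {
            data.y = 10;
            data.y == 10
        }
        Err(_) => false,
    };

    (write_while_reading, write_after_reading)
}
```

Calling `overlapping_borrows()` returns `(false, true)`: the overlapping exclusive borrow is rejected, and the later one is accepted.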

To avoid these errors, you can take control of the interior mutability yourself in one of the following ways.

### Using atomic data structures

To remove the possibility of overlapping `&self` and `&mut self` references producing runtime errors, consider using `#[pyclass(frozen)]`, which only ever hands out `&self` references, together with [atomic data structures](https://doc.rust-lang.org/std/sync/atomic/) to control modifications directly.

For example, a thread-safe version of the above `MyClass` using atomic integers would be as follows:

```rust
# use pyo3::prelude::*;
use std::sync::atomic::{AtomicI32, Ordering};

#[pyclass(frozen)]
struct MyClass {
    x: AtomicI32,
    y: AtomicI32,
}

#[pymethods]
impl MyClass {
    fn get_x(&self) -> i32 {
        self.x.load(Ordering::Relaxed)
    }

    fn set_y(&self, value: i32) {
        self.y.store(value, Ordering::Relaxed)
    }
}
```
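
To see why atomics make shared mutation through `&self` safe, the sketch below exercises an atomic counter from several threads using only the standard library. `Counter` and `count_from_threads` are hypothetical names and are not part of PyO3; the struct stands in for the data inside a frozen `#[pyclass]`.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

// Hypothetical stand-in for the data of a frozen class.
struct Counter {
    hits: AtomicI32,
}

impl Counter {
    fn increment(&self) {
        // `fetch_add` is a single atomic read-modify-write, so no update
        // is lost even when many threads call it at once.
        self.hits.fetch_add(1, Ordering::Relaxed);
    }
}

fn count_from_threads(threads: i32, per_thread: i32) -> i32 {
    let counter = Counter { hits: AtomicI32::new(0) };

    // Scoped threads may borrow `counter`; they are all joined before
    // `thread::scope` returns.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    counter.increment();
                }
            });
        }
    });

    counter.hits.load(Ordering::Relaxed)
}
```

Note that a separate `load` followed by a `store` would not be atomic as a pair; read-modify-write operations such as `fetch_add` are needed when the new value depends on the old one.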

### Using locks

An alternative to atomic data structures is to use [locks](https://doc.rust-lang.org/std/sync/struct.Mutex.html) to make threads wait for access to shared data.

For example, a thread-safe version of the above `MyClass` using locks would be as follows:

```rust
# use pyo3::prelude::*;
use std::sync::Mutex;

struct MyClassInner {
    x: i32,
    y: i32,
}

#[pyclass(frozen)]
struct MyClass {
    inner: Mutex<MyClassInner>,
}

#[pymethods]
impl MyClass {
    fn get_x(&self) -> i32 {
        self.inner.lock().expect("lock not poisoned").x
    }

    fn set_y(&self, value: i32) {
        self.inner.lock().expect("lock not poisoned").y = value;
    }
}
```
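
One reason to prefer a lock over separate atomics is that a lock can keep several fields consistent with each other. The following standard-library sketch illustrates this under the assumption that `x` and `y` must always sum to the same total; `Pair`, `transfer`, and `transfer_from_threads` are hypothetical names introduced here.

```rust
use std::sync::Mutex;
use std::thread;

struct Inner {
    x: i32,
    y: i32,
}

// Hypothetical stand-in for a frozen class guarding its data with a lock.
struct Pair {
    inner: Mutex<Inner>,
}

impl Pair {
    // Moves `amount` from `x` to `y` while holding the lock, so no other
    // thread can observe the state between the two updates.
    fn transfer(&self, amount: i32) {
        let mut guard = self.inner.lock().expect("lock not poisoned");
        guard.x -= amount;
        guard.y += amount;
    }

    fn snapshot(&self) -> (i32, i32) {
        let guard = self.inner.lock().expect("lock not poisoned");
        (guard.x, guard.y)
    }
}

fn transfer_from_threads() -> (i32, i32) {
    let pair = Pair { inner: Mutex::new(Inner { x: 100, y: 0 }) };

    // Four threads each move one unit ten times.
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..10 {
                    pair.transfer(1);
                }
            });
        }
    });

    pair.snapshot()
}
```

With two independent atomics, another thread could read `x` after the subtraction but `y` before the addition; holding one lock for both updates rules that out.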

### Wrapping unsynchronized data

In some cases, the data structures stored within a `#[pyclass]` may themselves not be thread-safe. Rust will therefore not implement `Send` and `Sync` on the `#[pyclass]` type.

To achieve thread-safety, a manual `Send` and `Sync` implementation is required. This is `unsafe` and should only be done following careful review of the soundness of the implementation. Doing this for PyO3 types is no different than for any other Rust code; [the Rustonomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html) has a great discussion on this.
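
As an illustration of what such a manual implementation can look like in plain Rust, the sketch below wraps a raw pointer, which makes the type neither `Send` nor `Sync` by default. `OwnedInt` is a hypothetical type, and the soundness argument in the comments applies to this toy example only; real types need their own review.

```rust
use std::ptr::NonNull;
use std::thread;

// Hypothetical wrapper owning a heap-allocated `i32` through a raw pointer.
// `NonNull<i32>` is neither `Send` nor `Sync`, so neither is `OwnedInt`.
struct OwnedInt {
    ptr: NonNull<i32>,
}

impl OwnedInt {
    fn new(value: i32) -> Self {
        let raw = Box::into_raw(Box::new(value));
        // SAFETY: `Box::into_raw` never returns a null pointer.
        OwnedInt { ptr: unsafe { NonNull::new_unchecked(raw) } }
    }

    fn get(&self) -> i32 {
        // SAFETY: `ptr` is valid while `self` is alive, and nothing
        // mutates the value through `&self`.
        unsafe { *self.ptr.as_ptr() }
    }
}

impl Drop for OwnedInt {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `Box::into_raw` and is freed exactly once.
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())) }
    }
}

// SAFETY: `OwnedInt` uniquely owns its allocation (like `Box<i32>`), so it
// can move between threads; `&OwnedInt` only permits reads, so sharing
// references between threads cannot cause a data race.
unsafe impl Send for OwnedInt {}
unsafe impl Sync for OwnedInt {}

fn read_on_other_threads(value: i32) -> (i32, i32) {
    let owned = OwnedInt::new(value);
    // Borrowing `owned` from a scoped thread requires `Sync`.
    let borrowed = thread::scope(|s| s.spawn(|| owned.get()).join().expect("thread panicked"));
    // Moving `owned` into a spawned thread requires `Send`.
    let moved = thread::spawn(move || owned.get()).join().expect("thread panicked");
    (borrowed, moved)
}
```

If `OwnedInt` exposed mutation through `&self`, the `Sync` implementation above would no longer be sound.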
2 changes: 2 additions & 0 deletions guide/src/free-threading.md
@@ -49,6 +49,8 @@ annotate Python modules declared by rust code in your project to declare that
they support free-threaded Python, for example by declaring the module with
`#[pymodule(gil_used = false)]`.

More complicated `#[pyclass]` types may need to deal with thread-safety directly; there is [a dedicated section of the guide](./class/thread-safety.md) to discuss this.

At a low-level, annotating a module sets the `Py_MOD_GIL` slot on modules
defined by an extension to `Py_MOD_GIL_NOT_USED`, which allows the interpreter
to see at runtime that the author of the extension thinks the extension is