Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Implement Multi-Phase Module Initialization as per PEP 489 #4162

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Aequitosh
Copy link

@Aequitosh Aequitosh commented May 6, 2024

RFC: Multi-Phase Module Initialization as per PEP 489

This PR implements multi-phase initialization in PyO3 as a drop-in replacement for single phase initialization. While it still needs some polish here and there, the implementation is mostly complete.

What's still left to do:

  • Figure out how to add submodules again - needs some kind of support for creating m_slots or perhaps an API around creating PyModuleDefs
    • This works now, thanks to a mechanism that makes it possible to (de-facto) statically allocate m_slots at runtime.
  • Reliably get a Py<PyModule> from a *mut ffi::PyModuleDef
    • Is this even possible? If not, we'll require some API breaking, I fear
  • Improve error handling - currently panics if calling the module setup function (#ident(#(#module_args),*) in proc macros) fails
  • Implement multi-phase initialization for PyModule::new_bound
    • I wonder if this is actually necessary, though.
  • Address remaining TODOs
  • Rephrase commit messages / perhaps split up commits
    • The messages could IMO be somewhat more detailed, elaborating on the reasoning behind certain changes / additions

Please let me know what you think! I'm grateful for any feedback! :)

Specifically, I'd be very happy to receive insight / comments on the TODOs I've added here and there.

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Jun 7, 2024
The latest versions of pyO3 (a Python binding for Rust) explicitly added
a check to detect multiple imports. In subinterpreter environments, this
leads to an ImportError: "PyO3 modules may only be initialized once per
interpreter process" (it has been recenlty replaced with a more
specific: "PyO3 modules may only be initialized once per interpreter
process".

This is only a workaround while the root cause is fixed (see
PyO3/pyo3#4162).

Fixes: https://tracker.ceph.com/issues/64213
Signed-off-by: Ernesto Puerta <[email protected]>
@Aequitosh Aequitosh force-pushed the pyo3-pep-489-multi-phase-init branch from 02eb845 to 909b49f Compare July 18, 2024 14:48
This module contains `ModuleState`, a struct used to represent state
that lies on the per-module memory region of a Python module, as well
as various exported functions that are used to integrate `ModuleState`
with the Python interpreter.

This commit does not implement any further functionality (such as
actually representing any kind of state) and is instead more of a
blank slate for further additions.

Signed-off-by: Max R. Carrara <[email protected]>
These helpers and wrapper structs ensure that any slots and function
pointers used during multi-phase initialization stay allocated
throughout the remaining lifetime of the program.

Signed-off-by: Max R. Carrara <[email protected]>
This commit adds experimental but fully functional support for
multi-phase module initialization as specified in PEP 489.

Note that this commit serves as a demonstration & basis for further
improvements only; many tests have therefore not been adapted
correspondingly yet.

Signed-off-by: Max R. Carrara <[email protected]>
@Aequitosh Aequitosh force-pushed the pyo3-pep-489-multi-phase-init branch from 909b49f to a1beab4 Compare July 23, 2024 15:27
@Aequitosh Aequitosh changed the title Draft: Implement Multi-Phase Module Initialization as per PEP 489 RFC: Implement Multi-Phase Module Initialization as per PEP 489 Jul 23, 2024
@Aequitosh
Copy link
Author

Upgraded this PR from a draft to an RFC. The description was also updated to reflect that.

@a-reich
Copy link

a-reich commented Oct 25, 2024

Just curious, any progress getting this reviewed?

@Aequitosh
Copy link
Author

Just curious, any progress getting this reviewed?

I'm planning to rebase this on main as soon as I got time; perhaps then someone's gonna have a look.

@davidhewitt
Copy link
Member

Yes, I had a lot to think about with freethreading and also our changes for IntoPyObject. I want to come back to this after releasing 0.23.

@davidhewitt davidhewitt added this to the 0.24 milestone Oct 25, 2024
@Aequitosh
Copy link
Author

Yes, I had a lot to think about with freethreading and also our changes for IntoPyObject. I want to come back to this after releasing 0.23.

Sounds good! I've been quite busy myself, so it's all good; seems like the timing is just right.

pecastro pushed a commit to pecastro/ceph that referenced this pull request Nov 19, 2024
The latest versions of pyO3 (a Python binding for Rust) explicitly added
a check to detect multiple imports. In subinterpreter environments, this
leads to an ImportError: "PyO3 modules may only be initialized once per
interpreter process" (it has been recenlty replaced with a more
specific: "PyO3 modules may only be initialized once per interpreter
process".

This is only a workaround while the root cause is fixed (see
PyO3/pyo3#4162).

Fixes: https://tracker.ceph.com/issues/64213
Signed-off-by: Ernesto Puerta <[email protected]>
@Sunnatillo
Copy link

Hi @Aequitosh. Thank you for working on this issue.

How can I help to get this forward?

@epuertat
Copy link

@Aequitosh I'm just curious, as a workaround, would cleaning-up/freeing a PyO3 module (m_clear, m_free callbacks) would resolve this issue?

Something like?:

from cryptography.fernet import Fernet   # a version that relies on this version of PyO3

key = Fernet.generate_key()

del cryptography
del sys.modules['cryptography']
gc.collect()

If that fully removes all module state, we could have a context manager like this in sub-interpreters:

with _import('cryptography.fernet') as Fernet:
    key = Fernet.generate_key()

What do you think?

@Aequitosh
Copy link
Author

Hi @Aequitosh. Thank you for working on this issue.

How can I help to get this forward?

Hey @Sunnatillo, thank you for offering your help! I'm actually intending on rebasing my branch on main today and then make a little write-up / plan on what's left to do. There you'll then find everything I'd need help with.

I really wanna get this ball rolling again, so I'm grateful for any help.

I'm just curious, as a workaround, would cleaning-up/freeing a PyO3 module (m_clear, m_free callbacks) would resolve this issue?

Hi @epuertat! This unfortunately won't resolve things, because PyO3 modules are (as of time of writing) loaded using single-phase initialisation. The callbacks you mentioned are currently not used; there's no actual module state being managed at the moment, because native modules based on PyO3 are "global" (due to the fact of them using the older single-phase init mechanism). Glossing over some details here, but that's essentially what the situation is right now.

This PR aims to replace the single-phase init with the newer multi-phase initialisation, which would then (almost) automatically make PyO3-based Python modules sound when used with sub-interpreters. The remaining static data that PyO3 stores would also need to be moved to the per-module memory location in PyModuleDef.

So, unfortunately, your solution most likely won't work (or rather, it won't be safe).

Relevant reads, if you're curious:


With all that being said, expect some updates / changes to trickle in (hopefully) soon.

@epuertat
Copy link

Thanks a lot @Aequitosh for your quick & thorough reply!

pecastro pushed a commit to pecastro/ceph that referenced this pull request Nov 24, 2024
The latest versions of pyO3 (a Python binding for Rust) explicitly added
a check to detect multiple imports. In subinterpreter environments, this
leads to an ImportError: "PyO3 modules may only be initialized once per
interpreter process" (it has been recenlty replaced with a more
specific: "PyO3 modules may only be initialized once per interpreter
process".

This is only a workaround while the root cause is fixed (see
PyO3/pyo3#4162).

Fixes: https://tracker.ceph.com/issues/64213
Signed-off-by: Ernesto Puerta <[email protected]>
pecastro pushed a commit to pecastro/ceph that referenced this pull request Dec 8, 2024
The latest versions of pyO3 (a Python binding for Rust) explicitly added
a check to detect multiple imports. In subinterpreter environments, this
leads to an ImportError: "PyO3 modules may only be initialized once per
interpreter process" (it has been recenlty replaced with a more
specific: "PyO3 modules may only be initialized once per interpreter
process".

This is only a workaround while the root cause is fixed (see
PyO3/pyo3#4162).

Fixes: https://tracker.ceph.com/issues/64213
Signed-off-by: Ernesto Puerta <[email protected]>
pecastro pushed a commit to pecastro/ceph that referenced this pull request Dec 9, 2024
The latest versions of pyO3 (a Python binding for Rust) explicitly added
a check to detect multiple imports. In subinterpreter environments, this
leads to an ImportError: "PyO3 modules may only be initialized once per
interpreter process" (it has been recenlty replaced with a more
specific: "PyO3 modules may only be initialized once per interpreter
process".

This is only a workaround while the root cause is fixed (see
PyO3/pyo3#4162).

Fixes: https://tracker.ceph.com/issues/64213
Signed-off-by: Ernesto Puerta <[email protected]>
pecastro pushed a commit to pecastro/ceph that referenced this pull request Dec 11, 2024
The latest versions of pyO3 (a Python binding for Rust) explicitly added
a check to detect multiple imports. In subinterpreter environments, this
leads to an ImportError: "PyO3 modules may only be initialized once per
interpreter process" (it has been recenlty replaced with a more
specific: "PyO3 modules may only be initialized once per interpreter
process".

This is only a workaround while the root cause is fixed (see
PyO3/pyo3#4162).

Fixes: https://tracker.ceph.com/issues/64213
Signed-off-by: Ernesto Puerta <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants