-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update the import machinery to only use __spec__ #65961
Comments
With PEP-451, Python 3.4 introduced module specs to encapsulate the module's import-related information, particularly for loading. While __loader__, __file__, and __cached__ are no longer used by the import machinery, in a few places it still uses __name__, __package__, and __path__. Typically the spec and the module attrs will have the same values, so it would be a non-issue. However, the import-related module attributes are not read-only and the consequences of changing them (i.e. accidentally or to rely on an implementation detail) are not clearly defined. Making the spec strictly authoritative reduces the likelihood accidental changes and gives a better focus point for a module's import behavior (which was kind of the point of PEP-451 in the first place). Furthermore, objects in sys.modules are not required to be modules. By relying strictly on __spec__ we likewise give a more distinct target (of import-related info) for folks that need to use that trick. I don't recall the specifics on why we didn't change those 3 attributes for PEP-451 (unintentional or for backward compatibility?). At one point we discussed the idea that a module's spec contains the values that *were* used to load the module. Instead, each spec became the image of how the import system sees and treats the module. So unless there's some solid reason, I'd like to see the use of __name__, __package__, and __path__ by the import machinery eliminated (and accommodated separately if appropriate). Consistent use of specs in the import machinery will help limit future surprises. Here are the specific places: __name__ mod.__repr__()
ExtensionFileLoader.load_module()
importlib._bootstrap._handle_fromlist()
importlib._bootstrap._calc___package__()
importlib._bootstrap.__import__() __package__ importlib._bootstrap._calc___package__() __path__ importlib._bootstrap._find_and_load_unlocked()
importlib._bootstrap._handle_fromlist()
importlib._bootstrap._calc___package__() __file__ mod.__repr__() Note that I'm not suggesting the module attributes be eliminated (they are useful for informational reasons). I would just like the import system to stop using them. I suppose they could be turned into read-only properties, but anything like that should be addressed in a separate issue. If we do make this change, the language reference, importlib docs, and inspect docs should be updated to clearly reflect the role of the module attributes in the import system. I have not taken into account the impact on the standard library. However, I expect that it will be minimal at best. (See issue bpo-21736 for a related discussion). |
Manipulating name, package and path at runtime is fully supported, and the There may be some test suite gaps and documentation issues around the |
Thanks for clarifying. I remembered discussing it but couldn't recall the details. Documenting the exact semantics, use cases, and difference between spec and module attrs would be help. I'll look into updating the language reference when I have some time. It would still be worth it to find a way to make __spec__ fully authoritative, but I'll have to come up with a solution to the current use cases before that could go anywhere. :) |
The spec is authoritative for "how was this imported?". The differences between that and the module attributes then provide a record of any post-import modifications. |
So I am going to disagree with Nick about the module attributes and their usefulness (partially because I just made __spec__.parent take precedence over __package__ in issue bpo-25791). While I get the idea of wanting a history of what did (not) change since import, keeping the duplicate information around is annoying. And I don't know how truly useful it is to know what things were compared to what they became. If we shift to preferring specs compared to module attributes we can then begin to clean up __import__ itself by no longer grabbing the globals() and locals() and instead simply pass in the module's __spec__ object. It also simplifies the documentation such that we don't have to explain everything twice. If people really want to track what a value was relating to import before mutation they can simply store it themselves instead of making us do the bookkeeping for them. It would also make things such as module_from_spec() or loader.create_module() simpler since they only have to worry about setting __spec__ instead of that attribute plus a bunch of other ones. |
My concern is more about backwards compatibility - at the moment, you can alter the behaviour of import, pickle, and other subsystems by modifying the module level attributes, and if we switch to preferring the __spec__ attributes, then that kind of code will break (I added an import specific example related to __main__ module relative imports to the linked issue). That's not to say it shouldn't be done - as you say, it would be nice to eventually get to a point where the import system only needs access to the module spec and not to the runtime state, and there are also cases where the __spec__ information will be more correct (e.g. pickling objects in __main__). However, it needs to be in such a way that there are appropriate porting notes that explain to people why their state mutations stopped having the desired effect, and what (if anything) they can do instead. |
I totally agree proper notes in the What's New doc need to be in place to explain that people need to update. How about I tweak the __package__ change to continue to prefer __package__ over __spec__.parent, but raise an ImportWarning when they differ? It can also fall back to __spec__.parent if __package__ isn't defined and simply not worry about the lack of __package__? Then we can do an ImportWarning for all of the other attributes when we discover a difference so people have time to migrate to updating both locations, and then eventually we can invert the priority and then after that drop the module attributes. |
That approach sounds good to me. The main problem cases I'm aware of are: __name__:
__path__:
__package__:
|
Yeah, which is why it will take a transition to get people to start mucking with spec instead of the module attributes for their legitimate/questionable needs (although the whole |
__package__ != __spec__.parent now triggers an ImportWarning. |
I think that leaves the following attributes to be updated/checked for dependencies in importlib (and if they are found, raise ImportWarning when they differ):
|
It turns out that the import system itself doesn't use |
Plan (as discussed with @warsaw ):
|
… from `__spec__.parent` Also remove `importlib.util.set_package()` which was already slated for removal.
…`__spec__.parent` (#97879) Also remove `importlib.util.set_package()` which was already slated for removal. Co-authored-by: Eric Snow <[email protected]>
* main: (66 commits) pythongh-65961: Raise `DeprecationWarning` when `__package__` differs from `__spec__.parent` (python#97879) docs(typing): add "see PEP 675" to LiteralString (python#97926) pythongh-97850: Remove all known instances of module_repr() (python#97876) I changed my surname early this year (python#96671) pythongh-93738: Documentation C syntax (:c:type:<C type> -> :c:expr:<C type>) (python#97768) pythongh-91539: improve performance of get_proxies_environment (python#91566) build(deps): bump actions/stale from 5 to 6 (python#97701) pythonGH-95172 Make the same version `versionadded` oneline (python#95172) pythongh-88050: Fix asyncio subprocess to kill process cleanly when process is blocked (python#32073) pythongh-93738: Documentation C syntax (Function glob patterns -> literal markup) (python#97774) pythongh-93357: Port test cases to IsolatedAsyncioTestCase, part 2 (python#97896) pythongh-95196: Disable incorrect pickling of the C implemented classmethod descriptors (pythonGH-96383) pythongh-97758: Fix a crash in getpath_joinpath() called without arguments (pythonGH-97759) pythongh-74696: Pass root_dir to custom archivers which support it (pythonGH-94251) pythongh-97661: Improve accuracy of sqlite3.Cursor.fetchone docs (python#97662) pythongh-87092: bring compiler code closer to a preprocessing-opt-assembler organisation (pythonGH-97644) pythonGH-96704: Add {Task,Handle}.get_context(), use it in call_exception_handler() (python#96756) pythongh-93738: Documentation C syntax (:c:type:`PyTypeObject*` -> :c:expr:`PyTypeObject*`) (python#97778) pythongh-97825: fix AttributeError when calling subprocess.check_output(input=None) with encoding or errors args (python#97826) Add re.VERBOSE flag documentation example (python#97678) ...
Make sure `__spec__.cached` (at minimum) can be used.
Make sure `__spec__.cached` (at minimum) can be used.
* main: fixes pythongh-96078: os.sched_yield release the GIL while calling sched_yield(2). (pythongh-97965) pythongh-65961: Do not rely solely on `__cached__` (pythonGH-97990) pythongh-97850: Remove the open issues section from the import reference (python#97935) Docs: pin sphinx-lint (pythonGH-97992) pythongh-94590: add signatures to operator itemgetter, attrgetter, methodcaller (python#94591) Add Pynche's move to the What's new in 3.11 (python#97974) pythongh-97781: Apply changes from importlib_metadata 5. (pythonGH-97785) pythongh-86482: Document assignment expression need for ()s (python#23291) pythongh-97943: PyFunction_GetAnnotations should return a borrowed reference. (python#97949) pythongh-94808: Coverage: Test that maximum indentation level is handled (python#95926)
* main: (53 commits) pythongh-94808: Coverage: Test that maximum indentation level is handled (python#95926) pythonGH-88050: fix race in closing subprocess pipe in asyncio (python#97951) pythongh-93738: Disallow pre-v3 syntax in the C domain (python#97962) pythongh-95986: Fix the example using match keyword (python#95989) pythongh-97897: Prevent os.mkfifo and os.mknod segfaults with macOS 13 SDK (pythonGH-97944) pythongh-94808: Cover `PyUnicode_Count` in CAPI (python#96929) pythongh-94808: Cover `PyObject_PyBytes` case with custom `__bytes__` method (python#96610) pythongh-95691: Doc BufferedWriter and BufferedReader (python#95703) pythonGH-88968: Add notes about socket ownership transfers (python#97936) pythongh-96865: [Enum] fix Flag to use CONFORM boundary (pythonGH-97528) pythongh-65961: Raise `DeprecationWarning` when `__package__` differs from `__spec__.parent` (python#97879) docs(typing): add "see PEP 675" to LiteralString (python#97926) pythongh-97850: Remove all known instances of module_repr() (python#97876) I changed my surname early this year (python#96671) pythongh-93738: Documentation C syntax (:c:type:<C type> -> :c:expr:<C type>) (python#97768) pythongh-91539: improve performance of get_proxies_environment (python#91566) build(deps): bump actions/stale from 5 to 6 (python#97701) pythonGH-95172 Make the same version `versionadded` oneline (python#95172) pythongh-88050: Fix asyncio subprocess to kill process cleanly when process is blocked (python#32073) pythongh-93738: Documentation C syntax (Function glob patterns -> literal markup) (python#97774) ...
* main: pythonGH-97002: Prevent `_PyInterpreterFrame`s from backing more than one `PyFrameObject` (pythonGH-97996) pythongh-97973: Return all necessary information from the tokenizer (pythonGH-97984) fixes pythongh-96078: os.sched_yield release the GIL while calling sched_yield(2). (pythongh-97965) pythongh-65961: Do not rely solely on `__cached__` (pythonGH-97990) pythongh-97850: Remove the open issues section from the import reference (python#97935) Docs: pin sphinx-lint (pythonGH-97992) pythongh-94590: add signatures to operator itemgetter, attrgetter, methodcaller (python#94591) Add Pynche's move to the What's new in 3.11 (python#97974) pythongh-97781: Apply changes from importlib_metadata 5. (pythonGH-97785) pythongh-86482: Document assignment expression need for ()s (python#23291) pythongh-97943: PyFunction_GetAnnotations should return a borrowed reference. (python#97949)
… from `__spec__.parent` (python#97879) Also remove `importlib.util.set_package()` which was already slated for removal. Co-authored-by: Eric Snow <[email protected]>
Make sure `__spec__.cached` (at minimum) can be used.
…H-124377) The code changes for warning related to `__package__` landed in Python 3.12. `__cached__` doesn't have any changes as it isn't used but only set by the import system.
…ed__` (pythonGH-124377) The code changes for warning related to `__package__` landed in Python 3.12. `__cached__` doesn't have any changes as it isn't used but only set by the import system. (cherry picked from commit 67201ad) Co-authored-by: Brett Cannon <[email protected]>
Another note for each section in @brettcannon's TODO list above:
It's the same The That change will also eliminate a historical slight behavioural difference between direct script execution and
(by contrast, there's no discrepancy for |
…hed__` (GH-124377) (#124380) * GH-65961: Document the deprecation of `__package__` and `__cached__` (GH-124377) The code changes for warning related to `__package__` landed in Python 3.12. `__cached__` doesn't have any changes as it isn't used but only set by the import system. (cherry picked from commit 67201ad) --------- Co-authored-by: Brett Cannon <[email protected]> Co-authored-by: Barry Warsaw <[email protected]>
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
__package__
and__cached__
#124377__package__
and__cached__
(GH-124377) #124380The text was updated successfully, but these errors were encountered: