Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update the import machinery to only use __spec__ #65961

Open
ericsnowcurrently opened this issue Jun 14, 2014 · 15 comments
Open

update the import machinery to only use __spec__ #65961

ericsnowcurrently opened this issue Jun 14, 2014 · 15 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-importlib type-feature A feature request or enhancement

Comments

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Jun 14, 2014

BPO 21762
Nosy @warsaw, @brettcannon, @ncoghlan, @ericsnowcurrently, @nanjekyejoannah
Dependencies
  • bpo-25791: Raise an ImportWarning when spec.parent/package isn't defined for a relative import
  • bpo-42132: Use specs instead of just loader in C code
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = None
    created_at = <Date 2014-06-14.21:33:51.833>
    labels = ['interpreter-core', 'type-feature']
    title = 'update the import machinery to only use __spec__'
    updated_at = <Date 2022-02-04.20:50:24.662>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2022-02-04.20:50:24.662>
    actor = 'brett.cannon'
    assignee = 'brett.cannon'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2014-06-14.21:33:51.833>
    creator = 'eric.snow'
    dependencies = ['25791', '42132']
    files = []
    hgrepos = []
    issue_num = 21762
    keywords = []
    message_count = 13.0
    messages = ['220581', '220597', '220606', '220610', '258332', '258357', '258398', '258440', '258443', '258844', '258845', '379490', '412537']
    nosy_count = 5.0
    nosy_names = ['barry', 'brett.cannon', 'ncoghlan', 'eric.snow', 'nanjekyejoannah']
    pr_nums = []
    priority = 'low'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue21762'
    versions = ['Python 3.6']

    Linked PRs

    @ericsnowcurrently
    Copy link
    Member Author

    With PEP-451, Python 3.4 introduced module specs to encapsulate the module's import-related information, particularly for loading. While __loader__, __file__, and __cached__ are no longer used by the import machinery, in a few places it still uses __name__, __package__, and __path__.

    Typically the spec and the module attrs will have the same values, so it would be a non-issue. However, the import-related module attributes are not read-only and the consequences of changing them (i.e. accidentally or to rely on an implementation detail) are not clearly defined. Making the spec strictly authoritative reduces the likelihood accidental changes and gives a better focus point for a module's import behavior (which was kind of the point of PEP-451 in the first place). Furthermore, objects in sys.modules are not required to be modules. By relying strictly on __spec__ we likewise give a more distinct target (of import-related info) for folks that need to use that trick.

    I don't recall the specifics on why we didn't change those 3 attributes for PEP-451 (unintentional or for backward compatibility?). At one point we discussed the idea that a module's spec contains the values that *were* used to load the module. Instead, each spec became the image of how the import system sees and treats the module.

    So unless there's some solid reason, I'd like to see the use of __name__, __package__, and __path__ by the import machinery eliminated (and accommodated separately if appropriate). Consistent use of specs in the import machinery will help limit future surprises.

    Here are the specific places:

    __name__
    --------

    mod.__repr__()
    ExtensionFileLoader.load_module()
    importlib._bootstrap._handle_fromlist()
    importlib._bootstrap._calc___package__()
    importlib._bootstrap.__import__()

    __package__
    -----------

    importlib._bootstrap._calc___package__()

    __path__
    --------

    importlib._bootstrap._find_and_load_unlocked()
    importlib._bootstrap._handle_fromlist()
    importlib._bootstrap._calc___package__()

    __file__
    --------

    mod.__repr__()

    Note that I'm not suggesting the module attributes be eliminated (they are useful for informational reasons). I would just like the import system to stop using them. I suppose they could be turned into read-only properties, but anything like that should be addressed in a separate issue.

    If we do make this change, the language reference, importlib docs, and inspect docs should be updated to clearly reflect the role of the module attributes in the import system.

    I have not taken into account the impact on the standard library. However, I expect that it will be minimal at best. (See issue bpo-21736 for a related discussion).

    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Jun 14, 2014
    @ncoghlan
    Copy link
    Contributor

    Manipulating name, package and path at runtime is fully supported, and the
    module level attributes accordingly take precedence over the initial import
    time spec.

    There may be some test suite gaps and documentation issues around the
    behaviour, but it's definitely intentional (things like runpy,
    "pseudo-modules", third party namespace package support and workarounds for
    running modules inside packages correctly rely on it).

    @ericsnowcurrently
    Copy link
    Member Author

    Thanks for clarifying. I remembered discussing it but couldn't recall the details. Documenting the exact semantics, use cases, and difference between spec and module attrs would be help. I'll look into updating the language reference when I have some time.

    It would still be worth it to find a way to make __spec__ fully authoritative, but I'll have to come up with a solution to the current use cases before that could go anywhere. :)

    @ncoghlan
    Copy link
    Contributor

    The spec is authoritative for "how was this imported?". The differences between that and the module attributes then provide a record of any post-import modifications.

    @brettcannon
    Copy link
    Member

    So I am going to disagree with Nick about the module attributes and their usefulness (partially because I just made __spec__.parent take precedence over __package__ in issue bpo-25791). While I get the idea of wanting a history of what did (not) change since import, keeping the duplicate information around is annoying. And I don't know how truly useful it is to know what things were compared to what they became.

    If we shift to preferring specs compared to module attributes we can then begin to clean up __import__ itself by no longer grabbing the globals() and locals() and instead simply pass in the module's __spec__ object. It also simplifies the documentation such that we don't have to explain everything twice. If people really want to track what a value was relating to import before mutation they can simply store it themselves instead of making us do the bookkeeping for them. It would also make things such as module_from_spec() or loader.create_module() simpler since they only have to worry about setting __spec__ instead of that attribute plus a bunch of other ones.

    @ncoghlan
    Copy link
    Contributor

    My concern is more about backwards compatibility - at the moment, you can alter the behaviour of import, pickle, and other subsystems by modifying the module level attributes, and if we switch to preferring the __spec__ attributes, then that kind of code will break (I added an import specific example related to __main__ module relative imports to the linked issue).

    That's not to say it shouldn't be done - as you say, it would be nice to eventually get to a point where the import system only needs access to the module spec and not to the runtime state, and there are also cases where the __spec__ information will be more correct (e.g. pickling objects in __main__).

    However, it needs to be in such a way that there are appropriate porting notes that explain to people why their state mutations stopped having the desired effect, and what (if anything) they can do instead.

    @brettcannon
    Copy link
    Member

    I totally agree proper notes in the What's New doc need to be in place to explain that people need to update.

    How about I tweak the __package__ change to continue to prefer __package__ over __spec__.parent, but raise an ImportWarning when they differ? It can also fall back to __spec__.parent if __package__ isn't defined and simply not worry about the lack of __package__? Then we can do an ImportWarning for all of the other attributes when we discover a difference so people have time to migrate to updating both locations, and then eventually we can invert the priority and then after that drop the module attributes.

    @ncoghlan
    Copy link
    Contributor

    That approach sounds good to me.

    The main problem cases I'm aware of are:

    __name__:

    • reliably wrong in __main__
    • the attribute you mess with if you want __qualname__ on functions and classes to be different so that pickle will import them from somewhere else (e.g. a parent package)

    __path__:

    • used for dynamic package definitions (including namespace package emulation)

    __package__:

    • AFAIK, mainly useful as a workaround for other people doing "bad things" (TM), like running package submodules directly as __main__ rather than using the -m switch

    @brettcannon
    Copy link
    Member

    Yeah, which is why it will take a transition to get people to start mucking with spec instead of the module attributes for their legitimate/questionable needs (although the whole __name__ == '__main__' idiom means name will never go away while path, package, loader, file, and cached could all slowly go away).

    @brettcannon
    Copy link
    Member

    __package__ != __spec__.parent now triggers an ImportWarning.

    @brettcannon
    Copy link
    Member

    I think that leaves the following attributes to be updated/checked for dependencies in importlib (and if they are found, raise ImportWarning when they differ):

    1. __path__
    2. __loader__
    3. __file__
    4. __cached__

    @brettcannon
    Copy link
    Member

    It turns out that the import system itself doesn't use __loader__ (it does set it), but various parts of the stdlib do use __loader__. bpo-42133 updates a bunch of stdlib modules to use __spec__.loader as a fallback. bpo-42132 tracks the fact that there's C code which isn't setting or using __spec__ (let alone __spec__.loader).

    @brettcannon brettcannon self-assigned this Dec 8, 2021
    @brettcannon
    Copy link
    Member

    brettcannon commented Feb 4, 2022

    Plan (as discussed with @warsaw ):

    1. __package__
      • Make sure all uses of the attribute fall back on __spec__ (done way back when)
      • Add an ImportWarning when the attribute is used but it differs from __spec__ (3.6)
      • Update code to prefer the spec over the attribute, raising ImportWarning when having to fall back to the attribute (3.10)
      • Change ImportWarning to DeprecationWarning when falling back to the attribute (3.12; gh-65961: Raise DeprecationWarning when __package__ differs from __spec__.parent #97879)
      • Document the deprecation
      • Remove code in importlib that set and used the old attribute (3.15)
    2. __loader__
      • Make sure all Python code uses of the attribute fall back on __spec__ (3.10)
      • Update C code to fall back to using __spec__ (issue)
      • Add a DeprecationWarning when __loader__ != __spec__.loader or __spec__.loader is not set.
      • Remove code in importlib that set and used the old attribute (3.15)
    3. __cached__
      • Make sure all uses of the attribute fall back on __spec__ (PR
      • ~Add an DeprecationWarning when the attribute is used but it differs from __spec__ (including not being set on __spec__) or __spec__.cached.
      • Document the deprecation
      • Remove code in importlib that used the old attribute (3.14)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @warsaw
    Copy link
    Member

    warsaw commented Oct 4, 2022

    Update C code to fall back to using spec

    #97803 is the PR and #86298 is the ticket. Only _warnings.c needs to be changed.

    brettcannon added a commit to brettcannon/cpython that referenced this issue Oct 5, 2022
    … from `__spec__.parent`
    
    Also remove `importlib.util.set_package()` which was already slated for removal.
    brettcannon added a commit that referenced this issue Oct 5, 2022
    …`__spec__.parent` (#97879)
    
    Also remove `importlib.util.set_package()` which was already slated for removal.
    
    Co-authored-by: Eric Snow <[email protected]>
    carljm added a commit to carljm/cpython that referenced this issue Oct 6, 2022
    * main: (66 commits)
      pythongh-65961: Raise `DeprecationWarning` when `__package__` differs from `__spec__.parent` (python#97879)
      docs(typing): add "see PEP 675" to LiteralString (python#97926)
      pythongh-97850: Remove all known instances of module_repr() (python#97876)
      I changed my surname early this year (python#96671)
      pythongh-93738: Documentation C syntax (:c:type:<C type> -> :c:expr:<C type>) (python#97768)
      pythongh-91539: improve performance of get_proxies_environment  (python#91566)
      build(deps): bump actions/stale from 5 to 6 (python#97701)
      pythonGH-95172 Make the same version `versionadded` oneline (python#95172)
      pythongh-88050: Fix asyncio subprocess to kill process cleanly when process is blocked (python#32073)
      pythongh-93738: Documentation C syntax (Function glob patterns -> literal markup) (python#97774)
      pythongh-93357: Port test cases to IsolatedAsyncioTestCase, part 2 (python#97896)
      pythongh-95196: Disable incorrect pickling of the C implemented classmethod descriptors (pythonGH-96383)
      pythongh-97758: Fix a crash in getpath_joinpath() called without arguments (pythonGH-97759)
      pythongh-74696: Pass root_dir to custom archivers which support it (pythonGH-94251)
      pythongh-97661: Improve accuracy of sqlite3.Cursor.fetchone docs (python#97662)
      pythongh-87092: bring compiler code closer to a preprocessing-opt-assembler organisation (pythonGH-97644)
      pythonGH-96704: Add {Task,Handle}.get_context(), use it in call_exception_handler() (python#96756)
      pythongh-93738: Documentation C syntax (:c:type:`PyTypeObject*` -> :c:expr:`PyTypeObject*`) (python#97778)
      pythongh-97825: fix AttributeError when calling subprocess.check_output(input=None) with encoding or errors args (python#97826)
      Add re.VERBOSE flag documentation example (python#97678)
      ...
    brettcannon added a commit to brettcannon/cpython that referenced this issue Oct 6, 2022
    Make sure `__spec__.cached` (at minimum) can be used.
    brettcannon added a commit that referenced this issue Oct 6, 2022
    Make sure `__spec__.cached` (at minimum) can be used.
    carljm added a commit to carljm/cpython that referenced this issue Oct 6, 2022
    * main:
      fixes pythongh-96078: os.sched_yield release the GIL while calling sched_yield(2). (pythongh-97965)
      pythongh-65961: Do not rely solely on `__cached__` (pythonGH-97990)
      pythongh-97850: Remove the open issues section from the import reference (python#97935)
      Docs: pin sphinx-lint (pythonGH-97992)
      pythongh-94590: add signatures to operator itemgetter, attrgetter, methodcaller (python#94591)
      Add Pynche's move to the What's new in 3.11 (python#97974)
      pythongh-97781: Apply changes from importlib_metadata 5. (pythonGH-97785)
      pythongh-86482: Document assignment expression need for ()s (python#23291)
      pythongh-97943: PyFunction_GetAnnotations should return a borrowed reference. (python#97949)
      pythongh-94808: Coverage: Test that maximum indentation level is handled (python#95926)
    carljm added a commit to carljm/cpython that referenced this issue Oct 8, 2022
    * main: (53 commits)
      pythongh-94808: Coverage: Test that maximum indentation level is handled (python#95926)
      pythonGH-88050: fix race in closing subprocess pipe in asyncio  (python#97951)
      pythongh-93738: Disallow pre-v3 syntax in the C domain (python#97962)
      pythongh-95986: Fix the example using match keyword (python#95989)
      pythongh-97897: Prevent os.mkfifo and os.mknod segfaults with macOS 13 SDK (pythonGH-97944)
      pythongh-94808: Cover `PyUnicode_Count` in CAPI (python#96929)
      pythongh-94808: Cover `PyObject_PyBytes` case with custom `__bytes__` method (python#96610)
      pythongh-95691: Doc BufferedWriter and BufferedReader (python#95703)
      pythonGH-88968: Add notes about socket ownership transfers (python#97936)
      pythongh-96865: [Enum] fix Flag to use CONFORM boundary (pythonGH-97528)
      pythongh-65961: Raise `DeprecationWarning` when `__package__` differs from `__spec__.parent` (python#97879)
      docs(typing): add "see PEP 675" to LiteralString (python#97926)
      pythongh-97850: Remove all known instances of module_repr() (python#97876)
      I changed my surname early this year (python#96671)
      pythongh-93738: Documentation C syntax (:c:type:<C type> -> :c:expr:<C type>) (python#97768)
      pythongh-91539: improve performance of get_proxies_environment  (python#91566)
      build(deps): bump actions/stale from 5 to 6 (python#97701)
      pythonGH-95172 Make the same version `versionadded` oneline (python#95172)
      pythongh-88050: Fix asyncio subprocess to kill process cleanly when process is blocked (python#32073)
      pythongh-93738: Documentation C syntax (Function glob patterns -> literal markup) (python#97774)
      ...
    carljm added a commit to carljm/cpython that referenced this issue Oct 8, 2022
    * main:
      pythonGH-97002: Prevent `_PyInterpreterFrame`s from backing more than one `PyFrameObject` (pythonGH-97996)
      pythongh-97973: Return all necessary information from the tokenizer (pythonGH-97984)
      fixes pythongh-96078: os.sched_yield release the GIL while calling sched_yield(2). (pythongh-97965)
      pythongh-65961: Do not rely solely on `__cached__` (pythonGH-97990)
      pythongh-97850: Remove the open issues section from the import reference (python#97935)
      Docs: pin sphinx-lint (pythonGH-97992)
      pythongh-94590: add signatures to operator itemgetter, attrgetter, methodcaller (python#94591)
      Add Pynche's move to the What's new in 3.11 (python#97974)
      pythongh-97781: Apply changes from importlib_metadata 5. (pythonGH-97785)
      pythongh-86482: Document assignment expression need for ()s (python#23291)
      pythongh-97943: PyFunction_GetAnnotations should return a borrowed reference. (python#97949)
    mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022
    … from `__spec__.parent` (python#97879)
    
    Also remove `importlib.util.set_package()` which was already slated for removal.
    
    Co-authored-by: Eric Snow <[email protected]>
    mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022
    Make sure `__spec__.cached` (at minimum) can be used.
    brettcannon added a commit to brettcannon/cpython that referenced this issue Sep 23, 2024
    brettcannon added a commit that referenced this issue Sep 23, 2024
    …H-124377)
    
    The code changes for warning related to `__package__` landed in Python 3.12. `__cached__` doesn't have any changes as it isn't used but only set by the import system.
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 23, 2024
    …ed__` (pythonGH-124377)
    
    The code changes for warning related to `__package__` landed in Python 3.12. `__cached__` doesn't have any changes as it isn't used but only set by the import system.
    (cherry picked from commit 67201ad)
    
    Co-authored-by: Brett Cannon <[email protected]>
    @ncoghlan
    Copy link
    Contributor

    Another note for each section in @brettcannon's TODO list above:

    • Remove code in runpy that sets the old module attribute

    It's the same globals update call for all three of the old attributes: https://github.com/python/cpython/blob/main/Lib/runpy.py#L81

    The pkg_name parameters on the _run_code and _run_module_code helper functions in runpy will also become redundant (since they're only used to set __package__) and should be removed.

    That change will also eliminate a historical slight behavioural difference between direct script execution and runpy.run_path in the way they set __package__ to indicate a top-level module (default is None, runpy.run_path sets the empty string):

    $ echo "print(__package__)" > package.py
    $ python3 package.py
    None
    $ python3 -c "import runpy; runpy.run_path('package.py')"
    
    $
    

    (by contrast, there's no discrepancy for __spec__: they both set it to None for direct script execution)

    Yhg1s pushed a commit that referenced this issue Sep 27, 2024
    …hed__` (GH-124377) (#124380)
    
    * GH-65961: Document the deprecation of `__package__` and `__cached__` (GH-124377)
    
    The code changes for warning related to `__package__` landed in Python 3.12. `__cached__` doesn't have any changes as it isn't used but only set by the import system.
    (cherry picked from commit 67201ad)
    
    ---------
    
    Co-authored-by: Brett Cannon <[email protected]>
    Co-authored-by: Barry Warsaw <[email protected]>
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-importlib type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants