diff --git a/pep-0648.rst b/pep-0648.rst index 5707b79beac..18f04f1436b 100644 --- a/pep-0648.rst +++ b/pep-0648.rst @@ -8,14 +8,14 @@ Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 30-Dec-2020 -Python-Version: 3.10 +Python-Version: 3.11 Post-History: python-ideas: 16th Dec. python-dev: 18th Dec. Abstract ======== This PEP proposes supporting extensible customization of the interpreter by -allowing users to install scripts that will be executed at startup. +allowing users to install files that will be executed at startup. Motivation ========== @@ -25,7 +25,7 @@ libraries need to customize aspects of the interpreter at startup time. This is usually achieved via ``sitecustomize.py`` for system administrators whilst libraries rely on exploiting ``pth`` files. This PEP proposes a way of -achieving the same in a more user-friendly and structured way. +achieving the same functionality in a more user-friendly and structured way. Limitations of ``pth`` files ---------------------------- @@ -76,127 +76,269 @@ example, Ubuntu could change their current ``sitecustomize.py`` to just be also gives users of the interpreter a better understanding of the modifications happening on their interpreter. -Benefits of ``__sitecustomize__`` ---------------------------------- - -Having a structured way of injecting custom startup scripts will allow -supporting the cases presented above in a better way. It will result in both -maintainers and users having a better experience as detailed, and allow -CPython to deprecate and eventually remove code execution from ``pth`` files -as desired in the previously mentioned bpos. Additionally, these solutions -provide a unique way to support all use-cases that before have been fulfilled -via the misuse of ``pth`` files, ``sitecustomize.py`` and -``usercustomize.py``. 
The use of a ``__sitecustomize__`` will allow for -packages, tools and system admins to inject scripts that will be loaded at -startups through an easy-to-understand mechanism. - Rationale ========= This PEP proposes supporting extensible customization of the interpreter at -startup by allowing users to install scripts into a folder named -``__sitecustomize__`` located in any site path. Those scripts will be executed -at startup time. The implementation will take advantage of the fact that all -site paths are already being walked to look for ``pth`` files to include also a -check for ``__sitecustomize__`` folders and execute all scripts within them. - -The ``site`` module will expose an option on its main function that allows -listing all scripts that will be executed, which will allow users to quickly -see all customizations that affect an interpreter. - -We will also work with packaging build backends to facilitate the installation -of these files. +startup by executing all files discovered in directories named +``__sitecustomize__`` in sitepackages [#sitepackages-api]_ or +usersitepackages [#usersitepackages-api]_ at startup time. Why ``__sitecustomize__`` ------------------------- The name aims to follow the already existing concept of ``sitecustomize.py``. -As the folder will be within ``sys.path``, given that it is located in site -paths, we choose to use double underscore around its name, to prevent +As the directory will be within ``sys.path``, given that it is located in +site paths, we choose to use double underscore around its name, to prevent colliding with the already existing ``sitecustomize.py``. -Disabling start scripts ------------------------ +Discovering the new ``__sitecustomize__`` directories +----------------------------------------------------- -In some scenarios, like when the startup time is key, it might be desired to -disable this option altogether. 
Whilst we could add a new flag to do so, we -think that the already existing flag ``-S`` [#s-flag]_ is already good -enough, as it disables all ``site``-related manipulation. If the flag is -passed in, ``__sitecustomize__`` will not be used. +The Python interpreter will look at startup for directories named +``__sitecustomize__`` within any of the standard site-packages paths. -Order of execution ------------------- +These are commonly the Python system location and the user location, but are +ultimately defined by the site module logic. -The scripts in ``__sitecustomize__`` will be executed in file name sorted order -after the evaluation of ``pth`` files. We considered executing them in random -order, but that could result in different results depending on how the -interpreter chooses to pick up those files. So even if it won't be a good -practice to rely on other files being executed, we think that is better than -having randomly different results on interpreter startup. -We chose to run the scripts after the ``pth`` files in case a user needs to -add items to the path before running a script. +Users can use ``site.getsitepackages`` [#sitepackages-api]_ and +``site.getusersitepackages`` [#usersitepackages-api]_ to know the paths where +the interpreter can discover ``__sitecustomize__`` directories. -Note the execution happens after the handling of ``pth`` files for each of the -site paths and that the ``__sitecustomize__`` folder need to be in site paths -and not in just any importable path. +Time of ``__sitecustomize__`` discovery +--------------------------------------- -Impact on startup time ----------------------- +The ``__sitecustomize__`` directories will be discovered immediately after ``pth`` +files are discovered in a site-packages path as part of ``site.addsitedir`` +[#siteaddsitedir]_. + +This is repeated for each of the site-packages paths in the exact same order +that is being followed today for ``pth`` files. 
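As a rough illustration of the discovery step described above, the following sketch lists the ``__sitecustomize__`` directories found under the site-packages paths reported by the ``site`` module. It is not the reference implementation: the helper name ``find_sitecustomize_dirs`` and its ``site_paths`` parameter are hypothetical and exist only for this example.

```python
import os
import site


def find_sitecustomize_dirs(site_paths=None):
    """Return the ``__sitecustomize__`` directories that exist in *site_paths*.

    Hypothetical helper: when *site_paths* is None, the system and user
    site-packages paths reported by the ``site`` module are used.
    """
    if site_paths is None:
        site_paths = list(site.getsitepackages())
        site_paths.append(site.getusersitepackages())
    found = []
    for path in site_paths:
        candidate = os.path.join(path, "__sitecustomize__")
        # Only directories count; a plain file with that name is ignored.
        if os.path.isdir(candidate):
            found.append(candidate)
    return found


if __name__ == "__main__":
    for directory in find_sitecustomize_dirs():
        print(directory)
```

Passing an explicit list of paths keeps the sketch easy to try against a temporary directory without touching a real installation.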
+ +Order of execution within ``__sitecustomize__`` +----------------------------------------------- -If an interpreter is not using this mechanism, the impact on performance is -expected to be minimal as this PEP just adds a check for -``__sitecustomize__`` when ``site.py`` is walking the site paths looking for -``pth`` files. This impact will be reduced in the future as we will remove -two other imports: "sitecustomize.py" and "usercustomize.py". +The implementation will execute the files within ``__sitecustomize__`` +sorted by name when discovering each of the ``__sitecustomize__`` +directories. We discourage users from relying on the order of execution, though. -If the user has custom scripts, we think that the impact on the performance -of walking each of the folders is acceptable, as the user wants to use this -feature. If they need to run a time-sensitive application, they can always -use ``-S`` to disable this entirely. +We considered executing them in random order, but that could produce +different results depending on how the interpreter chooses to pick up those +files. So even though relying on other files being executed is not good +practice, we think a deterministic order is better than having randomly +different results on interpreter startup. We chose to run the files after the +``pth`` files in case a user needs to add items to the path before running a file. -Running "./python -c pass" with perf on 50 iterations, repeating 50 times the -command on each and getting the geometric mean on a commodity laptop did not -reveal any substantial raise on CPU time. +Interaction with ``pth`` files +------------------------------ + +``pth`` files can be used to add paths into ``sys.path``, but this should not +affect the ``__sitecustomize__`` discovery process, as those directories are +looked up exclusively in site-packages paths. 
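The name-sorted ordering can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function name ``run_sitecustomize_dir`` is hypothetical, and the error message format is invented for the example. It reads each ``.py`` file with ``io.open_code``, runs it with ``exec`` using a fresh globals dictionary, and keeps going when one file fails, as the PEP text specifies.

```python
import io
import os
import sys


def run_sitecustomize_dir(directory):
    """Execute every ``.py`` file in *directory* in file-name sorted order.

    Hypothetical helper; returns the names of the files that ran cleanly.
    """
    directory = os.path.abspath(directory)  # io.open_code expects an absolute path
    executed = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".py"):
            continue
        path = os.path.join(directory, name)
        try:
            with io.open_code(path) as f:
                code = compile(f.read(), path, "exec")
            # A new, empty globals dict per file avoids cross-file interactions.
            exec(code, {})
        except Exception as exc:
            # Assumed message format: a failure must not stop the other files.
            print(f"Error executing {path}: {exc!r}", file=sys.stderr)
            continue
        executed.append(name)
    return executed
```

Because each file gets its own globals, a name defined in ``a.py`` is not visible in ``b.py``, which is one more reason not to depend on execution order.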
+ +Execution of files within ``__sitecustomize__`` +----------------------------------------------- + +When a ``__sitecustomize__`` directory is discovered, all of the files that +have a ``.py`` extension within it will be read with ``io.open_code`` and +executed by using ``exec`` [#exec]_. + +An empty dictionary will be passed as ``globals`` to the ``exec`` function +to prevent unexpected interactions between different files. Failure handling ---------------- -Any error on any of the scripts will not be logged unless the interpreter is -run in verbose mode and it should not stop the evaluation of other scripts. -The user will just receive a message in stderr saying that the script failed to -be executed and that verbose mode can be used to get more information. This -behaviour follows the one already existing for ``sitecustomize.py``. +The details of any error raised while executing a file will not be logged +unless the interpreter is run in verbose mode, and the error should not stop +the evaluation of other files. The user will receive a message in stderr saying +that the file failed to be executed and that verbose mode can be used to get more +information. This behaviour mimics the one existing for ``sitecustomize.py``. -Scripts naming convention -------------------------- +Interaction with virtual environments +------------------------------------- -Packages will be encouraged to include the name of the package within the name -of the script to avoid collisions between packages. The only requirement on the -filename is that it ends in ``.py`` for the interpreter to execute them. +The customizations applied to an interpreter via the new +``__sitecustomize__`` mechanism will continue to work when a user creates a +virtual environment, the same way that ``sitecustomize.py`` +interacts with virtual environments. 
-Relationship with sitecustomize and usercustomize ------------------------------------------------- +This differs from ``pth`` files, which are not propagated +into virtual environments unless ``include-system-site-packages`` is enabled. -The existing logic for ``sitecustomize.py`` and ``usercustomize.py`` will be left -as is, later deprecated and scheduled for removal. Once ``__sitecustomize__`` is -supported, it will provide better integration for all existing users, and even -if it will indeed require a migration for system administrators, we expect the -effort required to be minimal, it will just require moving and renaming the -current ``sitecustomize.py`` into the new provided folder. +If library maintainers have features installed via ``__sitecustomize__`` that +they do not want to propagate into virtual environments, they should detect +if they are running within a virtual environment by checking whether +``sys.prefix`` differs from ``sys.base_prefix``. This behavior is similar to +packages that modify the global ``sitecustomize.py``. -Identifying all installed scripts ---------------------------------- +Interaction with ``sitecustomize.py`` and ``usercustomize.py`` +-------------------------------------------------------------- + +Until removed, ``sitecustomize`` and ``usercustomize`` will be executed after +``__sitecustomize__``, similar to ``pth`` files. See the Backward compatibility +section for information on removal plans for ``sitecustomize`` and +``usercustomize``. + +Identifying all installed files +------------------------------- + +To facilitate debugging of the Python startup, if the ``site`` module is invoked +it will print the ``__sitecustomize__`` directories that will be discovered +on startup. + +Files naming convention +----------------------- -To facilitate debugging of the Python startup, a new option will be added to -the main of the site module to list all scripts that will be executed as part -of the ``__sitecustomize__`` initialization. 
+Packages will be encouraged to include the name of the package within the +name of the file to avoid collisions between packages. However, the only +requirement on the filename is that it ends in ``.py`` for the interpreter to +execute it. + +Disabling start files +--------------------- + +In some scenarios, like when the startup time is key, it might be desired to +disable this option altogether. The already existing flag ``-S`` [#s-flag]_ +will disable all ``site``-related manipulation, including this new feature. +If the flag is passed in, ``__sitecustomize__`` directories will not be +discovered. + +Additionally, to allow starting the interpreter with only this new feature +disabled, a new option will be added under ``-X``: ``disablesitecustomize``, +which will disable the discovery of ``__sitecustomize__`` exclusively. + +Lastly, the user can disable the discovery of ``__sitecustomize__`` +directories only in the user site by disabling the user site via any of the +multiple options in the ``site.py`` module. + +Support in build backends +------------------------- + +Whilst build backends can choose to provide an option to facilitate the +installation of these files into a ``__sitecustomize__`` directory, this +PEP does not address that directly. Similar to ``pth`` files, build backends +can choose to not provide an easy-to-configure mechanism for +``__sitecustomize__`` files and let users hook into the installation +process to include such files. We do not consider enhanced build backend +support a requirement for this PEP. + +Impact on startup time +---------------------- + +A concern in this implementation is how Python interpreter startup time can +be affected by this addition. We expect the performance impact to be highly +coupled to the logic in the files that a user or sysadmin installs in the +Python environment being tested. 
+ +If the interpreter has any files in its ``__sitecustomize__`` directory, +the file execution time plus a call reading the code will be added to the +startup time. This is similar to how code execution impacts startup time +through ``sitecustomize.py``, ``usercustomize.py`` and code in ``pth`` files. +We will therefore focus here on comparing this solution against those three, +as otherwise the actual time added to startup is highly dependent on the code +that is being executed in those files. + +Results were gathered by running "./python.exe -c pass" with perf on 50 +iterations, repeating the command 50 times on each iteration and getting the +geometric mean of all the results. The file used to run those benchmarks is +checked in to the reference implementation [#reference-implementation]_. + +The benchmark was run with 3.10 alpha 7 compiled with PGO and LTO with the +following parameters and system state: + +- Perf event: Max sample rate set to 1 per second +- CPU Frequency: Minimum frequency of CPU 17,35 set to the maximum frequency +- Turbo Boost (MSR): Turbo Boost disabled on CPU 17: MSR 0x1a0 set to 0x4000850089 +- IRQ affinity: Set default affinity to CPU 0-16,18-34 +- IRQ affinity: Set affinity of IRQ 1,3-16,21,25-31,56-59,68-85,87,89-90,92-93,95-104 to CPU 0-16,18-34 +- CPU: use 2 logical CPUs: 17,35 +- Perf event: Maximum sample rate: 1 per second +- ASLR: Full randomization +- Linux scheduler: Isolated CPUs (2/36): 17,35 +- Linux scheduler: RCU disabled on CPUs (2/36): 17,35 +- CPU Frequency: 0-16,18-34=min=1200 MHz, max=3600 MHz; 17,35=min=max=3600 MHz +- Turbo Boost (MSR): CPU 17,35: disabled + +The code placed to be executed in ``pth`` files, ``sitecustomize.py``, +``usercustomize.py`` and files within ``__sitecustomize__`` is the following:: + + import time; x = time.time() ** 5 + +The file aims to execute a simple operation whose cost is still expected to be +negligible. 
This puts the experiment in a situation where any performance hit +due to the mechanism is visible, whilst still keeping it +relatively realistic. Additionally, it starts with an import and is a single +line so that it can be used in ``pth`` files. +
+====  ====================  ====================  =======  =====================  ======  =====
+Test  # of files                                                                  Time (us)
+----  --------------------------------------------------------------------------  -------------
+#     ``sitecustomize.py``  ``usercustomize.py``  ``pth``  ``__sitecustomize__``  Run 1   Run 2
+====  ====================  ====================  =======  =====================  ======  =====
+1     0                     0                     0        Dir not created        13884   13897
+2     0                     0                     0        0                      13871   13818
+3     0                     0                     1        0                      13964   13924
+4     0                     0                     0        1                      13940   13939
+5     1                     1                     0        0                      13990   13993
+6     0                     0                     0        2 (system + user)      14063   14040
+7     0                     0                     50       0                      16011   16014
+8     0                     0                     0        50                     15456   15448
+====  ====================  ====================  =======  =====================  ======  =====
+ +Results can be reproduced with the ``run-benchmark.py`` script provided in the +reference implementation [#reference-implementation]_. + +We interpret the following from these results: + +- Using two ``__sitecustomize__`` files compared to ``sitecustomize.py`` + and ``usercustomize.py`` slows down the interpreter by 0.3%. We expect this + slowdown until ``sitecustomize.py`` and ``usercustomize.py`` are removed in + a future release as even if the user does not create the files, the + interpreter will still attempt to import them. +- With the arbitrary 50 pth files with code tested, moving those to + ``__sitecustomize__`` produces a speedup of ~3.5% in startup, which is likely + related to the simpler logic for evaluating ``__sitecustomize__`` files compared + to ``pth`` file execution. +- In general all measurements show that there is a low impact on startup time + with this addition. 
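The percentages in the interpretation above can be re-derived from the table. The sketch below only redoes that arithmetic on the published timings; the helper name ``percent_change`` and the variable names are hypothetical, and the choice of which test rows to compare is an assumption based on the bullet points.

```python
def percent_change(baseline_us, candidate_us):
    """Relative change of *candidate_us* versus *baseline_us*, in percent."""
    return (candidate_us - baseline_us) / baseline_us * 100


# (Run 1, Run 2) timings in microseconds, copied from the benchmark table.
legacy_customize = (13990, 13993)      # test 5: sitecustomize.py + usercustomize.py
two_new_dirs = (14063, 14040)          # test 6: system + user __sitecustomize__
fifty_pth_files = (16011, 16014)       # test 7: 50 pth files with code
fifty_new_files = (15456, 15448)       # test 8: 50 __sitecustomize__ files

for run in (0, 1):
    slowdown = percent_change(legacy_customize[run], two_new_dirs[run])
    speedup = -percent_change(fifty_pth_files[run], fifty_new_files[run])
    print(f"Run {run + 1}: slowdown {slowdown:.2f}%, speedup {speedup:.2f}%")
```

On these figures the slowdown comes out at roughly 0.5% for Run 1 and 0.3% for Run 2, and the speedup at roughly 3.5% for both runs, so the quoted 0.3% corresponds most closely to Run 2.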
+ +Audit Event +----------- + +A new audit event will be added and triggered on ``__sitecustomize__`` +execution to facilitate security inspection by calling ``sys.audit`` +[#sysaudit]_ with "sitecustomize.exec_file" as name and the filename as +argument. + + +Security implications +--------------------- + +This PEP aims to move all code execution from ``pth`` files to files within a +``__sitecustomize__`` directory. We think this is an improvement for system admins +for the following reasons: + +* It allows quickly identifying the code being executed at startup time by the + interpreter by looking into a single directory rather than having to scan + all ``pth`` files. + +* It allows tracking usage of this feature through the newly proposed audit event. + +* It gives finer-grained control by allowing permissions to be tuned on the + ``__sitecustomize__`` directory, potentially allowing users to install only + packages that do not change the interpreter startup. + +In short, whilst this allows a malicious user to drop a file that will +be executed at startup, it's an improvement compared to the existing ``pth`` +files. How to teach this ================= This can be documented and taught as simple as saying that the interpreter -will try to look for the ``__sitecustomize__`` folder at startup in its site -paths and if it finds any scripts with ``.py`` extension, it will then +will try to look for the ``__sitecustomize__`` directory at startup in its +site paths and if it finds any files with ``.py`` extension, it will then execute it one by one. For system administrators and tools that package the interpreter, we can now @@ -207,28 +349,28 @@ handle the logic they want to customize. Library developers should be able to specify a new argument on tools like setuptools that will inject those new files. Something like -``sitecustomize_scripts=["scripts/betterexceptions.py"]``, which allows them to +``sitecustomize_files=["scripts/betterexceptions.py"]``, which allows them to add those. 
Should the build backend not support that, they can manually install them as they used to do with ``pth`` files. We will recommend them to -include the name of the package as part of the script's name. +include the name of the package as part of the file's name. Backward compatibility ====================== -We propose to add support for ``__sitecustomize__`` in the next release of -Python, add a warning on the three next releases on the deprecation and -future removal of ``sitecustomize.py``, ``usercustomize.py`` and code execution -in ``pth`` files, and remove it after maintainers have had 4 releases to -migrate. Ignoring those lines in pth files. +This PEP adds a deprecation warning on ``sitecustomize.py``, +``usercustomize.py`` and ``pth`` code execution in 3.11, 3.12 and 3.13, with +plans to remove those features by 3.14. The migration from those solutions +to ``__sitecustomize__`` should ideally just require moving the logic into a +different file. -Whilst the existing ``sitecutzomize.py`` mechanism was created targeting +Whilst the existing ``sitecustomize.py`` mechanism was created targeting System Administrators that placed it in a site path, the file could be actually placed anywhere in the path at the time that the interpreter was starting up. The new mechanism does not allow for users to place -``__sitecustomize__`` folders anywhere in the path, but only in site paths. -System administrators can recover a similar behavior to ``sitecustomize.py`` -if they need it by adding a custom script in ``__sitecustomize__`` which just -imports ``sitecustomize`` as a migration path. +``__sitecustomize__`` directories anywhere in the path, but only in site +paths. System administrators can recover a similar behavior to +``sitecustomize.py`` by adding a custom file in ``__sitecustomize__`` which +just imports ``sitecustomize`` as a migration path. 
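The migration path mentioned above could look roughly like the following file. This is a sketch under stated assumptions: the file name ``legacy_sitecustomize.py`` and the helper ``load_legacy_sitecustomize`` are hypothetical, and the PEP does not prescribe any particular content for such a shim.

```python
# Hypothetical file: <site-packages>/__sitecustomize__/legacy_sitecustomize.py
import importlib


def load_legacy_sitecustomize():
    """Import ``sitecustomize`` from anywhere on ``sys.path``, if it exists."""
    try:
        return importlib.import_module("sitecustomize")
    except ModuleNotFoundError as exc:
        # Stay silent only when sitecustomize itself is missing; re-raise if
        # the failure comes from a module that sitecustomize tried to import.
        if exc.name != "sitecustomize":
            raise
        return None


load_legacy_sitecustomize()
```

Since the shim performs a regular import, it picks up a ``sitecustomize`` module from any importable path, which is the behaviour the new mechanism otherwise restricts to site paths.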
Reference Implementation ======================== @@ -246,8 +388,8 @@ Do nothing ---------- Whilst the current status "works" it presents the issues listed in the -motivation. After analysing the impact of this change, we believe it is worth -given the enhanced experience it brings. +motivation. After analyzing the impact of this change, we believe it is worth +it, given the enhanced experience it brings. Formalize using ``pth`` files ----------------------------- @@ -259,12 +401,13 @@ as listed in the motivation. Making ``__sitecustomize__`` a namespace package ------------------------------------------------ -We considered making the folder a namespace package and just import all the -modules within it, which allowed searching across all paths in ``sys.path`` -at initialization time and provided a way to declare dependencies between -scripts by importing each other. This was rejected for multiple reasons: +We considered making the directory a namespace package and just import all +the modules within it, which allowed searching across all paths in +``sys.path`` at initialization time and provided a way to declare +dependencies between files by importing each other. This was rejected for +multiple reasons: -1. This was unnecessarily broadening the list of paths where arbitrary scripts +1. This was unnecessarily broadening the list of paths where arbitrary files are executed. 2. The logic brought additional complexity, like what to do if a package were to install an ``__init__.py`` file in one of the locations. @@ -272,8 +415,8 @@ scripts by importing each other. This was rejected for multiple reasons: ``pth`` files already in the site paths compared to performing an actual import of a namespace package. 
-Support for shutdown custom scripts ------------------------------------ +Support for shutdown customization +---------------------------------- ``init.d`` users might be tempted to implement this feature in a way that users could also add code at shutdown, but extra support for that is not needed, as @@ -282,7 +425,7 @@ Python users can already do that via ``atexit``. Using entry_points ------------------ -We considered extending the use of entry points to allow specifying scripts +We considered extending the use of entry points to allow specifying files that should be executed at startup but we discarded that solution due to two main reasons. The first one being impact on startup time. This approach will require scanning all packages distribution information to just execute a @@ -291,9 +434,9 @@ using the feature and such impact growths linearly with the number of packages installed in the environment. The second reason was that the proposed implementation in this PEP offers a single solution for startup customization for packages and system administrators. Additionally, if the main objective of -entry points is to make it easy for libraries to install scripts at startup, +entry points is to make it easy for libraries to install files at startup, that can still be added and make the build backends just install the files -within the ``__sitecustomize__`` folder. +within the ``__sitecustomize__`` directory. Copyright ========= @@ -301,6 +444,12 @@ Copyright This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. +Acknowledgements +================ + +Thanks Pablo Galindo for contributing to this PEP and offering his PC to run +the benchmark. + References ========== @@ -324,3 +473,18 @@ References .. [#site] https://docs.python.org/3/library/site.html + +.. [#sitepackages-api] + https://docs.python.org/3/library/site.html?highlight=site#site.getsitepackages + +.. 
[#usersitepackages-api] + https://docs.python.org/3/library/site.html?highlight=site#site.getusersitepackages + +.. [#siteaddsitedir] + https://github.com/python/cpython/blob/5787ba4a45492e232f5470c7d2e93763198e4b22/Lib/site.py#L207 + +.. [#exec] + https://docs.python.org/3/library/functions.html#exec + +.. [#sysaudit] + https://docs.python.org/3/library/sys.html#sys.audit