diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index dc8ab1a95d5..b0c1d740ab8 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -512,6 +512,7 @@ pep-0654.rst @1st1 @gvanrossum
 pep-0655.rst @gvanrossum
 pep-0656.rst @brettcannon
 pep-0657.rst @pablogsal @isidentical @ammaraskar
+pep-0658.rst @brettcannon
 # ...
 # pep-0666.txt
 # ...
diff --git a/pep-0658.rst b/pep-0658.rst
new file mode 100644
index 00000000000..9375e8f1ffc
--- /dev/null
+++ b/pep-0658.rst
@@ -0,0 +1,181 @@
+PEP: 658
+Title: Static Distribution Metadata in the Simple Repository API
+Author: Tzu-ping Chung
+Sponsor: Brett Cannon
+PEP-Delegate: Donald Stufft
+Discussions-To: https://discuss.python.org/t/8651
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 10-May-2021
+Post-History: 10-May-2021
+Resolution:
+
+
+Abstract
+========
+
+This PEP proposes adding an anchor tag to expose the ``METADATA`` file
+from distributions in the :pep:`503` "simple" repository API. A
+``data-dist-info-metadata`` attribute is introduced to indicate where
+the file from a given distribution can be independently fetched.
+
+
+Motivation
+==========
+
+Package management workflows made popular by recent tooling increase
+the need to inspect distribution metadata without intending to install
+the distribution, and download multiple distributions of a project to
+choose from based on their metadata. This means they end up discarding
+much downloaded data, which is inefficient and results in a bad user
+experience.
+
+
+Rationale
+=========
+
+Tools have been exploring methods to reduce the download size by
+partially downloading wheels with HTTP range requests. This, however,
+imposes additional run-time requirements on the repository server. It
+also still incurs overhead, since a separate request is
+needed to fetch the wheel's file listing to find the correct offset to
+fetch the metadata file. It is therefore desirable to have the server
+extract the metadata file in advance, and serve it as an independent
+file to avoid the need to perform additional requests and ZIP
+inspection.
+
+The metadata file defined by the Core Metadata Specification
+[core-metadata]_ will be served directly by repositories since it
+contains the necessary information for common use cases. The metadata
+served must be completely static, i.e. identical to the ``METADATA``
+file in the ``.dist-info`` directory [dist-info]_ if the distribution
+is installed. The repository can provide this for any distribution,
+but it is expected they will only provide them for wheels [wheel]_
+at the current time, since an sdist [sdist]_ does not yet have a way
+to promise the metadata will stay the same after it is built.
+
+Since not all distributions have static metadata, an HTML attribute
+on the distribution file's anchor link is needed to indicate whether a
+client is able to choose the separately served metadata file instead.
+The attribute is also used to provide the metadata file's hash, so
+clients can verify the file after download. If the attribute is
+missing from an anchor link, static metadata is not available for the
+distribution, either because of the distribution's content, or lack of
+repository support.
+
+
+Specification
+=============
+
+In a simple repository's project page, each anchor tag pointing to a
+distribution **MAY** have a ``data-dist-info-metadata`` attribute. The
+presence of the attribute indicates the distribution represented by
+the anchor tag **MUST** contain a Core Metadata file that will not be
+modified when the distribution is processed and/or installed.
+
+If a ``data-dist-info-metadata`` attribute is present, the repository
+**MUST** serve the distribution's Core Metadata file alongside the
+distribution with a ``.metadata`` appended to the distribution's file
+name. For example, the Core Metadata of a distribution served at
+``/files/distribution-1.0-py3.none.any.whl`` would be located at
+``/files/distribution-1.0-py3.none.any.whl.metadata``. This is similar
+to how :pep:`503` specifies the GPG signature file's location.
+
+The repository **SHOULD** provide the hash of the Core Metadata file
+as the ``data-dist-info-metadata`` attribute's value using the syntax
+``<hashname>=<hashvalue>``, where ``<hashname>`` is the lowercase
+name of the hash function used, and ``<hashvalue>`` is the hex-encoded
+digest. The repository **MAY** use ``true`` as the attribute's value
+if a hash is unavailable.
+
+
+Backwards Compatibility
+=======================
+
+If an anchor tag lacks the ``data-dist-info-metadata`` attribute,
+tools are expected to revert to their current behaviour of downloading
+the distribution to inspect the metadata.
+
+Older tools not supporting the new ``data-dist-info-metadata``
+attribute are expected to ignore the attribute and maintain their
+current behaviour of downloading the distribution to inspect the
+metadata. This is similar to how prior ``data-`` attribute additions
+expect existing tools to operate.
+
+
+Rejected Ideas
+==============
+
+Put metadata content on the project page
+----------------------------------------
+
+Since tools generally only need dependency information from a
+distribution in addition to what's already available on the project
+page, it was proposed that repositories may directly include the
+information on the project page, like the ``data-requires-python``
+attribute specified in :pep:`503`.
+
+This approach was abandoned since a distribution may contain
+arbitrarily long lists of dependencies (including required and
+optional), and it is unclear whether including the information for
+every distribution in a project would result in net savings since the
+information for most distributions generally ends up unneeded. By
+serving the metadata separately, performance can be better estimated
+since data usage will be more proportional to the number of
+distributions inspected.
+
+
+Expose more files in the distribution
+-------------------------------------
+
+It was proposed to provide the entire ``.dist-info`` directory as a
+separate part, instead of only the metadata file. However, serving
+multiple files in one entity through HTTP requires re-archiving them
+separately after they are extracted from the original distribution
+by the repository server, and there are no current use cases for files
+other than ``METADATA`` when the distribution itself is not going to
+be installed.
+
+It should also be noted that the approach taken here does not
+preclude other files from being introduced in the future, whether we
+want to serve them together or individually.
+
+
+Explicitly specify the metadata file's URL on the project page
+--------------------------------------------------------------
+
+An early version of this draft proposed putting the metadata file's
+URL in the ``data-dist-info-metadata`` attribute. But people feel it
+is better for discoverability to require the repository to serve the
+metadata file at a determined location instead. The current approach
+also has an additional benefit of making the project page smaller.
+
+
+References
+==========
+
+.. [core-metadata] https://packaging.python.org/specifications/core-metadata/
+
+.. [dist-info] https://packaging.python.org/specifications/recording-installed-packages/
+
+.. [wheel] https://packaging.python.org/specifications/binary-distribution-format/
+
+.. [sdist] https://packaging.python.org/specifications/source-distribution-format/
+
+
+Copyright
+=========
+
+This document is placed in the public domain or under the
+CC0-1.0-Universal license, whichever is more permissive.
+
+
+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End:
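
Illustration (not part of the patch): the client-side behaviour described in the Specification and Backwards Compatibility sections of the new PEP could look roughly like the sketch below. It is a minimal sketch under stated assumptions, not a reference implementation. The helper names (`AnchorCollector`, `collect_links`, `metadata_url`, `verify_metadata`) and the `example.org` URLs in the usage note are hypothetical, and the code assumes the attribute value is either `true` or of the form `hashname=hexdigest` with a hash name that `hashlib` recognises.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

ATTR = "data-dist-info-metadata"


class AnchorCollector(HTMLParser):
    """Collect (href, attribute value) pairs from a project page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        if ATTR in attr_map:
            # A valueless attribute is reported as None; this sketch
            # (an assumption) treats it like "true".
            self.links.append((href, attr_map[ATTR] or "true"))
        else:
            # No attribute: the client falls back to downloading the
            # distribution itself to inspect its metadata.
            self.links.append((href, None))


def collect_links(html):
    parser = AnchorCollector()
    parser.feed(html)
    return parser.links


def metadata_url(page_url, href):
    """Append ".metadata" to the distribution's file name."""
    parts = urlsplit(urljoin(page_url, href))
    # Drop any URL fragment so the suffix lands on the file name.
    return parts._replace(path=parts.path + ".metadata", fragment="").geturl()


def verify_metadata(content, attr_value):
    """Check content against "hashname=hexdigest"; True if no hash is given."""
    name, sep, digest = (attr_value or "").partition("=")
    if not sep:
        return True  # value is "true" (or absent): nothing to verify
    return hashlib.new(name, content).hexdigest() == digest.lower()
```

A caller would feed the project page HTML to `collect_links`, skip anchors whose second element is `None`, fetch `metadata_url(page_url, href)` for the rest, and pass the downloaded bytes to `verify_metadata` before trusting them. For an anchor with `href="/files/distribution-1.0-py3.none.any.whl"` on a page at `https://example.org/simple/distribution/`, `metadata_url` yields `https://example.org/files/distribution-1.0-py3.none.any.whl.metadata`.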