PEP 658: Static Distribution Metadata in the Simple Repository API #1955
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,178 @@ | ||
PEP: 658 | ||
Title: Static Distribution Metadata in the Simple Repository API | ||
Author: Tzu-ping Chung <[email protected]> | ||
Sponsor: | ||
PEP-Delegate: | ||
Discussions-To: https://discuss.python.org/t/8651 | ||
Status: Draft | ||
Type: Standards Track | ||
Content-Type: text/x-rst | ||
Created: 10-May-2021 | ||
Post-History: 10-May-2021 | ||
Resolution: | ||
|
||
|
||
Abstract | ||
======== | ||
|
||
This PEP proposes adding an anchor tag to expose the ``METADATA`` file | ||
from distributions in the :pep:`503` "simple" repository API. A | ||
``data-dist-info-metadata`` attribute is introduced to indicate where | ||
the file from a given distribution can be independently fetched. | ||
|
||
|
||
Motivation | ||
========== | ||
|
||
Package management workflows made popular by recent tooling increase | ||
the need to inspect distribution metadata without intending to install | ||
the distribution, and download multiple distributions of a project to | ||
choose from based on their metadata. As a result, tools end up | ||
discarding much of the data they download, which is inefficient and | ||
results in a bad user experience. | ||
|
||
|
||
Rationale | ||
========= | ||
|
||
Tools have been exploring methods to reduce the download size by | ||
partially downloading wheels with HTTP range requests. This, however, | ||
imposes additional run-time requirements on the repository server. It | ||
also still incurs extra overhead, since a separate request is | ||
needed to fetch the wheel's file listing to find the correct offset to | ||
fetch the metadata file. It is therefore desired to make the server | ||
extract the metadata file in advance, and serve it as an independent | ||
file to avoid the need to perform additional requests and ZIP | ||
inspection. | ||
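
As a rough illustration of the server-side extraction described above, the following sketch pulls ``METADATA`` out of a wheel using only the standard library. The function names and the tiny in-memory "wheel" are hypothetical and exist only for this example; the PEP does not prescribe how a repository implements the extraction.

```python
import io
import zipfile


def extract_metadata(wheel_bytes: bytes) -> bytes:
    """Return the raw METADATA file from a wheel's .dist-info directory."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Only a top-level "<name>-<version>.dist-info/METADATA" qualifies.
            if (
                len(parts) == 2
                and parts[0].endswith(".dist-info")
                and parts[1] == "METADATA"
            ):
                return archive.read(name)
    raise ValueError("wheel contains no .dist-info/METADATA file")


def build_demo_wheel() -> bytes:
    """Build a tiny in-memory archive shaped like a wheel (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/__init__.py", "")
        archive.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
        )
    return buffer.getvalue()


metadata = extract_metadata(build_demo_wheel())
print(metadata.decode("utf-8"))
```

A repository could run a step like this once at upload time and store the result as a static file, so that clients need neither range requests nor ZIP inspection.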
|
||
The metadata file defined by the Core Metadata Specification | ||
[core-metadata]_ will be served directly by repositories since it | ||
contains the necessary information for common use cases. The metadata | ||
served must be completely static, i.e. identical to the ``METADATA`` | ||
file in the ``.dist-info`` directory [dist-info]_ if the distribution | ||
is installed. A repository can provide this for any distribution, | ||
but repositories are expected to provide it only for wheels [wheel]_, | ||
since an sdist [sdist]_ does not currently have a way to promise the | ||
metadata will stay the same after it is built. | ||
|
||
Since not all distributions have static metadata, an HTML attribute | ||
on the distribution file's anchor link is needed to indicate whether a | ||
client is able to choose the separately served metadata file instead. | ||
The attribute can also be used to denote whether the metadata file can be | ||
downloaded. If the attribute is missing from an anchor link, static | ||
metadata is not available for the distribution, either because of the | ||
distribution's content, or lack of repository support. | ||
|
||
|
||
Specification | ||
============= | ||
|
||
In a simple repository's project page, each anchor tag pointing to a | ||
distribution **MAY** have a ``data-dist-info-metadata`` attribute. The | ||
presence of the attribute indicates the distribution represented by | ||
the anchor tag **MUST** contain a Core Metadata file that will not be | ||
modified when the distribution is processed and/or installed. | ||
|
||
If a ``data-dist-info-metadata`` attribute is present, its value | ||
**MUST** be a URL to the distribution's Core Metadata file. If the URL | ||
is relative, its base URL **MUST** be the current project page, as is | ||
the behaviour of an anchor tag's ``href`` attribute. | ||
|
||
There are no restrictions on where the Core Metadata file should be | ||
hosted relative to the distribution file or project page, as long as | ||
it can be reached when accessed. | ||
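
To make the rules above concrete, here is a minimal client-side sketch that reads a project page, collects each anchor's ``href`` and ``data-dist-info-metadata`` values, and resolves relative URLs against the project page URL. The page URL, file names, and the ``.whl.metadata`` naming are assumptions invented for this example; the specification only requires that the attribute value be a URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ProjectPageParser(HTMLParser):
    """Collect (distribution URL, metadata URL or None) for each anchor."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        metadata = attributes.get("data-dist-info-metadata")
        self.files.append((
            # Relative URLs are resolved against the project page, like href.
            urljoin(self.page_url, href),
            urljoin(self.page_url, metadata) if metadata else None,
        ))


# Hypothetical project page, for illustration only.
PAGE_URL = "https://example.com/simple/demo/"
PAGE_HTML = """
<a href="../../files/demo-1.0-py3-none-any.whl"
   data-dist-info-metadata="../../files/demo-1.0-py3-none-any.whl.metadata">demo-1.0-py3-none-any.whl</a>
<a href="../../files/demo-1.0.tar.gz">demo-1.0.tar.gz</a>
"""

parser = ProjectPageParser(PAGE_URL)
parser.feed(PAGE_HTML)
for distribution_url, metadata_url in parser.files:
    print(distribution_url, metadata_url)
```

In this sketch the wheel's anchor yields a metadata URL, while the sdist's anchor, which lacks the attribute, yields ``None``.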
|
||
|
||
Backwards Compatibility | ||
======================= | ||
|
||
If an anchor tag lacks the ``data-dist-info-metadata`` attribute, | ||
tools are expected to revert to their current behaviour of downloading | ||
the distribution to inspect the metadata. | ||
|
||
Older tools not supporting the new ``data-dist-info-metadata`` | ||
attribute are expected to ignore the attribute and maintain their | ||
current behaviour of downloading the distribution to inspect the | ||
metadata. This is similar to how prior ``data-`` attribute additions | ||
expect existing tools to operate. | ||
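
The fallback behaviour can be sketched as a single decision: use the separately served metadata file when the attribute is present, otherwise download the distribution as before. The function name, return labels, and URLs below are hypothetical and only serve this example.

```python
from urllib.parse import urljoin


def plan_metadata_fetch(page_url, anchor_attrs):
    """Decide which URL a client would fetch to learn a distribution's metadata."""
    distribution_url = urljoin(page_url, anchor_attrs["href"])
    metadata = anchor_attrs.get("data-dist-info-metadata")
    if metadata:
        # Attribute present: fetch only the static metadata file.
        return ("metadata-file", urljoin(page_url, metadata))
    # Attribute absent: fall back to downloading the whole distribution.
    return ("full-distribution", distribution_url)


page = "https://example.com/simple/demo/"
with_attribute = {
    "href": "demo-1.0-py3-none-any.whl",
    "data-dist-info-metadata": "demo-1.0-py3-none-any.whl.metadata",
}
without_attribute = {"href": "demo-1.0.tar.gz"}

print(plan_metadata_fetch(page, with_attribute))
print(plan_metadata_fetch(page, without_attribute))
```

An older tool that ignores the attribute effectively always takes the second branch, which is why the addition does not break existing clients.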
|
||
|
||
Rejected Ideas | ||
============== | ||
|
||
Put metadata content on the project page | ||
---------------------------------------- | ||
|
||
Since tools generally only need dependency information from a | ||
distribution in addition to what's already available on the project | ||
page, it was proposed that repositories may directly include the | ||
information on the project page, like the ``data-requires-python`` | ||
attribute specified in :pep:`503`. | ||
|
||
This approach was abandoned since a distribution may contain | ||
arbitrarily long lists of dependencies (including required and | ||
optional), and it is unclear whether including the information for | ||
every distribution in a project would result in net savings since the | ||
information for most distributions generally ends up unneeded. By | ||
serving the metadata separately, performance can be better estimated | ||
since data usage will be more proportional to the number of | ||
distributions inspected. | ||
|
||
|
||
Expose more files in the distribution | ||
------------------------------------- | ||
|
||
It was proposed to provide the entire ``.dist-info`` directory as a | ||
separate part, instead of only the metadata file. However, serving | ||
multiple files in one entity through HTTP requires re-archiving them | ||
separately after they are extracted from the original distribution | ||
by the repository server, and there are no current use cases for files | ||
other than ``METADATA`` when the distribution itself is not going to | ||
be installed. | ||
|
||
It should also be noted that the approach taken here does not | ||
preclude other files from being introduced in the future, whether we | ||
want to serve them together or individually. | ||
|
||
|
||
Require the metadata file to live alongside the distribution file | ||
----------------------------------------------------------------- | ||
|
||
It was proposed that the location to fetch metadata can be inferred | ||
implicitly instead, similarly to how :pep:`503` designates the GPG | ||
signature's location. However, since an attribute is required either | ||
way to indicate whether a distribution has static metadata, the author | ||
feels it is simpler to explicitly encode the location information in | ||
the attribute instead. This also makes future extension easier if we | ||
decide to expose more files in the distribution; instead of coming up | ||
with a location inference rule for each file added, we will only need | ||
to add an additional attribute. | ||
|
||
|
||
References | ||
========== | ||
|
||
.. [core-metadata] https://packaging.python.org/specifications/core-metadata/ | ||
|
||
.. [dist-info] https://packaging.python.org/specifications/recording-installed-packages/ | ||
|
||
.. [wheel] https://packaging.python.org/specifications/binary-distribution-format/ | ||
|
||
.. [sdist] https://packaging.python.org/specifications/source-distribution-format/ | ||
|
||
|
||
Copyright | ||
========= | ||
|
||
This document is placed in the public domain or under the | ||
CC0-1.0-Universal license, whichever is more permissive. | ||
|
||
|
||
.. | ||
Local Variables: | ||
mode: indented-text | ||
indent-tabs-mode: nil | ||
sentence-end-double-space: t | ||
fill-column: 70 | ||
coding: utf-8 | ||
End: |
Put @pfmoore here, and in code owners?
I’m not sure whether Paul or Donald is better to put in which fields TBH
As this is package index stuff, @dstufft is PEP-delegate by default. He can get someone else to handle it if he prefers, and I'd be willing to do it if he wants to pass the job on.
I'd rather not be sponsor, TBH. With the new CODEOWNERS workflow, it looks like that means I'd now be responsible for merging any PRs to the PEP and I don't really have the time to handle that right now...
I'm happy to remain as chief button-pusher to merge uncontroversial PRs :). I think the main point of the CODEOWNERS change is to make it easier for sponsors to stay up to date on changes to their PEPs, but they don't necessarily have to take care of all changes.
Donald is gonna be the delegate IIUC, based on the SC's standing delegation for Package Index stuff.
Sponsor is... well, whoever ends up sponsoring this.
Thank you