PEP: 621 Title: Storing project metadata in pyproject.toml Author: Brett Cannon <[email protected]>,
Dustin Ingram <[email protected]>, Paul Ganssle <paul at ganssle.io>, Pradyun Gedam <[email protected]>, Sébastien Eustace <[email protected]>, Thomas Kluyver <[email protected]>, Tzu-ping Chung <[email protected]>
Discussions-To: https://discuss.python.org/t/pep-621-round-3/5472 Status: Final Type: Standards Track Content-Type: text/x-rst Created: 22-Jun-2020 Post-History: 22-Jun-2020,
18-Oct-2020, 24-Oct-2020, 31-Oct-2020
Resolution: https://discuss.python.org/t/pep-621-round-3/5472/109
This PEP specifies how to write a project's core metadata in a
pyproject.toml
file for packaging-related tools to consume.
The key motivators of this PEP are:
- Encourage users to specify core metadata statically for speed, ease of specification, unambiguity, and deterministic consumption by build back-ends
- Provide a tool-agnostic way of specifying metadata for ease of learning and transitioning between build back-ends
- Allow for more code sharing between build back-ends for the "boring parts" of a project's metadata
To speak specifically to the motivation for static metadata, that has been an overall goal of the packaging ecosystem for some time. As such, making it easy to specify metadata statically is important. This also means that raising the cost of specifying data as dynamic is acceptable as users should skew towards wanting to provide static metadata.
Requiring the distinction between static and dynamic metadata also helps with disambiguation for when metadata isn't specified. When any metadata may be dynamic, it means you never know if the absence of metadata is on purpose or because it is to be provided later. By requiring that dynamic metadata be specified, it disambiguates the intent when metadata goes unspecified.
This PEP does not attempt to standardize all possible metadata required by a build back-end, only the metadata covered by the core metadata specification which are very common across projects and would stand to benefit from being static and consistently specified. This means build back-ends are still free and able to innovate around patterns like how to specify the files to include in a wheel. There is also an included escape hatch for users and build back-ends to use when they choose to partially opt-out of this PEP (compared to opting-out of this PEP entirely, which is also possible).
This PEP is also not trying to change the underlying core metadata in any way. Such considerations should be done in a separate PEP which may lead to changes or additions to what this PEP specifies.
The design guidelines the authors of this PEP followed were:
- Define a representation of as much of the core metadata in
pyproject.toml
as is reasonable - Define the metadata statically with an escape hatch for those who want to define it dynamically later via a build back-end
- Use familiar names where it makes sense, but be willing to use more modern terminology
- Try to be ergonomic within a TOML file instead of mirroring how build back-ends specify metadata at a low-level when it makes sense
- Learn from other build back-ends in the packaging ecosystem which have used TOML for their metadata
- Don't try to standardize things which lack a pre-existing standard at a lower-level
- When metadata is specified using this PEP, it is considered canonical
When specifying project metadata, tools MUST adhere and honour the metadata as specified in this PEP. If metadata is improperly specified then tools MUST raise an error to notify the user about their mistake.
Data specified using this PEP is considered canonical. Tools CANNOT
remove, add or change data that has been statically specified. Only
when a field is marked as dynamic
may a tool provide a "new" value.
Tools MUST specify fields defined by this PEP in a table named
[project]
. No tools may add fields to this table which are not
defined by this PEP or subsequent PEPs. For tools wishing to store
their own settings in pyproject.toml
, they may use the [tool]
table as defined in PEP 518. The lack of a [project]
table
implicitly means the build back-end will dynamically provide all
fields.
- Format: string
- Core metadata:
Name
(link) - Synonyms
The name of the project.
Tools MUST require users to statically define this field.
Tools SHOULD normalize this name, as specified by PEP 503, as soon as it is read for internal consistency.
- Format: string
- Core metadata:
Version
(link) - Synonyms
The version of the project as supported by PEP 440.
Users SHOULD prefer to specify already-normalized versions.
- Format: string
- Core metadata:
Summary
(link) - Synonyms
- Flit: N/A
- Poetry:
description
(link) - Setuptools:
description
(link)
The summary description of the project.
- Format: String or table
- Core metadata:
Description
(link) - Synonyms
The full description of the project (i.e. the README).
The field accepts either a string or a table. If it is a string then
it is the relative path to a text file containing the full
description. Tools MUST assume the file's encoding is UTF-8. If the
file path ends in a case-insensitive .md
suffix, then tools MUST
assume the content-type is text/markdown
. If the file path ends in
a case-insensitive .rst
, then tools MUST assume the content-type
is text/x-rst
. If a tool recognizes more extensions than this PEP,
they MAY infer the content-type for the user without specifying this
field as dynamic
. For all unrecognized suffixes when a
content-type is not provided, tools MUST raise an error.
The readme
field may also take a table. The file
key has a
string value representing a relative path to a file containing the
full description. The text
key has a string value which is the
full description. These keys are mutually-exclusive, thus tools MUST
raise an error if the metadata specifies both keys.
A table specified in the readme
field also has a content-type
field which takes a string specifying the content-type of the full
description. A tool MUST raise an error if the metadata does not
specify this field in the table. If the metadata does not specify the
charset
parameter, then it is assumed to be UTF-8. Tools MAY
support other encodings if they choose to. Tools MAY support
alternative content-types which they can transform to a content-type
as supported by the core metadata. Otherwise tools MUST raise an
error for unsupported content-types.
- Format: string
- Core metadata:
Requires-Python
(link) - Synonyms
The Python version requirements of the project.
- Format: Table
- Core metadata:
License
(link) - Synonyms
The table may have one of two keys. The file
key has a string
value that is a relative file path to the file which contains the
license for the project. Tools MUST assume the file's encoding is
UTF-8. The text
key has a string value which is the license of the
project whose meaning is that of the License
field from the
core metadata. These keys are mutually exclusive, so a tool MUST
raise an error if the metadata specifies both keys.
A practical string value for the license
key has been purposefully
left out to allow for a future PEP to specify support for SPDX
expressions (the same logic applies to any sort of "type" field
specifying what license the file
or text
represents).
- Format: Array of inline tables with string keys and values
- Core metadata:
Author
/Author-email
/Maintainer
/Maintainer-email
(link) - Synonyms
The people or organizations considered to be the "authors" of the project. The exact meaning is open to interpretation — it may list the original or primary authors, current maintainers, or owners of the package.
The "maintainers" field is similar to "authors" in that its exact meaning is open to interpretation.
These fields accept an array of tables with 2 keys: name
and
email
. Both values must be strings. The name
value MUST be a
valid email name (i.e. whatever can be put as a name, before an email,
in RFC 822) and not contain commas. The email
value MUST be a
valid email address. Both keys are optional.
Using the data to fill in core metadata is as follows:
- If only
name
is provided, the value goes inAuthor
/Maintainer
as appropriate. - If only
email
is provided, the value goes inAuthor-email
/Maintainer-email
as appropriate. - If both
email
andname
are provided, the value goes inAuthor-email
/Maintainer-email
as appropriate, with the format{name} <{email}>
(with appropriate quoting, e.g. usingemail.headerregistry.Address
). - Multiple values should be separated by commas.
- Format: array of strings
- Core metadata:
Keywords
(link) - Synonyms
The keywords for the project.
- Format: array of strings
- Core metadata:
Classifier
(link) - Synonyms
Trove classifiers which apply to the project.
- Format: Table, with keys and values of strings
- Core metadata:
Project-URL
(link) - Synonyms
A table of URLs where the key is the URL label and the value is the URL itself.
- Format: Table (
[project.scripts]
,[project.gui-scripts]
, and[project.entry-points]
) - Core metadata: N/A; Entry points specification
- Synonyms
There are three tables related to entry points. The
[project.scripts]
table corresponds to the console_scripts
group in the entry points specification. The key of the table is the name of the
entry point and the value is the object reference.
The [project.gui-scripts]
table corresponds to the gui_scripts
group in the entry points specification. Its format is the same as
[project.scripts]
.
The [project.entry-points]
table is a collection of tables. Each
sub-table's name is an entry point group. The key and value semantics
are the same as [project.scripts]
. Users MUST NOT create
nested sub-tables but instead keep the entry point groups to only one
level deep.
Build back-ends MUST raise an error if the metadata defines a
[project.entry-points.console_scripts]
or
[project.entry-points.gui_scripts]
table, as they would
be ambiguous in the face of [project.scripts]
and
[project.gui-scripts]
, respectively.
- Format: Array of PEP 508 strings (
dependencies
) and a table with values of arrays of PEP 508 strings (optional-dependencies
) - Core metadata:
Requires-Dist
andProvides-Extra
(link, link) - Synonyms
- Flit:
requires
for required dependencies,requires-extra
for optional dependencies (link) - Poetry:
[tool.poetry.dependencies]
for dependencies (both required and for development),[tool.poetry.extras]
for optional dependencies (link) - Setuptools:
install_requires
for required dependencies,extras_require
for optional dependencies (link)
- Flit:
The (optional) dependencies of the project.
For dependencies
, it is a key whose value is an array of strings.
Each string represents a dependency of the project and MUST be
formatted as a valid PEP 508 string. Each string maps directly to
a Requires-Dist
entry in the core metadata.
For optional-dependencies
, it is a table where each key specifies
an extra and whose value is an array of strings. The strings of the
arrays must be valid PEP 508 strings. The keys MUST be valid values
for the Provides-Extra
core metadata. Each value in the array
thus becomes a corresponding Requires-Dist
entry for the matching
Provides-Extra
metadata.
- Format: Array of strings
- Core metadata: N/A
- No synonyms
Specifies which fields listed by this PEP were intentionally unspecified so another tool can/will provide such metadata dynamically. This clearly delineates which metadata is purposefully unspecified and expected to stay unspecified compared to being provided via tooling later on.
- A build back-end MUST honour statically-specified metadata (which
means the metadata did not list the field in
dynamic
). - A build back-end MUST raise an error if the metadata specifies the
name
indynamic
. - If the core metadata specification lists a field as "Required",
then the metadata MUST specify the field statically or list it in
dynamic
(build back-ends MUST raise an error otherwise, i.e. it should not be possible for a required field to not be listed somehow in the[project]
table). - If the core metadata specification lists a field as "Optional",
the metadata MAY list it in
dynamic
if the expectation is a build back-end will provide the data for the field later. - Build back-ends MUST raise an error if the metadata specifies a
field statically as well as being listed in
dynamic
. - If the metadata does not list a field in
dynamic
, then a build back-end CANNOT fill in the requisite metadata on behalf of the user (i.e.dynamic
is the only way to allow a tool to fill in metadata and the user must opt into the filling in). - Build back-ends MUST raise an error if the metadata specifies a field in dynamic but the build back-end was unable to provide the data for it.
[project] name = "spam" version = "2020.0.0" description = "Lovely Spam! Wonderful Spam!" readme = "README.rst" requires-python = ">=3.8" license = {file = "LICENSE.txt"} keywords = ["egg", "bacon", "sausage", "tomatoes", "Lobster Thermidor"] authors = [ {email = "[email protected]"}, {name = "Tzu-Ping Chung"} ] maintainers = [ {name = "Brett Cannon", email = "[email protected]"} ] classifiers = [ "Development Status :: 4 - Beta", "Programming Language :: Python" ] dependencies = [ "httpx", "gidgethub[httpx]>4.0.0", "django>2.1; os_name != 'nt'", "django>2.0; os_name == 'nt'" ] [project.optional-dependencies] test = [ "pytest < 5.0.0", "pytest-cov[all]" ] [project.urls] homepage = "https://example.com" documentation = "https://readthedocs.org" repository = "https://github.com" changelog = "https://github.com/me/spam/blob/master/CHANGELOG.md" [project.scripts] spam-cli = "spam:main_cli" [project.gui-scripts] spam-gui = "spam:main_gui" [project.entry-points."spam.magical"] tomatoes = "spam:main_tomatoes"
As this provides a new way to specify a project's core metadata and is using a new table name which falls under the reserved namespace as outlined in PEP 518, there are no backwards-compatibility concerns.
There are no direct security concerns as this PEP covers how to statically define project metadata. Any security issues would stem from how tools consume the metadata and choose to act upon it.
There are currently no proofs-of-concept from any build back-end implementing this PEP.
There was worry that using this table name would exacerbate confusion
between build metadata and project metadata, e.g. by using
[build-system.metadata]
as a table.
Garnered no strong support.
The strongest contender after [project]
, but in the end it was
agreed that [project]
read better for certain sub-tables, e.g.
[project.urls]
.
Initially there was a proposal to add a middle layer between the
static metadata specified by this PEP and
prepare_metadata_for_build_wheel()
as specified by PEP 517. The
idea was that if a project wanted to insert itself between a build
back-end and the metadata there would be a hook to do so.
In the end the authors considered this idea unnecessarily complicated and would move the PEP away from its design goal to push people to define core metadata statically as much as possible.
While it would make things easier for tools to only work with the normalized name as specified in PEP 503, the idea was ultimately rejected as it would hurt projects transitioning to using this PEP.
The authors decided fairly quickly during design discussions that this PEP should focus exclusively on project metadata and not build metadata. As such, specifying what files should end up in a source distribution or wheel file is out of scope for this PEP.
This suggestion came thanks to the corresponding core metadata
being Project-Url
. But once the overall table name of [project]
was chosen, the redundant use of the word "project" suggested the
current, shorter name was a better fit.
While the core metadata supports it, having a single field for a project's URL while also supporting a full table seemed redundant and confusing.
As various tools have grown the concept of required dependencies versus development dependencies, the idea of suggesting to tools that they put such development tool into a "dev" grouping came up. In the end, though, the authors deemed it out-of-scope for this specification to suggest such a workflow.
The authors considered the idea that the dynamic
field would only
require the listing of missing required fields and make listing
optional fields optional. In the end, though, this went against the
design goal of promoting specifying as much information statically as
possible.
The readme
field had a proposed readme_content_type
field, but
the authors considered the string/table hybrid more practical for the
common case while still accommodating the more complex case. Same goes
for using long_description
and a corresponding
long_description_content_type
field.
The file
key in the table format was originally proposed as
path
, but file
corresponds to setuptools' file
key and
there is no strong reason otherwise to choose one over the other.
The authors considered allowing for unspecified content-types which
would default to text/plain
, but decided that it would be best to
be explicit in this case to prevent accidental incorrect renderings on
PyPI and to force users to be clear in their intent.
The authors originally proposed requires
/extra-requires
as
names, but decided to go with the current names after a survey of
other packaging ecosystems showed Python was an outlier:
Normalizing on the current names helps minimize confusion for people coming from
other ecosystems without using terminology that is necessarily foreign to new
programmers. It also prevents potential confusion with requires
in the
[build-system]
table as specified in PEP 518.
As the difference between Authors
and Maintainers
fields in
the core metadata is unspecified and ambiguous, this PEP originally
proposed unifying them as a single authors
field. Other ecosystems
have selected "author" as the term to use, so the thinking was to
standardize on Author
in the core metadata as the place to list
people maintaining a project.
In the end, though, the decision to adhere to the core metadata was deemed more important to help with the acceptance of this PEP, rather than trying to introduce a new interpretation for some of the core metadata.
There was a worry that keeping project.entry-points
to a depth of 1 for sub-tables
would cause confusion to users if they use a dotted name and are not used to table
names using quotation marks (e.g. project.entry-points."spam.magical"
). But
supporting an arbitrary depth -- e.g. project.entry-points.spam.magical
-- would
preclude any form of an exploded table format in the future. It would also complicate
things for build back-ends as they would have to make sure to traverse the full
table structure rather than a single level and raising errors as appropriate on
value types.
The format for specifying the dependencies of a project was the most hotly contested topic in terms of data format. It led to the creation of both PEP 631 and PEP 633 which represent what is in this PEP and using TOML dictionaries more extensively, respectively. The decision on those PEPs can be found at https://discuss.python.org/t/how-to-specify-dependencies-pep-508-strings-or-a-table-in-toml/5243/38.
The authors briefly considered supporting both formats, but decided that it would lead to confusion as people would need to be familiar with two formats instead of just one.
When this PEP was written, sdists did not require having static,
canonical metadata like this PEP does. The idea was then considered to
use this PEP as a way to get such metadata into sdists. In the end,
though, the idea of updating pyproject.toml
was not generally
liked, and so the idea was rejected in favour of separately pursuing
standardizing metadata in sdists.
In an earlier version of this PEP, tools were allowed to extend data for fields. For instance, build back-ends could take the version number and add a local version for when they built the wheel. Tools could also add more trove classifiers for things like the license or supported Python versions.
In the end, though, it was thought better to start out stricter and contemplate loosening how static the data could be considered based on real-world usage.
None at the moment.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.