diff --git a/pep-0639.rst b/pep-0639.rst index 48cfeb7c655..b781a5cecf0 100644 --- a/pep-0639.rst +++ b/pep-0639.rst @@ -1,543 +1,2322 @@ PEP: 639 -Title: Metadata for Python Software Packages 2.2 +Title: Improving License Clarity with Better Package Metadata Version: $Revision$ Last-Modified: $Date$ -Author: Philippe Ombredanne +Author: Philippe Ombredanne , + C.A.M. Gerlach Sponsor: Paul Moore -BDFL-Delegate: Paul Moore -Discussions-To: https://discuss.python.org/t/2154 +PEP-Delegate: Paul Moore +Discussions-To: https://discuss.python.org/t/12622 Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 15-Aug-2019 -Python-Version: 3.x Post-History: -Replaces: 566 Resolution: Abstract ======== -This PEP describes the changes between versions 2.1 and 2.2 of the `Core -Metadata Specification` [#cms]_ for Python packages. Version 2.1 is specified in -PEP 566. +This PEP defines a specification for how licenses are documented in the +`core metadata <#coremetadataspec_>`_, +with `license expression strings `_ using +`SPDX identifiers <#spdxid_>`_ in a new ``License-Expression`` field. +This will make license declarations simpler and less ambiguous for +package authors to create, end users to read and understand, and +tools to programmatically process. -The primary change introduced in this PEP updates how licenses are documented in -core metadata via the ``License`` field with license expression strings using -SPDX license identifiers [#spdxlist]_ such that license documentation is simpler -and less ambiguous: +The PEP also: -- for package authors to create, -- for package users to read and understand, and, -- for tools to process package license information mechanically. +- `Formally specifies `_ a new ``License-File`` field, + and defines how license files should be + `included in distributions `_, + as already used by the Wheel and Setuptools projects. 
-The other changes include: +- `Deprecates `_ the legacy ``License`` field + and ``license ::`` classifiers. -- specifying a ``License-File`` field which is already used by ``wheel`` and - ``setuptools`` to include license files in built distributions. -- defining how tools can validate license expressions and report warnings to - users for invalid expressions (but still accept any string as ``License``). +- `Adds and deprecates `_ the corresponding keys + in the PEP 621 project source metadata format. +- `Provides clear guidance `_ for authors and + tools converting legacy license metadata, adding license files and + validating license expressions. -Goals -===== +- Discusses `user scenarios `_, + describes a `reference implementation `_, + analyzes numerous `potential alternatives `_, + includes `detailed examples `_ and + surveys license documentation + `in Python packaging `_ and + `other ecosystems `_. -This PEP's scope is limited strictly to how we document the license of a -distribution: +The changes in this PEP will update the +`core metadata <#coremetadataspec>`_ to version 2.3, modify the +`PEP 621 project metadata specification <#pep621spec_>`_, +and make minor additions to the `source distribution (sdist) <#sdistspec_>`_, +`built distribution (wheel) <#wheelspec_>`_ and +`installed project <#installedspec_>`_ standards. -- with an improved and structured way to document a license expression, and, -- by including license texts in a built package. -The core metadata specification updates that are part of this PEP have been -designed to have minimal impact and to be backward compatible with v2.1. These -changes utilize emerging new ways to document licenses that are already in use -in some tools (e.g. by adding the ``License-File`` field already used in -``wheel`` and ``setuptools``) or by some package authors (e.g. storing an SPDX -license expression in the existing ``License`` field). 
+Goals +===== + +This PEP's scope is limited to how we document the license of a +distribution package, specifically covering: -In addition to an update to the metadata specification, this PEP contains: +- An improved and structured way to include a license expression. +- A formal mechanism to include license texts in a built distribution (wheel). -- recommendations for publishing tools on how to validate the ``License`` and - ``Classifier`` fields and report informational warnings when a package uses an - older, non-structured style of license documentation conventions. +The changes to the core metadata specification that this PEP requires have been +designed to minimize impact and maximize backward compatibility. +This specification builds off of existing ways to document licenses that are +already in use in popular tools (e.g. adding support to core metadata for +the ``License-File`` field `already used `_ in +the Wheel and Setuptools projects) and by some package authors (e.g. storing an +SPDX license expression in the existing ``License`` field). -- informational appendices that contain surveys of how we document licenses - today in Python packages and elsewhere, and a reference Python library to - parse, validate and build correct license expressions. +In addition to these proposed changes, this PEP contains guidance for tools +handling and converting these metadata, a tutorial for package authors +covering various common use cases, detailed examples of them in use, +and a comprehensive survey of license documentation in Python and other +languages. -It is the intent of the PEP authors to work closely with tool authors to -implement the recommendations for validation and warnings specified in this PEP. +It is the intent of the PEP authors to work closely with tool maintainers to +implement the recommendations for validation and warnings specified here. Non-Goals ========= -This PEP is neutral regarding the choice of license by any package author. 
+This PEP is neutral regarding the choice of license by any particular +package author. This PEP makes no recommendation for specific licenses, +and does not require the use of a particular license documentation convention. -In particular, the SPDX license expression syntax proposed in this PEP provides -simpler and more expressive conventions to document accurately any kind of -license that applies to a Python package, whether it is an open source license, -a free or libre software license, a proprietary license, or a combination of -such licenses. +Rather, the SPDX license expression syntax proposed in this PEP provides a +simpler and more expressive mechanism to accurately document any kind of +license that applies to a Python package, whether it is open source, +free/libre, proprietary, or a combination of such. -This PEP makes no recommendation for specific licenses and does not require the -use of specific license documentation conventions. This PEP also does not impose -any restrictions when uploading to PyPI. +This PEP also does not impose any additional restrictions when uploading to +PyPI, unless projects choose to make use of the new fields. -Instead, this PEP is intended to document common practices already in use, and -recommends that publishing tools should encourage users via informational -warnings when they do not follow this PEP's recommendations. +Instead, it is intended to document best practices already in use, extend them +to use a new formally-specified and supported mechanism, and provide guidance +for packaging tools on how to handle the transition and inform users accordingly. -This PEP is not about license documentation in files inside packages, even -though this is a surveyed topic in the appendix. 
+This PEP also is not about license documentation in files inside projects, +though this is a `surveyed topic `_ in the appendix, +nor does it intend to cover cases where the source and +binary distribution packages don't have +`the same licenses `_. Possible future PEPs -------------------- It is the intention of the authors of this PEP to consider the submission of -related but separate PEPs in the future such as: +related but separate PEPs in the future, which may include: + +- Removing the deprecated ``License`` field and license classifiers + from the core metadata specification. -- make ``License`` and new ``License-File`` fields mandatory including - stricter enforcement in tools and PyPI publishing. +- Making the ``License-Expression`` and ``License-File`` fields mandatory + for publishing tools and PyPI packages. -- require uploads to PyPI to use only FOSS (Free and Open Source software) +- Requiring uploads to PyPI to use only Free and Open Source Software (FOSS) licenses. Motivation ========== -Software is licensed, and providing accurate licensing information to Python -packages users is an important matter. Today, there are multiple places where -licenses are documented in package metadata and there are limitations to what -can be documented. This is often leading to confusion or a lack of clarity both -for package authors and package users. - -Several package authors have expressed difficulty and/or frustrations due to the -limited capabilities to express licensing in package metadata. This also applies -to Linux and BSD* distribution packagers. 
This has triggered several -license-related discussions and issues, in particular: - -- https://github.com/pypa/trove-classifiers/issues/17 -- https://github.com/pypa/interoperability-peps/issues/46 -- https://github.com/pypa/packaging-problems/issues/41 -- https://github.com/pypa/wheel/issues/138 -- https://github.com/pombredanne/spdx-pypi-pep/issues/1 - -On average, Python packages tend to have more ambiguous, or missing, license -information than other common application package formats (such as npm, Maven or -Gem) as can be seen in the statistics [#cdstats]_ page of the ClearlyDefined -[#cd]_ project that cover all packages from PyPI, Maven, npm and Rubygems. -ClearlyDefined is an open source project to help improve clarity of other open -source projects that is incubating at the OSI (Open Source Initiative) [#osi]_. +All software is licensed, and providing accurate license information to Python +package users is an important matter. Today, there are multiple fields where +licenses are documented in core metadata, and there are limitations to what +can be expressed in each of them. This often leads to confusion and a lack of +clarity, both for package authors and end users. + +Many package authors have expressed difficulty and frustrations due to the +limited capabilities to express licensing in project metadata, and this +creates further trouble for Linux and BSD distribution re-packagers. +This has triggered a number of license-related discussions and issues, +including on `outdated and ambiguous PyPI classifiers <#classifierissue_>`_, +`license interoperability with other ecosystems <#interopissue_>`_, +`too many confusing license metadata options <#packagingissue_>`_, +`limited support for license files in the Wheel project <#wheelfiles_>`_, and +`the lack of clear, precise and standardized license metadata <#pepissue_>`_. 
+ +On average, Python packages tend to have more ambiguous and missing license +information than other common ecosystems (such as npm, Maven or +Gem). This is supported by the `statistics page <#cdstats_>`_ of the +`ClearlyDefined project <#clearlydefined_>`_, an +`Open Source Initiative <#osi_>`_ incubated effort to help +improve licensing clarity of other FOSS projects, covering all packages +from PyPI, Maven, npm and Rubygems. Rationale ========= -A mini-survey of existing license metadata definitions in use in the Python -ecosystem today and documented in several other system/distro and application -package formats is provided in Appendix 2 of this PEP. +A survey of existing license metadata definitions in use in the Python +ecosystem today is provided in +`Appendix 2 `_ of this PEP, +and license documentation in a variety of other packaging systems, +Linux distros, language ecosystems and applications is surveyed in +`Appendix 3 `_. There are a few takeaways from the survey: - Most package formats use a single ``License`` field. -- Many modern package formats use some form of license expression syntax to - optionally combine more than one license identifier together. SPDX and - SPDX-like syntaxes are the most popular in use. +- Many modern package systems use some form of license expression syntax to + optionally combine more than one license identifier together. + SPDX and SPDX-like syntaxes are the most popular in use. -- SPDX license identifiers are becoming a de facto way to reference common - licenses everywhere, whether or not a license expression syntax is used. +- SPDX license identifiers are becoming the de facto way to reference common + licenses everywhere, whether or not a full license expression syntax is used. - Several package formats support documenting both a license expression and the - paths of the corresponding files that contain the license text. 
Most free and - open source software licenses require package authors to include their full + paths of the corresponding files that contain the license text. Most Free and + Open Source Software licenses require package authors to include their full text in a distribution. These considerations have guided the design and recommendations of this PEP. -The reuse of the ``License`` field with license expressions will provide an -intuitive and more structured way to express the license of a distribution using -a well-defined syntax and well-known license identifiers. +The current license classifiers cover some common cases, and could +theoretically be extended to include the full range of current SPDX +identifiers while deprecating the many ambiguous classifiers (including some +extremely popular and particularly problematic ones, such as +``License :: OSI Approved :: BSD License``). However, this both requires a +substantial amount of effort to duplicate the SPDX license list and keep +it in sync, and is effectively a hard break in backward compatibility, +forcing a huge proportion of package authors to update to new +classifiers (in most cases, with many possible choices that require closely +examining the project's license) immediately when PyPI deprecates the old ones. + +Furthermore, this only covers simple packages entirely under a single license; +it doesn't address the substantial fraction of common projects that vendor +dependencies (e.g. Setuptools), offer a choice of licenses (e.g. Packaging) +or were relicensed, adapt code from other projects or contain fonts, images, +examples, binaries or other assets under other licenses. It also requires that +both authors and tools understand and implement the PyPI-specific bespoke +classifier system, rather than using short, easy to add and standardized +SPDX identifiers in a simple text field, as increasingly widely adopted by +most other packaging systems to reduce the overall burden on the ecosystem. 
+Finally, this does not provide as clear an indicator that a package +has adopted the new system, and should be treated accordingly. + +The use of a new ``License-Expression`` field will provide an intuitive, +structured and unambiguous way to express the license of a +package using a well-defined syntax and well-known license identifiers. +Similarly, a formally-specified ``License-File`` field offers a standardized +way to ensure that the full text of the license(s) is included with the +package when distributed, as legally required, and allows other tools consuming +the core metadata to unambiguously locate a distribution's license files. + +Over time, encouraging the use of these fields and deprecating the ambiguous, +duplicative and confusing legacy alternatives will help Python software +publishers improve the clarity, accuracy and portability of their licensing +practices, to the benefit of package authors, consumers and redistributors +alike. + + +Terminology +=========== + +This PEP seeks to clearly define the terms it uses, specifically those that: + +- Have multiple established meanings (e.g. import vs. distribution package, + wheel *format* vs. Wheel *project*). + +- Are related and often used interchangeably, but have critical + distinctions in meaning (e.g. PEP 621 *key* vs. core metadata *field*, + a point of apparent confusion in PEP 621 with significant effects on this + PEP). + +- Are existing concepts that don't have formal terms/definitions + (e.g. project/source metadata vs. distribution/built metadata, + build vs. publishing tools). + +- Are new concepts introduced here (e.g. license expression/identifier). + +Whenever available, definitions are excerpted from the +`PyPA PyPUG Glossary <#pypugglossary_>`_ and `SPDX <#spdx_>`_. Terms are listed +here in their full versions; related words (``Rel:``) are in parentheses, +including short forms (``Short:``), sub-terms (``Sub:``) and common synonyms +for the purposes of this PEP (``Syn:``). 
+ +**Built Distribution** *(Syn: Binary Distribution/Wheel)* + A Distribution format containing files and metadata that only need to be + moved to the correct location on the target system to be installed. + Wheel is such a format, whereas distutil's *[sic]* Source Distribution + is not. + *(PyPUG Glossary)* + + For the purposes of this PEP, except where noted, this is synonymous + with **binary distribution** (a built distribution containing compiled code) + and **wheel** (the format). + +**Core Metadata** *(Syn: Package Metadata, Sub: Distribution Metadata)* + The `PyPA specification <#coremetadataspec_>`_ and the set of metadata fields + it defines that describe key static attributes of distribution packages + and installed projects. + + **Distribution metadata** refers to, more specifically, the concrete form + core metadata takes when included inside a distribution archive + (``PKG-INFO`` in a sdist and ``METADATA`` in a wheel) or installed project + (``METADATA``). + +**Core Metadata Field** *(Short: Metadata Field/Field)* + A single key-value pair, or sequence of such with the same key, as defined + by the core metadata specification. Notably, *not* a PEP 621 project + metadata format key. + +**Distribution Package** *(Sub: Package, Distribution Archive)* + A versioned archive file that contains Python packages, modules, and other + resource files that are used to distribute a Release. + *(PyPUG Glossary)* + + In this PEP, **package** is used to refer to the abstract concept of a + distributable form of a Python project, while **distribution** more + specifically references the physical **distribution archive**. + +**License Classifier** + A `PyPI Trove classifier <#classifiers_>`_ (as originally defined in PEP 301) + which begins with ``License ::``, currently used to indicate a project's + license status by including it as a ``Classifier`` in the core metadata. 
+ +**License Expression** *(Syn: SPDX Expression)* + A string with valid `SPDX license expression syntax <#spdxpression_>`_ + including any SPDX license identifiers as defined here, which describes + a project's license(s) and how they relate to one another. Examples: + ``GPL-3.0-or-later``, ``MIT AND (Apache-2.0 OR BSD-2-clause)`` + +**License Identifier** *(Syn: License ID/SPDX Identifier)* + A valid `SPDX short-form license identifier <#spdxid_>`_, as described in the + `Add License-Expression field`_ section of this PEP; briefly, + this includes all valid SPDX identifiers and the ``LicenseRef-Public-Domain`` + and ``LicenseRef-Proprietary`` strings. Examples: ``MIT``, ``GPL-3.0-only`` + +**Project** *(Sub: Project Source Tree, Installed Project)* + A library, framework, script, plugin, application, collection of data + or other resources, or some combination thereof that is intended to be + packaged into a Distribution. Generally contains a ``pyproject.toml``, + ``setup.py``, or ``setup.cfg`` file at the root of the project source + directory. + *(PyPUG Glossary)* + + Here, a **project source tree** refers to the on-disk format of + a project used for development, while an **installed project** is the form a + project takes once installed from a distribution, as + `specified by PyPA <#installedspec_>`_. + +**Project Source Metadata** *(Sub: PEP 621 Metadata, Key, Subkey)* + Core metadata defined by the package author in the project source tree, + as top-level keys in the ``[project]`` table of a PEP 621 ``pyproject.toml``, + in the ``[metadata]`` table of ``setup.cfg``, or the equivalent for other + build tools. + + The **PEP 621 metadata** refers specifically to the former, as defined by the + `PyPA Declaring Project Metadata specification <#pep621spec_>`_. 
+ A **PEP 621 metadata key**, or an unqualified *key* refers specifically to + a top-level ``[project]`` key (notably, *not* a core metadata *field*), + while a **subkey** refers to a second-level key in a table-valued + PEP 621 key. + +**Root License Directory** *(Short: License Directory)* + The directory under which license files are stored in a project/distribution + and the root directory that their paths, as recorded under the + ``License-File`` core metadata fields, are relative to. + Defined here to be the project root directory for source trees and source + distributions, and a subdirectory named ``license_files`` of the directory + containing the core metadata (i.e., the ``.dist-info/license_files`` + directory) for built distributions and installed projects. + +**Source Distribution** *(Short: sdist)* + Here, specifically refers to a source distribution (**sdist**) as + `specified by PyPA <#sdistspec_>`_. + +**Tool** *(Sub: Packaging Tool, Build Tool, Install Tool, Publishing Tool)* + A program, script or service executed by the user or automatically that + seeks to conform to the specification defined in this PEP. + + A **packaging tool** refers to a tool used to build, publish, + install, or otherwise directly interact with Python packages. + + A **build tool** is a packaging tool used to generate a source or built + distribution from a project source tree or sdist, when directly invoked + as such (as opposed to by end-user-facing install tools). + Examples: Wheel project, PEP 517 backends via ``build`` or other + package-developer-facing frontends, calling ``setup.py`` directly. + + An **install tool** is a packaging tool used to install a source or built + distribution in a target environment. Examples include the PyPA pip and + ``installer`` projects. + + A **publishing tool** is a packaging tool used to upload distribution + archives to a package index, such as Twine for PyPI. 
+ +**Wheel Format** *(Short: wheel, Rel: Wheel project)* + Here, **wheel**, the standard built distribution format introduced in PEP 427 + and `specified by PyPA <#wheelspec_>`_, will be referred to in lowercase, + while the `Wheel project <#wheelproject_>`_, its reference implementation, + will be referred to as **Wheel** in Title Case. + + +Specification +============= + +The changes necessary to implement the improved license handling outlined in +this PEP include those in both +`distribution package metadata `_, as defined in the +`core metadata specification <#coremetadataspec_>`_, and +`author-provided project source metadata `_, as +originally defined in PEP 621. + +Further, `minor additions `_ to the +source distribution (sdist), built distribution (wheel) and installed project +specifications will help document and clarify the already allowed, +now formally standardized behavior in these respects. +Finally, `guidance is established `_ +for tools handling and converting legacy license metadata to license +expressions, to ensure the results are consistent, correct and unambiguous. + +Note that the guidance on errors and warnings is for tools' default behavior; +they MAY operate more strictly if users explicitly configure them to do so, +such as by a CLI flag or a configuration option. + + +Core metadata +------------- + +The `PyPA Core Metadata specification <#coremetadataspec_>`_ defines the names +and semantics of each of the supported fields in the distribution metadata of +Python distribution packages and installed projects. + +This PEP `adds `_ the +``License-Expression`` field, +`adds `_ the ``License-File`` field, +`deprecates `_ the ``License`` field, +and `deprecates `_ the license classifiers +in the ``Classifier`` field. 
+ +The error and warning guidance in this section applies to build and +publishing tools; end-user-facing install tools MAY be more lenient than +mentioned here when encountering malformed metadata +that does not conform to this specification. + +As it adds new fields, this PEP updates the core metadata to version 2.3. + + +Add ``License-Expression`` field +'''''''''''''''''''''''''''''''' + +The ``License-Expression`` optional field is specified to contain a text string +that is a valid SPDX license expression, as defined herein. + +Publishing tools SHOULD issue an informational warning if this field is +missing, and MAY raise an error. Build tools MAY issue a similar warning, +but MUST NOT raise an error. -Over time, recommending the usage of these expressions will help Python package -publishers improve the clarity of their license documentation to the benefit of -package authors, consumers and redistributors. +A license expression is a string using the SPDX license expression syntax as +documented in the `SPDX specification <#spdxpression_>`_, either +Version 2.2 or a later compatible version. +When used in the ``License-Expression`` field and as a specialization of +the SPDX license expression definition, a license expression can use the +following license identifiers: -Core Metadata Specification updates -=================================== +- Any SPDX-listed license short-form identifiers that are published in the + `SPDX License List <#spdxlist_>`_, version 3.15 or any later compatible + version. Note that the SPDX working group never removes any license + identifiers; instead, they may choose to mark an identifier as "deprecated". -The canonical source for the names and semantics of each of the supported -metadata fields is the Core Metadata Specification [#cms]_ document. +- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings, to + identify licenses that are not included in the SPDX license list. 
-The details of the updates considered to the Core Metadata Specification [#cms]_ -document as part of this PEP are described here and will be added to the -canonical source once this PEP is approved. +When processing the ``License-Expression`` field to determine if it contains +a valid license expression, build and publishing tools: +- SHOULD halt execution and raise an error if: -Added in Version 2.2 --------------------- + - The field does not contain a valid license expression -License-File (multiple use) -::::::::::::::::::::::::::: + - One or more license identifiers are not valid (as defined above) -The License-File is a string that is a path, relative to``.dist-info``, to a -license file. The license file content MUST be UTF-8 encoded text. +- SHOULD report an informational warning, and publishing tools MAY raise an + error, if one or more license identifiers have been marked as deprecated in + the `SPDX License List <#spdxlist_>`_. -Build tools SHOULD honor this field and include the corresponding license -file(s) in the built package. +- MUST store a case-normalized version of the ``License-Expression`` field + using the reference case for each SPDX license identifier and + uppercase for the ``AND``, ``OR`` and ``WITH`` keywords. +- SHOULD report an informational warning, and MAY raise an error if + the normalization process results in changes to the + ``License-Expression`` field contents. -Changed in Version 2.2 ----------------------- +For all newly-uploaded distributions that include a +``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST +validate that it contains a valid, case-normalized license expression with +valid identifiers (as defined here) and MUST reject uploads that do not. +PyPI MAY reject an upload for using a deprecated license identifier, +so long as it was deprecated as of the above-mentioned SPDX License List +version. 
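The case-normalization rule for ``License-Expression`` described in the added text above can be illustrated with a minimal sketch. It is not part of the PEP or of any existing tool: the name ``normalize_expression`` and the tiny ``KNOWN_IDS`` table are hypothetical, a real tool would load the full SPDX License List, and this sketch assumes lowercase keywords are accepted on input. It does not validate expression grammar or handle the SPDX ``+`` operator.

```python
import re

# Hypothetical, deliberately tiny stand-in for the SPDX License List plus the
# two LicenseRef strings named in the text; a real tool would load the full list.
KNOWN_IDS = [
    "MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause",
    "GPL-3.0-only", "GPL-3.0-or-later",
    "LicenseRef-Public-Domain", "LicenseRef-Proprietary",
]
_CANONICAL = {name.lower(): name for name in KNOWN_IDS}
_KEYWORDS = {"and", "or", "with"}
_TOKEN = re.compile(r"\(|\)|[^\s()]+")


def normalize_expression(expression):
    """Return (normalized, changed); raise ValueError on an unknown identifier.

    Only the case of each token is normalized here. The expression grammar
    itself (operator placement, balanced parentheses) is NOT checked.
    """
    out = []
    for token in _TOKEN.findall(expression):
        lowered = token.lower()
        if token in ("(", ")"):
            out.append(token)
        elif lowered in _KEYWORDS:
            out.append(lowered.upper())        # AND, OR, WITH in uppercase
        elif lowered in _CANONICAL:
            out.append(_CANONICAL[lowered])    # reference case of the identifier
        else:
            raise ValueError(f"unknown license identifier: {token!r}")
    normalized = " ".join(out).replace("( ", "(").replace(" )", ")")
    # `changed` tells the caller whether to emit the informational warning.
    return normalized, normalized != expression
```

A caller would store the first element of the result and report a warning when the second element is true.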
-License (optional) -:::::::::::::::::: -Text indicating the license covering the distribution. This text can be either a -valid license expression as defined here or any free text. +Add ``License-File`` field +'''''''''''''''''''''''''' -Publishing tools SHOULD issue an informational warning if this field is empty, -missing, or is not a valid license expression as defined here. Build tools MAY -issue a similar warning. +The ``License-File`` optional field is specified to contain the string +representation of the path to a license-related file, relative to the +root license directory. It is a multi-use field that may appear zero or +more times, each instance listing the path to one such file. Files specified +under this field could include license text, author/attribution information, +or other legal notices that need to be distributed with the package. +If a ``License-File`` is listed in a source or built distribution's core +metadata, that file MUST be included in the distribution at the specified path +relative to the root license directory, and MUST be installed with the +distribution at that same relative path. -License Expression syntax -''''''''''''''''''''''''' +The specified relative path MUST be consistent between project source trees, +source distributions (sdists), built distributions (wheels) and installed +projects. Therefore, inside the root license directory, packaging tools +MUST reproduce the directory structure under which the +source license files are located relative to the project root. -A license expression is a string using the SPDX license expression syntax as -documented in the SPDX specification [#spdx]_ using either Version 2.2 -[#spdx22]_ or a later compatible version. SPDX is a working group at the Linux -Foundation that defines a standard way to exchange package information. +Path separators MUST be the forward slash character (``/``), +and parent directory indicators (``..``) MUST NOT be used. 
+License file content MUST be UTF-8 encoded text. -When used in the ``License`` field and as a specialization of the SPDX license -expression definition, a license expression can use the following license -identifiers: +Build tools MAY and publishing tools SHOULD produce an informative warning +if a built distribution's metadata contains no ``License-File`` entries, +and publishing tools MAY but build tools MUST NOT raise an error. -- any SPDX-listed license short-form identifiers that are published in the SPDX - License List [#spdxlist]_ using either Version 3.10 or any later compatible - version. Note that the SPDX working group never removes any license - identifiers: instead they may choose to mark an identifier as "deprecated". +For all newly-uploaded distribution packages that include one or more +``License-File`` fields and declare a ``Metadata-Version`` of ``2.3`` or +higher, PyPI SHOULD validate that the specified files are present in all +uploaded distributions, and MUST reject uploads that do not validate. -- the ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings to - identify licenses that are not included in the SPDX license list. -When processing the ``License`` field to determine if it contains a valid -license expression, tools: +Deprecate ``License`` field +''''''''''''''''''''''''''' -- SHOULD report an informational warning if one or more of the following - applies: +The legacy unstructured-text ``License`` field is deprecated and replaced by +the new ``License-Expression`` field. Build and publishing tools MUST raise +an error if both these fields are present and their values are not identical, +including capitalization and excluding leading and trailing whitespace. - - the field does not contain a license expression, +If only the ``License`` field is present, such tools SHOULD issue a warning +informing users it is deprecated and recommending ``License-Expression`` +instead. 
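The consistency rule between the legacy ``License`` field and ``License-Expression`` stated above could be sketched as follows. The function name ``check_license_fields`` and its return convention (a list of warning strings) are illustrative assumptions, not something the PEP prescribes.

```python
def check_license_fields(license_text=None, license_expression=None):
    """Return a list of warning messages; raise ValueError on a conflict.

    Hypothetical helper: `None` means the field is absent from the metadata.
    """
    if license_text is not None and license_expression is not None:
        # The comparison is case-sensitive, but leading and trailing
        # whitespace is ignored, as the text above specifies.
        if license_text.strip() != license_expression.strip():
            raise ValueError(
                "License and License-Expression are both present but differ"
            )
        return []
    if license_text is not None:
        # Only the legacy field is present: warn, do not fail.
        return ["The License field is deprecated; use License-Expression instead."]
    return []
```

For example, ``check_license_fields("MIT ", "MIT")`` passes silently, while ``check_license_fields("mit", "MIT")`` raises because capitalization is significant.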
- - the license expression syntax is invalid, +For all newly-uploaded distributions that include a +``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST +reject any that specify a ``License`` field and the text of which is not +identical to that of ``License-Expression``, as defined in this section. - - the license expression syntax is valid but some license identifiers are - unknown as defined here or the license identifiers have been marked as - deprecated in the SPDX License List [#spdxlist]_ +Along with license classifiers, the ``License`` field may be removed from a +new version of the specification in a future PEP. -- SHOULD store a case-normalized version of the ``License`` field using the - reference case for each SPDX license identifier and uppercase for the AND, OR - and WITH keywords. -- SHOULD report an informational warning if normalization process results in - changes to the ``License`` field contents. +Deprecate license classifiers +''''''''''''''''''''''''''''' -License expression examples:: +Using license `classifiers <#classifiers_>`_ in the ``Classifier`` field +(described in PEP 301) is deprecated and replaced by the more precise +``License-Expression`` field. - License: MIT +If the ``License-Expression`` field is present, build tools SHOULD and +publishing tools MUST raise an error if one or more license classifiers +is included in a ``Classifier`` field, and MUST NOT add +such classifiers themselves. - License: BSD-3-Clause +Otherwise, if this field contains a license classifier, build tools MAY +and publishing tools SHOULD issue a warning informing users such classifiers +are deprecated, and recommending ``License-Expression`` instead. +For compatibility with existing publishing and installation processes, +the presence of license classifiers SHOULD NOT raise an error unless +``License-Expression`` is also provided. 
- License: MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) +For all newly-uploaded distributions that include a +``License-Expression`` field, the `Python Package Index (PyPI) <#pypi_>`_ MUST +reject any that also specify any license classifiers. - License: GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause +New license classifiers MUST NOT be `added to PyPI <#classifiersrepo_>`_; +users needing them SHOULD use the ``License-Expression`` field instead. +Along with the ``License`` field, license classifiers may be removed from a +new version of the specification in a future PEP. - License: This software may only be obtained by sending the - author a postcard, and then the user promises not - to redistribute it. - License: LicenseRef-Proprietary AND LicenseRef-Public-Domain +Project source metadata +----------------------- +As originally introduced in PEP 621, the +`PyPA Declaring Project Metadata specification <#pep621spec_>`_ +defines how to declare a project's source +metadata in a ``[project]`` table in the ``pyproject.toml`` file for +build tools to consume and output distribution core metadata. -Classifier (multiple use) -::::::::::::::::::::::::: +This PEP `adds `_ the ``license-expression`` key, +`adds `_ the ``license-files`` key and +`deprecates `_ the ``license`` key. -Each entry is a string giving a single classification value for the -distribution. Classifiers are described in PEP 301. -Examples:: +Add ``license-expression`` key +'''''''''''''''''''''''''''''' - Classifier: Development Status :: 4 - Beta - Classifier: Environment :: Console (Text Based) +A new ``license-expression`` key is added to the ``project`` table, which has +a string value that is a valid SPDX license expression, as +`defined previously `_. +Its value maps to the ``License-Expression`` field in the core metadata. 
-Tools SHOULD issue an informational warning if this field contains a licensing- -related classifier string starting with the ``License ::`` prefix and SHOULD -suggest the use of a license expression in the ``License`` field instead. +Build tools SHOULD validate the expression as described +`above `_, outputting +an error or warning as specified. When generating the core metadata, tools +MUST perform case normalization. -If the ``License`` field is present and contains a valid license expression, -publishing tools MUST NOT also provide any licensing-related classifier entries -[#classif]_. +If and only if the ``license-expression`` key is listed as ``dynamic`` +(and is not specified), tools MAY infer a value for the ``License-Expression`` +field if they can do so unambiguously, but MUST follow the provisions in the +`Converting legacy metadata`_ section. -However, for compatibility with existing publishing and installation processes, -licensing-related classifier entries SHOULD continue to be accepted if the -``License`` field is absent or does not contain a valid license expression. +If the ``license-expression`` key is present and valid (and the ``license`` +key is not specified), for purposes of backward compatibility, tools MAY +back-fill the ``License`` core metadata field with the case-normalized value +of the ``license-expression`` key. -Publishing tools MAY infer a license expression from the provided classifier -entries if they are able to do so unambiguously. -However, no new licensing related classifiers will be added; anyone -requesting them will be directed to use a license expression in the ``License`` -field instead. Note that the licensing-related classifiers may be deprecated in -a future PEP. 
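The case normalization that tools perform when generating core metadata from the ``license-expression`` key could look roughly like the following sketch. It assumes normalization means using the reference case for each identifier and upper case for the ``AND``, ``OR`` and ``WITH`` keywords. The ``KNOWN_IDS`` tuple is a tiny illustrative subset, and ``normalize_expression`` is a hypothetical helper. The sketch only checks individual tokens; it does not validate the overall expression grammar or handle the ``+`` operator.

```python
import re

# Illustrative subset only; a real tool would load the full SPDX License List.
KNOWN_IDS = (
    "MIT",
    "Apache-2.0",
    "BSD-2-Clause",
    "BSD-3-Clause",
    "LicenseRef-Public-Domain",
    "LicenseRef-Proprietary",
)
_REFERENCE_CASE = {spdx_id.lower(): spdx_id for spdx_id in KNOWN_IDS}
_KEYWORDS = {"and", "or", "with"}
# A token is either a single parenthesis or a run of other non-space characters.
_TOKEN = re.compile(r"[()]|[^\s()]+")


def normalize_expression(expression):
    """Return ``expression`` with identifiers and keywords case-normalized."""
    normalized = []
    for token in _TOKEN.findall(expression):
        lowered = token.lower()
        if token in ("(", ")"):
            normalized.append(token)
        elif lowered in _KEYWORDS:
            normalized.append(lowered.upper())
        elif lowered in _REFERENCE_CASE:
            normalized.append(_REFERENCE_CASE[lowered])
        else:
            raise ValueError(f"unknown license identifier: {token!r}")
    # Re-join with single spaces, then tighten spacing inside parentheses.
    return " ".join(normalized).replace("( ", "(").replace(" )", ")")
```

For instance, ``normalize_expression("mit and (apache-2.0 or bsd-2-clause)")`` yields ``MIT AND (Apache-2.0 OR BSD-2-Clause)``, while an unrecognized token such as ``Apache2`` raises an error.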
+Add ``license-files`` key +''''''''''''''''''''''''' +A new ``license-files`` key is added to the ``project`` table for specifying +paths in the project source relative to ``pyproject.toml`` to file(s) +containing licenses and other legal notices to be distributed with the package. +It corresponds to the ``License-File`` fields in the core metadata. -Mapping Legacy Classifiers to New License Expressions -''''''''''''''''''''''''''''''''''''''''''''''''''''' +Its value is a table, which if present MUST contain one of two optional, +mutually exclusive subkeys, ``paths`` and ``globs``; if both are specified, +tools MUST raise an error. Both are arrays of strings; the ``paths`` subkey +contains verbatim file paths, and the ``globs`` subkey valid glob patterns, +which MUST be parsable by the ``glob`` `module <#globmodule_>`_ in the +Python standard library. -Publishing tools MAY infer or suggest an equivalent license expression from the -provided ``License`` or ``Classifier`` information if they are able to do so -unambiguously. For instance, if a package only has this license classifier:: +**Note**: To avoid ambiguity, confusion and (per PEP 20, the Zen of Python) +"more than one (obvious) way to do it", allowing a flat array of strings +as the value for the ``license-files`` key has been +`left out for now `_. - Classifier: License :: OSI Approved :: MIT License +Path separators, if used, MUST be the forward slash character (``/``), +and parent directory indicators (``..``) MUST NOT be used. +Tools MUST assume that license file content is valid UTF-8 encoded text, +and SHOULD validate this and raise an error if it is not. -Then the corresponding value for a ``License`` field using a valid license -expression to suggest would be:: +If the ``paths`` subkey is a non-empty array, build tools: - License: MIT +- MUST treat each value as a verbatim, literal file path, and + MUST NOT treat them as glob patterns. +- MUST include each listed file in all distribution archives. 
-Here are mapping guidelines for the legacy classifiers: +- MUST NOT match any additional license files beyond those explicitly + statically specified by the user under the ``paths`` subkey. -- Classifier ``License :: Other/Proprietary License`` becomes License: - ``LicenseRef-Proprietary`` expression. +- MUST list each file path under a ``License-File`` field in the core metadata. -- Classifier ``License :: Public Domain`` becomes License: ``LicenseRef-Public-Domain`` - expression, though tools should encourage the use of more explicit and legally - portable license identifiers such as ``CC0-1.0`` [#cc0]_, the ``Unlicense`` - [#unlic]_ since the meaning associated with the term "public domain" is thoroughly - dependent on the specific legal jurisdiction involved and some jurisdictions - have no concept of Public Domain as it exists in the USA. +- MUST raise an error if one or more paths do not correspond to a valid file + in the project source that can be copied into the distribution archive. -- The generic and ambiguous classifiers ``License :: OSI Approved`` and - ``License :: DFSG approved`` do not have an equivalent license expression. +If the ``globs`` subkey is a non-empty array, build tools: + +- MUST treat each value as a glob pattern, and MUST raise an error if the + pattern contains invalid glob syntax. + +- MUST include all files matched by at least one listed pattern in all + distribution archives. + +- MAY exclude files matched by glob patterns that can be unambiguously + determined to be backup, temporary, hidden, OS-generated or VCS-ignored. + +- MUST list each matched file path under a ``License-File`` field in the + core metadata. + +- SHOULD issue a warning and MAY raise an error if no files are matched. + +- MAY issue a warning if any individual user-specified pattern + does not match at least one file. 
+ +If the ``license-files`` key is present, and the ``paths`` or ``globs`` subkey +is set to a value of an empty array, then tools MUST NOT include any +license files and MUST NOT raise an error. + +If the ``license-files`` key is not present and not explicitly marked as +``dynamic``, tools MUST assume a default value of the following:: + + license-files.globs = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"] + +In this case, tools MAY issue a warning if no license files are matched, +but MUST NOT raise an error. + +If the ``license-files`` key is marked as ``dynamic`` (and not present), +to preserve consistent behavior with current tools and help ensure the packages +they create are legally distributable, build tools SHOULD default to +including at least the license files matching the above patterns, unless the +user has explicitly specified their own. + + +Deprecate ``license`` key +''''''''''''''''''''''''' + +The ``license`` key in the ``project`` table is now deprecated. +It MUST NOT be used or listed as ``dynamic`` if either of the new +``license-expression`` or ``license-files`` keys are defined, +and build tools MUST raise an error if either is the case. + +Otherwise, if the ``text`` subkey is present in the ``license`` table, tools +SHOULD issue a warning informing users it is deprecated and recommending the +``license-expression`` key instead. + +Likewise, if the ``file`` subkey is present in the ``license`` table, tools +SHOULD issue a warning informing users it is deprecated and recommending +the ``license-files`` key instead. However, if the file is present in the +source, build tools SHOULD still use it to fill the ``License-File`` field +in the core metadata, and if so, MUST include the specified file in any +distribution archives for the project. If the file does not exist at the +specified path, tools SHOULD issue a warning, and MUST NOT fill it in a +``License-File`` field. 
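The checks on the deprecated ``license`` key described above might be implemented along these lines. This is a sketch only: ``check_legacy_license_key`` is a hypothetical name, ``project`` is assumed to be the parsed ``[project]`` table as a ``dict``, and "defined" is read here as "present as a key in the table".

```python
import warnings


def check_legacy_license_key(project):
    """Validate use of the deprecated ``license`` key in ``[project]``."""
    dynamic = project.get("dynamic", [])
    legacy_used = "license" in project or "license" in dynamic
    new_keys_defined = (
        "license-expression" in project or "license-files" in project
    )
    if legacy_used and new_keys_defined:
        raise ValueError(
            "'license' cannot be combined with 'license-expression' "
            "or 'license-files'"
        )

    license_table = project.get("license", {})
    if "text" in license_table:
        warnings.warn(
            "license.text is deprecated; use license-expression instead",
            DeprecationWarning,
        )
    if "file" in license_table:
        warnings.warn(
            "license.file is deprecated; use license-files instead",
            DeprecationWarning,
        )
```

Raising before warning keeps the hard error from being buried under deprecation notices when both the old and new keys are present.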
+ +For backwards compatibility, to preserve consistent behavior with current tools +and ensure that users do not unknowingly create packages that are not legally +distributable, tools MUST assume the above default value for the +``license-files`` key and also include, in addition to the license file +specified under this ``file`` subkey, any license files that match the +specified list of patterns. + +The ``license`` key may be removed from a new version of the specification +in a future PEP. + + +License files in project formats +-------------------------------- + +A few minor additions will be made to the relevant existing specifications +to document, standardize and clarify what is already currently supported, +allowed and implemented behavior, as well as explicitly mention the root +license directory the license files are located in and relative to for +each format, per the `specification above `_. + +**Project source trees** + As `described above `_, the + `Declaring Project Metadata specification <#pep621spec_>`_ + will be updated to reflect that license file paths MUST be relative to the + project root directory; i.e. the directory containing the ``pyproject.toml`` + (or equivalently, other legacy project configuration, + e.g. ``setup.py``, ``setup.cfg``, etc). + +**Source distributions** *(sdists)* + The `sdist specification <#sdistspec_>`_ will be updated to reflect that if + the ``Metadata-Version`` is ``2.3`` or greater, the sdist MUST contain any + license files specified by ``License-File`` in the ``PKG-INFO`` at their + respective paths relative to the top-level directory of the sdist + (containing the ``pyproject.toml`` and the ``PKG-INFO`` core metadata).
+ +**Built distributions** *(wheels)* + The `wheel specification <#wheelspec_>`_ will be updated to reflect that if + the ``Metadata-Version`` is ``2.3`` or greater and one or more + ``License-File`` fields is specified, the ``.dist-info`` directory MUST + contain a ``license_files`` subdirectory which MUST contain the files listed + in the ``License-File`` fields in the ``METADATA`` file at their respective + paths relative to the ``license_files`` directory. + +**Installed projects** + The `Recording Installed Projects specification <#installedspec_>`_ will be + updated to reflect that if the ``Metadata-Version`` is ``2.3`` or greater + and one or more ``License-File`` fields is specified, the ``.dist-info`` + directory MUST contain a ``license_files`` subdirectory which MUST contain + the files listed in the ``License-File`` fields in the ``METADATA`` file + at their respective paths relative to the ``license_files`` directory, + and that any files in this directory MUST be copied from wheels + by install tools. + + +Converting legacy metadata +-------------------------- + +If the contents of the ``license.text`` PEP 621 source metadata key +(or equivalent for tool-specific config formats) is a valid license expression +containing solely known, non-deprecated license identifiers, and, if +PEP 621 metadata are defined, the ``license-expression`` key is listed as +``dynamic``, build tools MAY use it to fill the ``License-Expression`` field. + +Similarly, if the ``classifiers`` PEP 621 source metadata key (or equivalent +for tool-specific config formats) contains exactly one license classifier +that unambiguously maps to exactly one valid, non-deprecated SPDX license +identifier, tools MAY fill the ``License-Expression`` field with the latter. 
+ +If both a ``license.text`` or equivalent value and a single license classifier +are present, the contents of the former, including capitalization +(but excluding leading and trailing whitespace), MUST exactly match the SPDX +license identifier mapped to the license classifier to be considered +unambiguous for the purposes of automatically filling the +``License-Expression`` field. + +If tools have filled the ``License-Expression`` field as described here, +they MUST output a prominent, user-visible warning informing package authors +of that fact, including the ``License-Expression`` string they have output, +and recommending that the project source metadata be updated accordingly +with the indicated license expression. + +In any other case, tools MUST NOT use the contents of the ``license.text`` +key (or equivalent) or license classifiers to fill the +``License-Expression`` field without informing the user and requiring +unambiguous, affirmative user action to select and confirm the desired +``License-Expression`` value before proceeding. + + +Mapping license classifiers to SPDX identifiers +''''''''''''''''''''''''''''''''''''''''''''''' + +Most single license classifiers (namely, all those not mentioned below) +map to a single valid SPDX license identifier, allowing tools to insert them +into the ``License-Expression`` field following the +`specification above `_. + +Many legacy license classifiers intend to specify a particular license, +but do not specify the particular version or variant, leading to a +`critical ambiguity <#classifierissue_>`_ as to their terms, compatibility +and acceptability. Tools MUST NOT attempt to automatically infer a +``License-Expression`` when one of these classifiers is used, and SHOULD +instead prompt the user to affirmatively select and confirm their intended +license choice. 
+ +These classifiers are the following: + +- ``License :: OSI Approved :: Academic Free License (AFL)`` +- ``License :: OSI Approved :: Apache Software License`` +- ``License :: OSI Approved :: Apple Public Source License`` +- ``License :: OSI Approved :: Artistic License`` +- ``License :: OSI Approved :: BSD License`` +- ``License :: OSI Approved :: GNU Affero General Public License v3`` +- ``License :: OSI Approved :: GNU Free Documentation License (FDL)`` +- ``License :: OSI Approved :: GNU General Public License (GPL)`` +- ``License :: OSI Approved :: GNU General Public License v2 (GPLv2)`` +- ``License :: OSI Approved :: GNU General Public License v3 (GPLv3)`` +- ``License :: OSI Approved :: GNU Lesser General Public License v2 (LGPLv2)`` +- ``License :: OSI Approved :: GNU Lesser General Public License v2 or later (LGPLv2+)`` +- ``License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)`` +- ``License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)`` + +A comprehensive mapping of these classifiers to their possible specific +identifiers was `assembled by Dustin Ingram <#badclassifiers_>`_, which tools +MAY use as a reference for the identifier selection options to offer users +when prompting the user to explicitly select the license identifier +they intended for their project. + +**Note**: Several additional classifiers, namely the "or later" variants of +the AGPLv3, GPLv2, GPLv3 and LGPLv3, are also listed in the aforementioned +mapping, but as they were merely proposed for textual harmonization and +still unambiguously map to their respective licenses, +they were not included here; LGPLv2 is, however, as it could ambiguously +refer to either the distinct v2.0 or v2.1 variants of that license. 
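The classifier-based conversion rules can be sketched as follows. Both tables are deliberately small, illustrative subsets: ``AMBIGUOUS_CLASSIFIERS`` lists only a few of the classifiers enumerated above, and ``CLASSIFIER_TO_SPDX`` is a hypothetical mapping with just two entries. The function ``infer_expression`` is likewise an assumed name. It covers only the single-classifier path and returns ``None`` whenever the user would need to confirm the choice.

```python
import warnings

# Subset of the version-ambiguous classifiers listed above.
AMBIGUOUS_CLASSIFIERS = frozenset({
    "License :: OSI Approved :: Apache Software License",
    "License :: OSI Approved :: BSD License",
    "License :: OSI Approved :: GNU General Public License (GPL)",
    "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
})

# Illustrative subset of unambiguous mappings.
CLASSIFIER_TO_SPDX = {
    "License :: OSI Approved :: MIT License": "MIT",
    "License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)": "MPL-2.0",
}


def infer_expression(classifiers, license_text=None):
    """Return an SPDX identifier if it can be inferred unambiguously."""
    license_classifiers = [
        item for item in classifiers if item.startswith("License ::")
    ]
    # Zero or several license classifiers: their relationship is unknown.
    if len(license_classifiers) != 1:
        return None
    classifier = license_classifiers[0]
    if classifier in AMBIGUOUS_CLASSIFIERS:
        return None
    spdx_id = CLASSIFIER_TO_SPDX.get(classifier)
    if spdx_id is None:
        return None
    # A legacy license text, if given, must match exactly (case-sensitive,
    # ignoring surrounding whitespace) to count as unambiguous.
    if license_text is not None and license_text.strip() != spdx_id:
        return None
    warnings.warn(
        f"Inferred License-Expression {spdx_id!r}; please add it to your "
        "project source metadata"
    )
    return spdx_id
```

A ``None`` result is the signal for a tool to stop and prompt the user rather than guess; the warning on success corresponds to the requirement that authors be told which expression was filled in.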
+ +In addition, for the various special cases, the following mappings are +considered canonical and normative for the purposes of this specification: + +- Classifier ``License :: Public Domain`` MAY be mapped to the generic + ``License-Expression: LicenseRef-Public-Domain``. + If tools do so, they SHOULD issue an informational warning encouraging + the use of more explicit and legally portable license identifiers, + such as those for the `CC0 1.0 license <#cc0_>`_ (``CC0-1.0``), + the `Unlicense <#unlicense_>`_ (``Unlicense``), + or the `MIT license <#mitlicense_>`_ (``MIT``), + since the meaning associated with the term "public domain" is thoroughly + dependent on the specific legal jurisdiction involved, + some of which lack the concept entirely. + Alternatively, tools MAY choose to treat these classifiers as ambiguous and + require user confirmation to fill ``License-Expression`` in these cases. - The generic and sometimes ambiguous classifiers - ``License :: Free For Educational Use``, ``License :: Free For Home Use``, - ``License :: Free for non-commercial use``, ``License :: Freely Distributable``, - ``License :: Free To Use But Restricted``, and ``License :: Freeware`` are mapped - to the generic License: ``LicenseRef-Proprietary`` expression. + ``License :: Free For Educational Use``, + ``License :: Free For Home Use``, + ``License :: Free for non-commercial use``, + ``License :: Freely Distributable``, + ``License :: Free To Use But Restricted``, + ``License :: Freeware``, and + ``License :: Other/Proprietary License`` MAY be mapped to the generic + ``License-Expression: LicenseRef-Proprietary``, + but tools MUST issue a prominent, informative warning if they do so. + Alternatively, tools MAY choose to treat these classifiers as ambiguous and + require user confirmation to fill ``License-Expression`` in these cases. 
-- Classifiers ``License :: GUST*`` have no mapping to SPDX license identifierss - for now and no package uses them in PyPI as of the writing of this PEP. +- The generic and ambiguous classifiers ``License :: OSI Approved`` and + ``License :: DFSG approved`` do not map to any license expression, + and thus tools MUST treat them as ambiguous and require user intervention + to fill ``License-Expression``. -The remainder of the classifiers using a ``License ::`` prefix map to a simple -single-identifier license expression using the corresponding SPDX license identifiers. +- The classifiers ``License :: GUST Font License 1.0`` and + ``License :: GUST Font License 2006-09-30`` have no mapping to SPDX license + identifiers and no PyPI package uses them, as of the writing of this PEP. + Therefore, tools MUST treat them as ambiguous when attempting to fill + ``License-Expression``. -When multiple license-related classifiers are used, their relation is ambiguous +When multiple license classifiers are used, their relationship is ambiguous, and it is typically not possible to determine if all the licenses apply or if there is a choice that is possible among the licenses. In this case, tools -cannot reliably infer a license expression and should suggest that the package -author construct a license expression which expresses their intent. +MUST NOT automatically infer a license expression, and SHOULD suggest that the +package author construct one which expresses their intent. -Summary of Differences From PEP 566 -=================================== +User Scenarios +============== -* Metadata-Version is now 2.2. -* Added one new field: ``License-File`` -* Updated the documentation of two fields: ``License`` and ``Classifier`` +The following covers the range of common use cases from a user perspective, +providing straightforward guidance for each. 
Do note that the following +should **not** be considered legal advice, and readers should consult a +licensed attorney if they are unsure about the specifics for their situation. + + +I have a private package that won't be distributed +-------------------------------------------------- + +If your package isn't shared publicly, i.e. outside your company, +organization or household, it *usually* isn't strictly necessary to include +a formal license, so you wouldn't necessarily have to do anything extra here. + +However, it is still a good idea to include ``LicenseRef-Proprietary`` +as a license expression in your package configuration, and/or a +copyright statement and any legal notices in a ``LICENSE.txt`` file +in the root of your project directory, which will be automatically +included by packaging tools. + + +I just want to share my own work without legal restrictions +----------------------------------------------------------- + +While you aren't required to include a license, if you don't, no one has +`any permission to download, use or improve your work <#dontchoosealicense_>`_, +so that's probably the *opposite* of what you actually want. +The `MIT license <#mitlicense_>`_ is a great choice instead, as it's simple, +widely used and allows anyone to do whatever they want with your work +(other than sue you, which you probably also don't want). + +To apply it, just paste `the text <#chooseamitlicense_>`_ into a file named +``LICENSE.txt`` at the root of your repo, and add the year and your name to +the copyright line. Then, just add ``license-expression = "MIT"`` under +``[project]`` in your ``pyproject.toml`` if your packaging tool supports it, +or in its config file/section (e.g. Setuptools ``license_expression = MIT`` +under ``[metadata]`` in ``setup.cfg``). You're done! 
+ + +I want to distribute my project under a specific license +-------------------------------------------------------- + +To use a particular license, simply paste its text into a ``LICENSE.txt`` +file at the root of your repo, if you don't have it in a file starting with +``LICENSE`` or ``COPYING`` already, and add +``license-expression = "LICENSE-ID"`` under ``[project]`` in your +``pyproject.toml`` if your packaging tool supports it, or else in its +config file (e.g. for Setuptools, ``license_expression = LICENSE-ID`` +under ``[metadata]`` in ``setup.cfg``). You can find the ``LICENSE-ID`` +and copyable license text on sites like +`ChooseALicense <#choosealicenselist_>`_ or `SPDX <#spdxlist_>`_. + +Many popular code hosts, project templates and packaging tools can add the +license file for you, and may support the expression as well in the future. + + +I maintain an existing package that's already licensed +------------------------------------------------------ + +If you already have license files and metadata in your project, you +should only need to make a couple of tweaks to take advantage of the new +functionality. + +In your project config file, enter your license expression under +``license-expression`` (PEP 621 ``pyproject.toml``), ``license_expression`` +(Setuptools ``setup.cfg`` / ``setup.py``), or the equivalent for your +packaging tool, and make sure to remove any legacy ``license`` value or +``License ::`` classifiers. Your existing ``license`` value may already +be valid as one (e.g. ``MIT``, ``Apache-2.0 OR BSD-2-Clause``, etc); +otherwise, check the `SPDX license list <#spdxlist_>`_ for the identifier +that matches the license used in your project. + +If your license files begin with ``LICENSE``, ``COPYING``, ``NOTICE`` or +``AUTHORS``, or you've already configured your packaging tool to add them +(e.g. ``license_files`` in ``setup.cfg``), you should already be good to go. 
+If not, make sure to list them under ``license-files.paths`` +or ``license-files.globs`` under ``[project]`` in ``pyproject.toml`` +(if your tool supports it), or else in your tool's configuration file +(e.g. ``license_files`` in ``setup.cfg`` for Setuptools). + +See the `basic example`_ for a simple but complete real-world demo of how +this works in practice, including some additional technical details. +Packaging tools may support automatically converting legacy licensing +metadata; check your tool's documentation for more information. + + +My package includes other code under different licenses +------------------------------------------------------- + +If your project includes code from others covered by different licenses, +such as vendored dependencies or files copied from other open source +software, you can construct a license expression (or have a tool +help you do so) to describe the licenses involved and the relationship +between them. + +In short, ``License-1 AND License-2`` means that *both* licenses apply +to your project, or parts of it (for example, you included a file +under another license), and ``License-1 OR License-2`` means that +*either* of the licenses can be used, at the user's option (for example, +you want to allow users a choice of multiple licenses). You can use +parentheses (``()``) for grouping to form expressions that cover even the most +complex situations. + +In your project config file, enter your license expression under +``license-expression`` (PEP 621 ``pyproject.toml``), ``license_expression`` +(Setuptools ``setup.cfg`` / ``setup.py``), or the equivalent for your +packaging tool, and make sure to remove any legacy ``license`` value or +``License ::`` classifiers. + +Also, make sure you add the full license text of all the licenses as files +somewhere in your project repository. If all of them are in the root directory +and begin with ``LICENSE``, ``COPYING``, ``NOTICE`` or ``AUTHORS``, +they will be included automatically.
Otherwise, you'll need to list the +relative paths or glob patterns for each of them under ``license-files.paths`` +or ``license-files.globs`` under ``[project]`` in ``pyproject.toml`` +(if your tool supports it), or else in your tool's configuration file +(e.g. ``license_files`` in ``setup.cfg`` for Setuptools). + +As an example, if your project was licensed MIT but incorporated +a vendored dependency (say, ``packaging``) that was licensed under +either Apache 2.0 or the 2-clause BSD, your license expression would +be ``MIT AND (Apache-2.0 OR BSD-2-Clause)``. You might have a +``LICENSE.txt`` in your repo root, and a ``LICENSE-APACHE.txt`` and +``LICENSE-BSD.txt`` in the ``_vendor`` subdirectory, so to include +all of them, you'd specify ``["LICENSE.txt", "_vendor/LICENSE*"]`` +as glob patterns, or +``["LICENSE.txt", "_vendor/LICENSE-APACHE.txt", "_vendor/LICENSE-BSD.txt"]`` +as literal file paths. + +See a fully worked out `advanced example`_ for a comprehensive end-to-end +application of this to a real-world complex project, with copious technical +details, and consult a `tutorial <#spdxtutorial_>`_ for more help and examples +using SPDX identifiers and expressions. Backwards Compatibility ======================= -The reuse of the ``License`` field means that we keep backward -compatibility. The specification of the ``License-File`` field is only writing -down the practices of the ``wheel`` and ``setuptools`` tools and is backward -compatible with their support for that field. - -The "soft" validation of the ``License`` field when it does not contain a valid -license expression and when the ``Classifier`` field is used with legacy -license-related classifiers means that we can gently prepare users for possible -strict and incompatible validation of these fields in the future.
+Adding a new, dedicated ``License-Expression`` core metadata field and +``license-expression`` PEP 621 source metadata key unambiguously signals +support for the specification in this PEP. This avoids the risk of new tooling +misinterpreting a license expression as a free-form license description +or vice versa, and raises an error if and only if the user affirmatively +upgrades to the latest metadata version and adds the new field/key. + +The legacy ``License`` core metadata field and ``license`` PEP 621 source +metadata key will be deprecated along with the license classifiers, +retaining backwards compatibility while gently preparing users for their +future removal. Such a removal would follow a suitable transition period, and +be left to a future PEP and a new version of the core metadata specification. + +Formally specifying the new ``License-File`` core metadata field and the +inclusion of the listed files in the distribution merely codifies and +refines the existing practices in popular packaging tools, including the Wheel +and Setuptools projects, and is designed to be largely backwards-compatible +with their existing use of that field. Likewise, the new ``license-files`` +PEP 621 source metadata key standardizes statically specifying the files +to include, as well as the default behavior, and allows other tools to +make use of them, while only having an effect once users and tools expressly +adopt it. + +Due to requiring that license files not be flattened into ``.dist-info`` and +specifying that they should be placed in a dedicated ``license_files`` subdir, +wheels produced following this change will have differently-located +licenses relative to those produced via the previous unspecified, +installer-specific behavior. However, as until this PEP there was no way of +discovering these files or accessing them programmatically, and this will +be further discriminated by a new metadata version, there is no foreseen +mechanism by which this would pose a practical issue.
+ +Furthermore, this resolves existing compatibility issues with the current +ad hoc behavior, namely license files being silently clobbered if they have +the same names as others at different paths, unknowingly rendering the wheel +undistributable, and conflicting with the names of other metadata files in +the same directory. Formally specifying otherwise would in fact block full +forward compatibility with additional standard or installer-specified files +and directories added to ``.dist-info``, as they too could conflict with +the names of existing licenses. + +While minor additions will be made to the source distribution (sdist), +built distribution (wheel) and installed project specifications, all of these +are merely documenting, clarifying and formally specifying behaviors explicitly +allowed under their current respective specifications, and already implemented +in practice, and gating them behind the explicit presence of both the new +metadata versions and the new fields. In particular, sdists may contain +arbitrary files following the project source tree layout, and formally +mentioning that these must include the license files listed in the metadata +merely documents and codifies existing Setuptools practice. Likewise, arbitrary +installer-specific files are allowed in the ``.dist-info`` directory of wheels +and copied to installed projects, and again this PEP just formally clarifies +and standardizes what is already being done. + +Finally, while this PEP does propose PyPI implement validation of the new +``License-Expression`` and ``License-File`` fields, this has no effect on +existing packages, nor any effect on any new distributions uploaded unless they +explicitly choose to opt in to using these new fields while not +following the requirements in the specification. 
Therefore, this does not have +a backward compatibility impact, and in fact ensures forward compatibility with +any future changes by ensuring all distributions uploaded to PyPI with the new +fields are valid and conform to the specification. Security Implications ===================== -This PEP has no foreseen security implications: the License field is a plain -string and the License-File(s) are file paths. None of them introduces any new -security concern. +This PEP has no foreseen security implications: the ``License-Expression`` +field is a plain string and the ``License-File`` fields are file paths. +Neither introduces any known new security concerns. -How to Teach Users to Use License Expressions -============================================= +How to Teach This +================= The simple cases are simple: a single license identifier is a valid license -expression and a large majority of packages use a single license. +expression, and a large majority of packages use a single license. The plan to teach users of packaging tools how to express their package's license with a valid license expression is to have tools issue informative -messages when they detect invalid license expressions or when a license-related -classifier is used in the ``Classifier`` field. - -With a warning message that does not terminate processing, publishing tools will -gently teach users how to provide correct license expressions over time. - -Tools may also help with the conversion and suggest a license expression in some -cases: - -1. The section `Mapping Legacy Classifiers to New License expressions`_ provides - tool authors with guidelines on how to suggest a license expression produced - from legacy classifiers. - -2. Tools may also be able to infer and suggest how to update an existing - incorrect ``License`` value and convert that to a correct license expression. 
- For instance a tool may suggest to correct a ``License`` field from - ``Apache2`` (which is not a valid license expression as defined in this PEP) - to ``Apache-2.0`` (which is a valid license expression using an SPDX license - identifier as defined in this PEP). +messages when they detect invalid license expressions, or when the deprecated +``License`` field or license classifiers are used. + +An immediate, descriptive error message if an invalid ``License-Expression`` +is used will help users understand they need to use SPDX identifiers in +this field, and catch them if they make a mistake. +For authors still using the now-deprecated, less precise and more redundant +``License`` field or license classifiers, packaging tools will warn +them and inform them of the modern replacement, ``License-Expression``. +Finally, for users who may have forgotten or not be aware they need to do so, +publishing tools will gently guide them toward including ``license-expression`` +and ``license-files`` in their project source metadata. + +Tools may also help with the conversion and suggest a license expression in +many, if not most common cases: + +- The section `Mapping license classifiers to SPDX identifiers`_ provides + tool authors with guidelines on how to suggest a license expression produced + from legacy classifiers. + +- Tools may also be able to infer and suggest how to update an existing + ``License`` value and convert that to a ``License-Expression``. + For instance, a tool may suggest converting from a ``License`` field with + ``Apache2`` (which is not a valid license expression as defined in this PEP) + to a ``License-Expression`` field with ``Apache-2.0`` (which is a valid + license expression using an SPDX license identifier). Reference Implementation ======================== Tools will need to support parsing and validating license expressions in the -``License`` field. +``License-Expression`` field. 
-The ``license-expression`` library [#licexp]_ is a reference Python -implementation of a library that handles license expressions including parsing, -validating and formatting license expressions using flexible lists of license -symbols (including SPDX license identifiers and any extra identifiers referenced -here). It is licensed under the Apache-2.0 license and is used in a few projects -such as the SPDX Python tools [#spdxpy]_, the ScanCode toolkit [#scancodetk]_ -and the Free Software Foundation Europe (FSFE) Reuse project [#reuse]_. +The `license-expression library <#licenseexplib_>`_ is a reference Python +implementation that handles license expressions including parsing, +formatting and validation, using flexible lists of license symbols +(including SPDX license IDs and any extra identifiers included here). +It is licensed under Apache-2.0 and is already used in several projects, +including the `SPDX Python Tools <#spdxpy_>`_, +the `ScanCode toolkit <#scancodetk_>`_ +and the Free Software Foundation Europe (FSFE) `REUSE project <#reuse_>`_. -Rejected ideas +Rejected Ideas ============== -1. Use a new ``License-Expression`` field and deprecate the ``License`` field. - -Adding a new field would introduce backward incompatible changes when the -``License`` field would be retired later and require having more complex -validation. The use of such a field would further introduce a new concept that -is not seen anywhere else in any other package metadata (e.g. a new field only -for license expression) and possibly be a source of confusion. Also, users are -less likely to start using a new field than make small adjustments to their use -of existing fields. - - -2. Mapping licenses used in the license expression to specific files in the - license files (or vice versa). 
+Core metadata fields +-------------------- -This would require using a mapping (two parallel lists would be too prone to -alignment errors) and a mapping would bring extra complication to how license -are documented by adding an additional nesting level. +Potential alternatives to the structure, content and deprecation of the +core metadata fields specified in this PEP. + + +Re-use the ``License`` field +'''''''''''''''''''''''''''' + +Following `initial discussion <#reusediscussion_>`_, earlier versions of this +PEP proposed re-using the existing ``License`` field, which tools would +attempt to parse as an SPDX license expression with a fallback to free text. +Initially, this would merely cause a warning (or even pass silently), +but would eventually be treated as an error by modern tooling. + +This offered the potential benefit of greater backwards-compatibility, +easing the community into using SPDX license expressions while taking advantage +of packages that already have them (either intentionally or coincidentally), +and avoided adding yet another license-related field. + +However, following substantial discussion, consensus was reached that a +dedicated ``License-Expression`` field was the preferred overall approach. +The presence of this field is an unambiguous signal that a package +intends it to be interpreted as a valid SPDX identifier, without the need +for complex and potentially erroneous heuristics, and allows tools to +easily and unambiguously detect invalid content. + +This avoids both false positives (``License`` values that a package author +didn't intend as an explicit SPDX identifier, but that happen +to validate as one), and false negatives (expressions the author intended +to be valid SPDX, but due to a typo or mistake are not), which are otherwise +not clearly distinguishable from true positives and negatives, an ambiguity +at odds with the goals of this PEP.
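
To illustrate the point about presence rather than heuristics, here is a minimal sketch of how a tool might branch on which license fields a distribution declares. The function name ``classify_license_metadata`` and the sample metadata strings are hypothetical and not part of this PEP; the sketch only assumes that core metadata is parsed as email-style headers, as existing tools commonly do.

```python
from email import message_from_string


def classify_license_metadata(metadata_text: str) -> str:
    """Classify a core metadata document by which license fields it declares.

    Hypothetical helper: the decision depends only on which fields are
    present, never on guessing whether a value looks like an SPDX expression.
    """
    msg = message_from_string(metadata_text)

    # The dedicated field is an explicit opt-in to license expressions.
    if msg.get("License-Expression") is not None:
        return "expression"

    # Legacy metadata: free-form ``License`` text and/or license classifiers.
    classifiers = msg.get_all("Classifier", [])
    if msg.get("License") is not None or any(
        value.startswith("License ::") for value in classifiers
    ):
        return "legacy"

    return "none"


# Illustrative inputs (assumed, not taken from a real package).
NEW_STYLE = "Metadata-Version: 2.3\nName: example\nLicense-Expression: MIT\n"
LEGACY = "Metadata-Version: 2.1\nName: example\nLicense: MIT\n"

new_kind = classify_license_metadata(NEW_STYLE)
legacy_kind = classify_license_metadata(LEGACY)
```

In this sketch the legacy ``License: MIT`` value is never interpreted as an expression, even though it happens to be a valid SPDX identifier; only the presence of ``License-Expression`` triggers that handling.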
+ +Furthermore, it allows both the existing ``License`` field and +the license classifiers to be more easily deprecated, +with tools able to cleanly distinguish between packages intending to +affirmatively conform to the updated specification in this PEP or not, +and adapt their behavior (warnings, errors, etc) accordingly. +Otherwise, tools would either have to allow duplicative and potentially +conflicting ``License`` fields and classifiers, or warn/error on the +substantial number of existing packages that have SPDX identifiers as the +value for the ``License`` field, intentionally or otherwise (e.g. ``MIT``). + +Finally, it avoids changing the behavior of an existing metadata field, +and avoids tools having to guess the ``Metadata-Version`` and field behavior +based on its value rather than merely its presence. + +While this would mean the subset of existing distributions containing +``License`` fields valid as SPDX license expressions wouldn't automatically be +recognized as such, this only requires appending a few characters to the key +name in the project's source metadata, and this PEP provides extensive +guidance on how this can be done automatically by tooling. + +Given all this, it was decided to proceed with defining a new, +purpose-created field, ``License-Expression``. + + +Re-Use the ``License`` field with a value prefix +'''''''''''''''''''''''''''''''''''''''''''''''' + +As an alternative to the above, prefixing SPDX license expressions with, +e.g. ``spdx:`` was suggested to reduce the ambiguity inherent in re-using +the ``License`` field. However, this effectively amounted to creating +a field within a field, and doesn't address all the downsides of +keeping the ``License`` field. Namely, it still changes the behavior of an +existing metadata field, requires tools to parse its value +to determine how to handle its content, and makes the specification and +deprecation process more complex and less clean. 
+ +Yet, it still shares the same main potential downside as just creating a new +field: projects currently using valid SPDX identifiers in the ``License`` +field, intentionally or not, won't be automatically recognized, and it requires +about the same amount of effort to fix, namely changing a line in the +project's source metadata. Therefore, it was rejected in favor of a new field. + + +Don't make ``License-Expression`` mutually exclusive +'''''''''''''''''''''''''''''''''''''''''''''''''''' + +For backwards compatibility, the ``License`` field and/or the license +classifiers could still be allowed together with the new +``License-Expression`` field, presumably with a warning. However, this +could easily lead to inconsistent, and at the very least duplicative +license metadata in no less than *three* different fields, which is +squarely contrary to the goals of this PEP of making the licensing story +simpler and unambiguous. Therefore, and in concert with clear community +consensus otherwise, this idea was soundly rejected. + + +Don't deprecate existing ``License`` field and classifiers +'''''''''''''''''''''''''''''''''''''''''''''''''''''''''' + +Several community members were initially concerned that deprecating the +existing ``License`` field and classifiers would result in +excessive churn for existing package authors and raise the barrier to +entry for new ones, particularly everyday Python developers seeking to +package and publish their personal projects without necessarily caring +too much about the legal technicalities or being a "license lawyer". +Indeed, every deprecation comes with some non-zero short-term cost, +and should be carefully considered relative to the overall long-term +net benefit. And at the minimum, this change shouldn't make it more +difficult for the average Python developer to share their work under +a license of their choice, and ideally improve the situation.
+ +Following many rounds of proposals, discussion and refinement, +the general consensus was clearly in favor of deprecating the legacy +means of specifying a license, in favor of "one obvious way to do it", +to improve the currently complex and fragmented story around license +documentation. Not doing so would leave three different un-deprecated ways of +specifying a license for a package, two of them ambiguous, less than +clear/obvious how to use, inconsistently documented and out of date. +This is more complex for all tools in the ecosystem to support +indefinitely (rather than simply installers supporting older packages +implementing previous frozen metadata versions), resulting in a non-trivial +and unbounded maintenance cost. + +Furthermore, it leads to a more complex and confusing landscape for users with +three similar but distinct options to choose from, particularly with older +documentation, answers and articles floating around suggesting different ones. +Of the three, ``License-Expression`` is the simplest and clearest to use +correctly; users just paste in their desired license identifier, or select it +via a tool, and they're done; no need to learn about Trove classifiers and +dig through the list to figure out which one(s) apply (and be confused +by many ambiguous options), or figure out on their own what should go +in the ``license`` key (anything from nothing, to the license text, +to a free-form description, to the same SPDX identifier they would be +entering in the ``license-expression`` key anyway, assuming they can +easily find documentation at all about it). In fact, this can be +made even easier thanks to the new field. For example, GitHub's popular +`ChooseALicense.com <#choosealicense_>`_ links to how to add SPDX license +identifiers to the project source metadata of various languages that support +them right in the sidebar of every license page; the SPDX support in this +PEP enables adding Python to that list. 
+ +For current package maintainers who have specified a ``License`` or license +classifiers, this PEP only recommends warnings and prohibits errors for +all but publishing tools, which are allowed to error if their intended +distribution platform(s) so requires. Once maintainers are ready to +upgrade, for those already using SPDX license expressions (accidentally or not) +this only requires appending a few characters to the key name in the +project's source metadata, and for those with license classifiers that +map to a single unambiguous license, or another defined case (public domain, +proprietary), they merely need to drop the classifier and paste in the +corresponding license identifier. This PEP provides extensive guidance and +examples, as will other resources, as well as explicit instructions for +automated tooling to take care of this with no human changes needed. +More complex cases where license metadata is currently specified may +need a bit of human intervention, but in most cases tools will be able +to provide a list of options following the mappings in this PEP, and +these are typically the projects most likely to be constrained by the +limitations of the existing license metadata, and thus most benefited +by the new fields in this PEP. + +Finally, for unmaintained packages, those using tools supporting older +metadata versions, or those who choose not to provide license metadata, +no changes are required regardless of the deprecation. + + +Don't mandate validating new fields on PyPI +''''''''''''''''''''''''''''''''''''''''''' + +Previously, while this PEP did include normative guidelines for packaging +publishing tools (such as Twine), it did not provide specific guidance +for PyPI (or other package indices) as to whether and how they +should validate the ``License-Expression`` or ``License-File`` fields, +nor how they should handle using them in combination with the deprecated +``License`` field or license classifiers.
This simplified the specification +and either deferred implementation on PyPI to a later PEP, or gave +discretion to PyPI to enforce the stated invariants, to minimize +disruption to package authors. + +However, this had been left unstated from before the ``License-Expression`` +field was separate from the existing ``License``, which would make +validation much more challenging and backwards-incompatible, breaking +existing packages. With that change, there was a clear consensus that +the new field should be validated from the start, guaranteeing that all +distributions uploaded to PyPI that declare core metadata version 2.3 +or higher and have the ``License-Expression`` field will have a valid +expression, such that PyPI and consumers of its packages and metadata +can rely upon it to follow the specification here. + +The same can be extended to the new ``License-File`` field as well, +to ensure that it is valid and the legally required license files are +present, and thus it is lawful for PyPI, users and downstream consumers +to distribute the package. (Of course, this makes no *guarantee* of such +as it is ultimately reliant on authors to declare them, but it improves +assurance of this and allows doing so in the future if the community so +decides.) To be clear, this would not require that any uploaded distribution +have such metadata, only that if they choose to declare it per the new +specification in this PEP, it is assured to be valid. + + +Source metadata ``license`` key +------------------------------- + +Alternate possibilities related to the ``license`` key in the +``pyproject.toml`` project source metadata specified in PEP 621. + + +Add ``expression`` and ``files`` subkeys to table +''''''''''''''''''''''''''''''''''''''''''''''''' + +A previous working draft of this PEP added ``expression`` and ``files`` subkeys +to the existing ``license`` table in the PEP 621 source metadata, to parallel +the existing ``file`` and ``text`` subkeys.
While this seemed perhaps the +most obvious approach at first glance, it had several serious drawbacks +relative to that ultimately taken here. + +Most saliently, this means two very different types of metadata are being +specified under the same top-level key that require very different handling, +and furthermore, unlike the previous arrangement, the subkeys were not mutually +exclusive and could both be specified at once, and with some subkeys potentially +being dynamic and others static, and mapping to different core metadata fields. +This also breaks from the consensus for the core metadata fields, namely to +separate the license expression into its own explicit field. + +Furthermore, this leads to a conflict with marking the key as ``dynamic`` +(assuming that is intended to specify PEP 621 keys, as that PEP seems to rather +imprecisely imply, rather than core metadata fields), as both would have +to be treated as ``dynamic`` or neither could be. A user may want to specify the ``expression`` +key as ``dynamic``, if they intend their tooling to generate it automatically; +conversely, they may rely on their build tool to dynamically detect license +files via means outside of that strictly specified here. And indeed, current +users may mark the present ``license`` key as ``dynamic`` to automatically +fill it in the metadata. Grouping all these uses under the same key forces an +"all or nothing" approach, and creates ambiguity as to user intent. + +There are further downsides to this as well. Both users and tools would need to +keep track of which fields are mutually exclusive with which of the others, +greatly increasing cognitive and code complexity, and in turn the probability +of errors. Conceptually, juxtaposing so many different fields under the +same key is rather jarring, and leads to a much more complex mapping between +PEP 621 keys and core metadata fields, not in keeping with PEP 621.
+This causes the PEP 621 naming and structure to diverge further from +both the core metadata and native formats of the various popular packaging +tools that use it. Finally, this results in the spec being significantly more +complex and convoluted to understand and implement than the alternatives. + +The approach this PEP now takes, adding distinct ``license-expression`` and +``license-files`` keys and simply deprecating the whole ``license`` key, avoids +all the issues identified above, and results in a much clearer and cleaner +design overall. It allows ``license`` and ``license-files`` to be tagged +``dynamic`` independently, separates two independent types of metadata +(syntactically and semantically), restores a closer to 1:1 mapping of +PEP 621 keys to core metadata fields, and reduces nesting by a level for both. +Other than adding two extra keys to the file, there was no significant +apparent downside to this latter approach, so it was adopted for this PEP. + + +Define license expression as string value +''''''''''''''''''''''''''''''''''''''''' + +A compromise approach between adding two new top-level keys for license +expressions and files would be adding a separate ``license-files`` key, +but re-using the ``license`` key for the license expression, either by +defining it as the (previously reserved) string value for the ``license`` +key, retaining the ``expression`` subkey in the ``license`` table, or +allowing both. Indeed, this would seem to have been envisioned by PEP 621 +itself with this PEP in mind, in particular the first approach:: + + A practical string value for the license key has been purposefully left + out to allow for a future PEP to specify support for SPDX expressions + (the same logic applies to any sort of "type" field specifying what + license the file or text represents). 
+ +However, while a working draft temporarily explored this solution, it was +ultimately rejected, as it shared most of the downsides identified with +adding new subkeys under the existing ``license`` table, as well as several +of its own, with again minimal advantage over separating both. + +Most importantly, it still means that per PEP 621, it is not possible to +separately mark the ``[project]`` keys corresponding to the ``License`` and +``License-Expression`` metadata fields as dynamic. This, in turn, still +renders specifying metadata following that standard incompatible with +conversion of legacy metadata, as specified in this PEP's +`Converting legacy metadata`_ section, as PEP 621 strictly prohibits the +``license`` key from being both present (to define the existing value of +the ``License`` field, or the path to a license file, and thus able to be +converted), and specified as ``dynamic`` (which would allow tools to +use the generated value for the ``License-Expression`` field). + +For the same reasons, this would make it impossible to back-fill the +``License`` field from the ``License-Expression`` field as this PEP +currently allows (without making an exception from strict +``dynamic`` behavior in this case), as again, marking ``license`` as dynamic +would mean it cannot be specified in the ``project`` table at all. + +Furthermore, this would mean existing project source metadata specifying +``license`` as ``dynamic`` would be ambiguous, as it would be impossible for +tools to statically determine if they are intended to conform to previous +metadata versions specifying ``License``, or this version specifying +``License-Expression``. Tools would have no way of determining which field, +if either, might be filled in the resulting distribution's core metadata.
+By contrast, the present approach makes clear what the author intended, +allows tools to unambiguously determine which field(s) may be dynamically +inserted, and ensures backward compatibility such that current project +source metadata do not unknowingly specify both the old and the new field +as dynamic, and instead must do so explicitly per PEP 621's intent. + +Additionally, while differences from existing tool formats (and core metadata +field names) have precedent in PEP 621 (though are best avoided if practical), +using a key with an identical name as in all current tools (and of an existing +core metadata field) to mean something different (and map to a different +core metadata field), with distinct and incompatible syntax and semantics, +does not, and is likely to create substantial confusion and ambiguity +for readers and authors, contrary to the fundamental goals of this PEP. + +Finally, this means that the top-level ``license`` key still maps to multiple +core metadata fields with different purposes and interpretation (``License`` +and ``License-Expression``); this would deny a clear separation from the +old behavior by not cleanly deprecating the ``license`` key, and +increases the complexity of the specification and implementation. + +In addition to the aforementioned issues, this also requires deciding between +the three individual approaches (``expression`` subkey, top-level string or +allowing both), all of which have further significant downsides and none of +which are clearly superior or more obvious, leading to needless bikeshedding. + +If the license expression was made the string value of the ``license`` key, +as reserved by PEP 621, it would be slightly shorter for users to type and +more obviously the preferred approach.
However, it is far *less* obvious that +it is a license expression at all, to authors and those viewing the files, +and this lack of clarity and explicitness, with the resulting ambiguity and +potential for user confusion, is exactly what this PEP seeks to avoid, all to save a few characters +over other approaches. + +If an ``expression`` subkey was added to the ``license`` table, it would retain +the clarity of a new top-level key, but add additional complexity for no +real benefit, with an extra level of nesting, and users and tools needing to +deal with the mutual exclusivity of the subkeys, as before. And allowing both +(as a table subkey *and* the string value) would inherit the downsides of both, +while adding even more spec and tool complexity and making there be more than +"one obvious way to do it", further potentially confusing users. + +Therefore, a separate top-level ``license-expression`` key was adopted to avoid +all these issues, with relatively minimal downside aside from adding a single +additional key and (versus some approaches) a few extra characters to type. + + +Add a ``type`` key to treat as expression +''''''''''''''''''''''''''''''''''''''''' + +Instead of creating a new top-level ``license-expression`` key in the +PEP 621 source metadata, one could add a ``type`` subkey to the existing +``license`` table to control whether ``text`` (or a string value) +is interpreted as free-text or a license expression. This could make +backward compatibility a little more seamless, as older tools could ignore +it and always treat ``text`` as ``license``, while newer tools would +know to treat it as a license expression, if ``type`` was set appropriately. +Indeed, PEP 621 seems to suggest something of this sort as a possible +alternative way that SPDX license expressions could be implemented.
+ +However, all the same downsides as in the previous item apply here, +including greater complexity, a more complex mapping between the project +source metadata and core metadata and inconsistency between the presentation +in tool config, PEP 621 and core metadata, a much less clean deprecation, +further bikeshedding over what to name it, and inability to mark one but +not the other as dynamic, among others. + +In addition, while theoretically potentially a little easier in the short +term, in the long term it would mean users would always have to remember +to specify the correct ``type`` to ensure their license expression is +interpreted correctly, which adds work and potential for error; we could +never safely change the default while being confident that users +understand that what they are entering is unambiguously a license expression, +with all the false positive and false negative issues as above. + +Therefore, for these as well as the same reasons this approach was rejected +for the core metadata in favor of a distinct ``License-Expression`` field, +we similarly reject this here. + + +Must be marked dynamic to back-fill +''''''''''''''''''''''''''''''''''' + +The ``license`` key in the ``pyproject.toml`` could be required to be +explicitly set to dynamic in order for the ``License`` core metadata field +to be automatically back-filled from the value of the ``license-expression`` +key. This would be more explicit that the filling will be done, as strictly +speaking the ``license`` key is not (and cannot be) specified in +``pyproject.toml``, and satisfies a stricter interpretation of the letter +of the current PEP 621 specification that this PEP revises. + +However, this isn't seen to be necessary, because it is simply using the +static, verbatim literal value of the ``license-expression`` key, as specified +strictly in this PEP.
Therefore, any conforming tool can trivially, +deterministically and unambiguously derive this using only the static data +in the ``pyproject.toml`` file itself. + +Furthermore, this actually adds significant ambiguity, as it means the value +could get filled arbitrarily by other tools, which would in turn compromise +and conflict with the value of the new ``License-Expression`` field, which is +why such is explicitly prohibited by this PEP. Therefore, not marking it as +``dynamic`` will ensure it is only handled in accordance with this PEP's +requirements. + +Finally, users explicitly being told to mark it as ``dynamic``, or not, to +control filling behavior seems to be a bit of a mis-use of the ``dynamic`` +field as apparently intended, and prevents tools from adapting to best +practices (fill, don't fill, etc) as they develop and evolve over time. + + +Source metadata ``license-files`` key +------------------------------------- + +Alternatives considered for the ``license-files`` key in the +PEP 621 project source metadata, primarily related to the +path/glob type handling. + + +Add a ``type`` subkey to ``license-files`` +'''''''''''''''''''''''''''''''''''''''''' + +Instead of defining mutually exclusive ``paths`` and ``globs`` subkeys +of the ``license-files`` PEP 621 project metadata key, we could +achieve the same effect with a ``files`` subkey for the list and +a ``type`` subkey for how to interpret it. However, the latter offers no +real advantage over the former, in exchange for requiring more keystrokes, +verbosity and complexity, as well as less flexibility in allowing both, +or another additional subkey in the future, as well as the need to bikeshed +over the subkey name. Therefore, it was summarily rejected. + + +Only accept verbatim paths +'''''''''''''''''''''''''' + +Globs could be disallowed completely as values to the ``license-files`` +key in ``pyproject.toml`` and only verbatim literal paths allowed. 
+This would ensure that all license files are explicitly specified, all +specified license files are found and included, and the source metadata +is completely static in the strictest sense of the term, without tools +having to inspect the rest of the project source files to determine exactly +what license files will be included and what the ``License-File`` values +will be. This would also modestly simplify the spec and tool implementation. + +However, practicality once again beats purity here. Globs are supported and +used by many existing tools for finding license files, and explicitly +specifying the full path to every license file would be unnecessarily tedious +for more complex projects with vendored code and dependencies. More +critically, it would make it much easier to accidentally miss a required +legal file, silently rendering the package illegal to distribute. + +Tools can still statically and consistently determine the files to be included, +based only on those glob patterns the user explicitly specified and the +filenames in the package, without installing it, executing its code or even +examining its files. Furthermore, tools are still explicitly allowed to warn +if specified glob patterns (including full paths) don't match any files. +And, of course, sdists, wheels and others will have the full static list +of files specified in their distribution metadata. + +Perhaps most importantly, this would also preclude the currently specified +default value, as widely used by the current most popular tools, and thus +be a major break to backward compatibility, tool consistency, and safe +and sane default functionality to avoid unintentional license violations. +And of course, authors are welcome and encouraged to specify their license +files explicitly via the ``paths`` table subkey, once they are aware of it and +if it is suitable for their project and workflow. 
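
The difference in handling between verbatim paths and glob patterns described above can be sketched as follows. This is an illustrative outline only, assuming the mutually exclusive ``paths`` and ``globs`` subkeys discussed in this section; the function name ``resolve_license_files`` and the sample project layout are hypothetical, and real build tools may apply further rules (such as excluding backup files).

```python
import tempfile
import warnings
from pathlib import Path


def resolve_license_files(root: Path, license_files: dict) -> list[str]:
    """Return sorted, POSIX-style relative paths of the license files.

    Hypothetical helper: ``license_files`` mimics the ``license-files``
    table, holding either a ``paths`` list or a ``globs`` list.
    """
    if "paths" in license_files and "globs" in license_files:
        raise ValueError("'paths' and 'globs' are mutually exclusive")

    if "paths" in license_files:
        # Verbatim paths: every listed file must exist, else fail early.
        for rel_path in license_files["paths"]:
            if not (root / rel_path).is_file():
                raise FileNotFoundError(f"license file not found: {rel_path}")
        return sorted(license_files["paths"])

    matched = set()
    for pattern in license_files.get("globs", []):
        hits = [path for path in root.glob(pattern) if path.is_file()]
        if not hits:
            # Tools are allowed (not required) to warn about unmatched globs.
            warnings.warn(f"glob {pattern!r} matched no files")
        matched.update(path.relative_to(root).as_posix() for path in hits)
    return sorted(matched)


# Build a throwaway project tree to demonstrate both modes.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "LICENSE.txt").write_text("MIT")
    (root / "vendor").mkdir()
    (root / "vendor" / "LICENSE-APACHE").write_text("Apache-2.0")

    globbed = resolve_license_files(
        root, {"globs": ["LICENSE*", "vendor/LICENSE*"]}
    )
    listed = resolve_license_files(root, {"paths": ["LICENSE.txt"]})
```

Only the file names in the source tree are consulted here; nothing is installed or executed, which matches the sense in which the glob-based metadata remains statically determinable.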
+ + +Only accept glob patterns +''''''''''''''''''''''''' -A mapping would be needed as you cannot guarantee that all expressions (e.g. -GPL with an exception may be in a single file) or all the license keys have a -single license file and that any expression does not have more than one. (e.g. -an Apache license ``LICENSE`` and its ``NOTICE`` file for instance are two -distinct files). Yet in most cases, there is a simpler "one license", "one or -more license files". In the rarer and more complex cases where there are many -licenses involved you can still use the proposed conventions at the cost of a -slight loss of clarity by not specifying which text file is for which license -identifier, but you are not forcing the more complex data model (e.g. a mapping) -on everyone that may not need it. +Conversely, all ``license-files`` strings could be treated as glob patterns. +This would slightly simplify the spec and implementation, avoid an extra level +of nesting, and more closely match the configuration format of existing tools. + +However, for the cost of a few characters, it ensures users are aware +whether they are entering globs or verbatim paths. Furthermore, allowing +license files to be specified as literal paths avoids edge cases, such as those +containing glob characters (or those confusingly or even maliciously similar +to them, as described in PEP 672). + +Including an explicit ``paths`` value ensures that the resulting +``License-File`` metadata is correct, complete and purely static in the +strictest sense of the term, with all license paths explicitly specified +in the ``pyproject.toml`` file, guaranteed to be included and with an early +error should any be missing. This is not practical to do, at least without +serious limitations for many workflows, if we must assume the items +are glob patterns rather than literal paths. 
+ +This allows tools to locate them and know the exact values of the +``License-File`` core metadata fields without having to traverse the +source tree of the project and match globs, potentially allowing easier, +more efficient and reliable programmatic inspection and processing. + +Therefore, given the relatively small cost and the significant benefits, +this approach was not adopted. + + +Infer whether paths or globs +'''''''''''''''''''''''''''' + +It was considered whether to simply allow specifying an array of strings +directly for the ``license-files`` key, rather than making it a table with +explicit ``paths`` and ``globs``. This would be somewhat simpler and avoid +an extra level of nesting, and more closely match the configuration format +of existing tools. However, it was ultimately rejected in favor of separate, +mutually exclusive ``paths`` and ``globs`` table subkeys. + +In practice, it only saves six extra characters in the ``pyproject.toml`` +(``license-files = [...]`` vs ``license-files.globs = [...]``), but allows +the user to more explicitly declare their intent, ensures they understand how +the values are going to be interpreted, and serves as an unambiguous indicator +for tools to parse them as globs rather than verbatim path literals. + +This, in turn, allows for more appropriate, clearly specified tool +behaviors for each case, many of which would be unreliable or impossible +without it, to avoid common traps, provide more helpful feedback and +behave more sensibly and intuitively overall. These include, with ``paths``, +guaranteeing that each and every specified file is included and immediately +raising an error if one is missing, and with ``globs``, checking glob syntax, +excluding unwanted backup, temporary, or other such files (as current tools +already do), and optionally warning if a glob doesn't match any files. +This also avoids edge cases (e.g. 
paths that contain glob characters) and +reliance on heuristics to determine interpretation, the very thing this PEP +seeks to avoid. + + +Also allow a flat array value +''''''''''''''''''''''''''''' + +Initially, after deciding to define ``license-files`` as a table of ``paths`` +and ``globs``, thought was given to making a top-level string array under the +``license-files`` key mean one or the other (probably ``globs``, to match most +current tools). This is slightly shorter and simpler, would allow gently +nudging users toward a preferred one, and allow a slightly cleaner handling of +the empty case (which, at present, is treated identically for either). + +However, this again only saves six characters in the best case, and there +isn't an obvious choice, either from the perspective of preference (both have +clear use cases and benefits) or as to which one users would naturally +assume. + +Flat may be better than nested, but in the face of ambiguity, users +may not resist the temptation to guess. Requiring users to explicitly specify +one or the other ensures they are aware of how their inputs will be handled, +and is more readable for others, both human and machine alike. Allowing a flat +array also makes the spec and tool implementation slightly more complicated, +and it can always be added in the future, but not removed without breaking +backward compatibility. And finally, for the "preferred" option, it means there +is more than one obvious way to do it. + +Therefore, per PEP 20, the Zen of Python, this approach is hereby rejected. + + +Allow both ``paths`` and ``globs`` subkeys +'''''''''''''''''''''''''''''''''''''''''' + +Allowing both ``paths`` and ``globs`` subkeys to be specified under the +``license-files`` table was considered, as it could potentially allow +more flexible handling for particularly complex projects, and specify on a +per-pattern rather than overall basis whether ``license-files`` entries +should be treated as ``paths`` or ``globs``. 
+ +However, given that the existing proposed approach already matches or exceeds +the power and capabilities of those offered in tools' config files, and that +there isn't clear demand for this and few likely cases that would benefit, it +adds a large amount of complexity for relatively minimal gain, in terms of the +specification, in tool implementations and in ``pyproject.toml`` itself. + +There would be many more edge cases to deal with, such as how to handle files +matched by both lists, and it conflicts in multiple places with the current +specification for how tools should behave with one or the other, such as when +no files match, guarantees of all files being included and of the file paths +being explicitly, statically specified, and others. + +As with the previous alternative, if there is a clear need for it, it can +always be allowed in the future in a backward-compatible manner (to the extent +it is possible in the first place), while the same is not true of disallowing it. +Therefore, it was decided to require the two subkeys to be mutually exclusive. + + +Rename ``paths`` subkey to ``files`` +'''''''''''''''''''''''''''''''''''' + +Initially, it was considered whether to name the ``paths`` subkey of the +``license-files`` table ``files`` instead. However, ``paths`` was ultimately +chosen, as calling the table subkey ``files`` resulted in duplication between +the table name (``license-files``) and the subkey name (``files``), i.e. +``license-files.files = ["LICENSE.txt"]``, made it seem like the preferred/ +default subkey when it was not, and lacked the same parallelism with ``globs`` +in describing the format of the string entry rather than what was being +pointed to. 
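The differing tool behaviors for the mutually exclusive ``paths`` and ``globs`` subkeys discussed in the preceding alternatives can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names ``resolve_license_files`` and ``_glob_matches`` are hypothetical, the glob rule is a simplified segment-by-segment match, and the default patterns used when the key is absent are deliberately left out.

```python
import warnings
from fnmatch import fnmatchcase


def _glob_matches(pattern, path):
    """Simplified glob rule (an assumption): compare segment by segment."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    return len(pattern_parts) == len(path_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


def resolve_license_files(table, project_files):
    """Resolve a ``license-files`` table against a list of project paths."""
    if "paths" in table and "globs" in table:
        raise ValueError("'paths' and 'globs' are mutually exclusive")
    available = set(project_files)
    if "paths" in table:
        # Literal paths: every listed file must exist, else fail early.
        missing = [path for path in table["paths"] if path not in available]
        if missing:
            raise FileNotFoundError(f"license files not found: {missing}")
        return list(table["paths"])
    matched = []
    for pattern in table.get("globs", []):
        hits = sorted(p for p in available if _glob_matches(pattern, p))
        if not hits:
            # Globs: an unmatched pattern is only worth a warning.
            warnings.warn(f"glob {pattern!r} matched no files")
        matched.extend(hit for hit in hits if hit not in matched)
    return matched
```

With ``paths``, a missing file is a hard error; with ``globs``, an unmatched pattern only produces a warning, mirroring the distinction drawn above.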
+ + +Must be marked dynamic to use defaults +'''''''''''''''''''''''''''''''''''''' + +It may seem outwardly sensible, at least with a particularly restrictive +interpretation of PEP 621's description of the ``dynamic`` list, to +consider requiring the ``license-files`` key to be explicitly marked as +``dynamic`` in order for the default glob patterns to be used, or alternatively +for license files to be matched and included at all. + +However, this is merely declaring a static, strictly-specified default value +for this particular key, required to be used exactly by all conforming tools +(so long as it is not marked ``dynamic``, negating this argument entirely), +and is no less static than any other set of glob patterns the user themself +may specify. Furthermore, the resulting ``License-File`` core metadata values +can still be determined with only a list of files in the source, without +installing or executing any of the code, or even inspecting file contents. + +Moreover, even if this were not so, practicality would trump purity, as this +interpretation would be strictly backwards-incompatible with the existing +format, and be inconsistent with the behavior of existing tools. +Further, this would create a very serious and likely risk of a large number of +projects unknowingly no longer including legally mandatory license files, +making their distribution technically illegal, and is thus not a sane, +much less sensible default. + +Finally, aside from avoiding an additional line of boilerplate that would be +required by default, not defining the default as dynamic allows authors to +clearly and unambiguously indicate when their build/packaging tools are going +to be handling the inclusion of license files themselves rather than strictly +conforming to the PEP 621 portions of this PEP; to do otherwise would defeat +the primary purpose of the ``dynamic`` list as a marker and escape hatch. 
+ + +License file paths +------------------ + +Alternatives related to the paths and locations of license files in the source +and built distributions. + + +Flatten license files in subdirectories +''''''''''''''''''''''''''''''''''''''' + +Previous drafts of this PEP were silent on the issue of handling license files +in subdirectories. Currently, the `Wheel <#wheelfiles_>`_ and (following its +example) `Setuptools <#setuptoolsfiles_>`_ projects flatten all license files +into the ``.dist-info`` directory without preserving the source subdirectory +hierarchy. + +While this is the simplest approach and matches existing ad hoc practice, +this can result in name conflicts and license files clobbering others, +with no obvious defined behavior for how to resolve them, and leaving the +package legally un-distributable without any clear indication to users that +their specified license files have not been included. + +Furthermore, this leads to inconsistent relative file paths for non-root +license files between the source, sdist and wheel, and prevents the paths +given in the PEP 621 "static" metadata from being truly static, as they need +to be flattened, and may potentially overwrite one another. Finally, +the source directory structure often implies valuable information about +what the licenses apply to, and where to find them in the source, +which is lost when flattening them and far from trivial to reconstruct. + +To resolve this, the PEP now proposes, as did contributors on both of the +above issues, reproducing the source directory structure of the original +license files inside the ``.dist-info`` directory. This would fully resolve the +concerns above, with the only downside being a more nested ``.dist-info`` +directory. There is still a risk of collision with edge-case custom +filenames (e.g. 
``RECORD``, ``METADATA``), but that is also the case +with the previous approach, and in fact with fewer files flattened +into the root, this would actually reduce the risk. Furthermore, +the following proposal rooting the license files under a ``license_files`` +subdirectory eliminates both collisions and the clutter problem entirely. + + +Resolve name conflicts differently +'''''''''''''''''''''''''''''''''' + +Rather than preserving the source directory structure for license files +inside the ``.dist-info`` directory, we could specify some other mechanism +for conflict resolution, such as pre- or appending the parent directory name +to the license filename, traversing up the tree until the name was unique, +to avoid excessively nested directories. + +However, this would not address the path consistency issues, would require +much more discussion, coordination and bikeshedding, and further complicate +the specification and the implementations. Therefore, it was rejected in +favor of the simpler and more obvious solution of just preserving the +source subdirectory layout, as many stakeholders have already advocated for. + + +Dump directly in ``.dist-info`` +''''''''''''''''''''''''''''''' + +Previously, the included license files were stored directly in the top-level +``.dist-info`` directory of built wheels and installed projects. This followed +existing ad hoc practice, ensured most existing wheels currently using this +feature will match new ones, and kept the specification simpler, with the +license files always being stored in the same location relative to the core +metadata regardless of distribution type. + +However, this leads to a more cluttered ``.dist-info`` directory, littered +with arbitrary license files and subdirectories, as opposed to separating +licenses into their own namespace (which per the Zen of Python, PEP 20, are +"one honking great idea"). 
While currently small, there is still a +risk of collision with specific custom license filenames +(e.g. ``RECORD``, ``METADATA``) in the ``.dist-info`` directory, which +would only increase if and when additional files were specified here, and +would require carefully limiting the potential filenames used to avoid +likely conflicts with those of license-related files. Finally, +putting licenses into their own specified subdirectory would allow +humans and tools to quickly, easily and correctly list, copy and manipulate +all of them at once (such as in distro packaging, legal checks, etc.) +without having to reference each of their paths from the core metadata. + +Therefore, now is a prudent time to specify an alternate approach. +The simplest and most obvious solution, as suggested by several contributors +on the Wheel and Setuptools implementation issues, is to simply root the +license files relative to a ``license_files`` subdirectory of ``.dist-info``. +This is simple to implement and solves all the problems noted here, without +clear significant drawbacks relative to other more complex options. + +It does make the specification a bit more complex and less elegant, but +implementation should remain equally simple. It does mean that wheels +produced following this change will have differently-located licenses +than those prior, but as this was already true for those in subdirectories, +and until this PEP there was no way of discovering these files or +accessing them programmatically, this doesn't seem likely to pose +significant problems in practice. 
Given this will be much harder if not +impossible to change later, once the status quo is standardized, tools are +relying on the current behavior and there is much greater uptake of not +only simply including license files but potentially accessing them as well +using the core metadata, if we're going to change it, now would be the time +(particularly since we're already introducing an edge-case change with how +license files in subdirs are handled, along with other refinements). + +Therefore, the latter has been incorporated into current drafts of this PEP. + + +Add new ``licenses`` category to wheel +'''''''''''''''''''''''''''''''''''''' + +Instead of defining a root license directory (``license_files``) inside +the core metadata directory (``.dist-info``) for wheels, we could instead +define a new category (and, presumably, a corresponding install scheme), +similar to the others currently included under ``.data`` in the wheel archive, +specifically for license files, called (e.g.) ``licenses``. This was mentioned +by the wheel creator, and would allow installing licenses somewhere more +platform-appropriate and flexible than just the ``.dist-info`` directory +in the site path, and potentially be conceptually cleaner than including +them there. + +However, at present, this PEP does not implement this idea, and it is +deferred to a future one. It would add significant complexity and friction +to this PEP, being primarily concerned with standardizing existing practice +and updating the core metadata specification. Furthermore, doing so would +likely require modifying ``sysconfig`` and the install schemes specified +therein, alongside Wheel, Installer and other tools, which would be a +non-trivial undertaking. 
While potentially slightly more complex for +repackagers (such as those for Linux distributions), the current proposal still +ensures all license files are included, and in a single dedicated directory +(which can easily be copied or relocated downstream), and thus should still +greatly improve the status quo in this regard without the attendant complexity. + +In addition, this approach is not fully backwards compatible (since it +isn't transparent to tools that simply extract the wheel), is a greater +departure from existing practice and would lead to more inconsistent +license install locations from wheels of different versions. Finally, +this would mean licenses would not be installed as proximately to their +associated code, there would be more variability in the license root path +across platforms and between built distributions and installed projects, +accessing installed licenses programmatically would be more difficult, and a +suitable install location and method would need to be created, discussed +and decided that would avoid name clashes. + +Therefore, to keep this PEP in scope, the current approach was retained. + + +Name the subdirectory ``licenses`` +'''''''''''''''''''''''''''''''''' + +Both ``licenses`` and ``license_files`` have been suggested as potential +names for the root license directory inside ``.dist-info`` of wheels and +installed projects. The former is slightly shorter, but the latter is +more clear and unambiguous regarding its contents, and is consistent with +the name of the core metadata field (``License-File``) and the PEP 621 +project source metadata key (``license-files``). Therefore, the latter +was chosen instead. + + +Other ideas +----------- + +Miscellaneous proposals, possibilities and discussion points that were +ultimately not adopted. 
+ + +Map identifiers to license files +'''''''''''''''''''''''''''''''' + +This would require using a mapping (as two parallel lists would be too prone to +alignment errors), which would add extra complexity to how licenses +are documented and add an additional nesting level. + +A mapping would be needed, as it cannot be guaranteed that all expressions +(keys) have a single license file associated with them (e.g. +GPL with an exception may be in a single file) and that any expression +does not have more than one (e.g. an Apache license ``LICENSE`` and +its ``NOTICE`` file are two distinct files). +For most common cases, a single license expression and one or more license +files would be perfectly adequate. In the rarer and more complex cases where +there are many licenses involved, authors can still safely use the fields +specified here, just with a slight loss of clarity by not specifying which +text file(s) map to which license identifier (though this should be clear in +practice given each license identifier has corresponding SPDX-registered +full license text), while not forcing the more complex data model +(a mapping) on the large majority of users who do not need or want it. + +We could of course have a data field with multiple possible value types (it's a +string, it's a list, it's a mapping!) but this could be a source of confusion. +This is what has been done, for instance, in npm (historically) and in RubyGems +(still today), and as a result tools need to test the type of the metadata field +before using it in code, while users are confused about when to use a list or a +string. Therefore, this approach is rejected. + + +Map identifiers to source files +''''''''''''''''''''''''''''''' + +As discussed previously, file-level notices are out of scope for this PEP, +and the existing ``SPDX-License-Identifier`` `convention <#spdxid_>`_ can +already be used if this is needed without further specification here. 
+ + +Don't freeze compatibility with a specific SPDX version +''''''''''''''''''''''''''''''''''''''''''''''''''''''' + +This PEP could omit specifying a specific SPDX specification version, +or one for the list of valid license identifiers, which would allow +more flexible updates as the specification evolves without another +PEP or equivalent. + +However, serious concerns were expressed about a future SPDX update breaking +compatibility with existing expressions and identifiers, leaving current +packages with invalid metadata per the definition in this PEP. Requiring +compatibility with a specific version of these specifications here +and a PEP or similar process to update it avoids this contingency, +and follows the practice of other packaging ecosystems. + +Therefore, it was `decided <#spdxversion_>`_ to specify a minimum version +and require tools to be compatible with it, while still allowing updates +so long as they don't break backward compatibility. This enables +tools to immediately take advantage of improvements and accept new +licenses, but also remain backwards compatible with the version +specified here, balancing flexibility and compatibility. + + +Different licenses for source and binary distributions +'''''''''''''''''''''''''''''''''''''''''''''''''''''' + +As an additional use case, it was asked whether it was in scope for this +PEP to handle cases where the license expression for a binary distribution +(wheel) is different from that for a source distribution (sdist), such +as in cases of non-pure-Python packages that compile and bundle binaries +under different licenses than the project itself. An example cited was +`PyTorch <#pytorch_>`_, which contains CUDA from Nvidia, which is freely +distributable but not open source. `NumPy <#numpyissue_>`_ and +`SciPy <#scipyissue_>`_ also had similar issues, as reported by the +original author of this PEP and now resolved for those cases. 
+ +However, given the inherent complexity here and a lack of an obvious +mechanism to do so, the fact that each wheel would need its own license +information, lack of support on PyPI for exposing license info on a +per-distribution archive basis, and the relatively niche use case, it was +determined to be out of scope for this PEP, and left to a future PEP +to resolve if sufficient need and interest exists and an appropriate +mechanism can be found. + + +Open Issues +=========== + +Should the ``License`` field be back-filled, or mutually exclusive? +------------------------------------------------------------------- + +At present, this PEP explicitly allows, but does not formally recommend or +require, build tools to back-fill the ``License`` core metadata field with +the verbatim text from the ``License-Expression`` field. This would +presumably improve backwards compatibility and was suggested +by some on the Discourse thread. On the other hand, allowing it does +increase complexity and is less of a clean, consistent separation, +preventing the ``License`` field from being completely mutually exclusive +with the new ``License-Expression`` field and requiring that their values +match. + +As such, it would be very useful to have a more concrete and specific +rationale and use cases for the back-filled data, and give fuller +consideration to any potential benefits or drawbacks of this approach, +in order to come to a final consensus on this matter that can be appropriately +justified here. + +Therefore, is the status quo expressed here acceptable, allowing tools +leeway to decide this for themselves? Should this PEP formally recommend, +or even require, that tools back-fill this metadata (which would presumably +be reversed once a breaking revision of the metadata spec is issued)? +Or should this not be explicitly allowed, discouraged or even prohibited? + + +Should custom license identifiers be allowed? 
+--------------------------------------------- + +The current version of this PEP retains the behavior of only specifying +the use of SPDX-defined license identifiers, as well as the explicitly defined +custom identifiers ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` +to handle the two common cases where projects have a license, but it is not +one that has a recognized SPDX license identifier. + +For maximum flexibility, custom ``LicenseRef-`` license +identifiers could be allowed, which could potentially be useful for niche +cases or corporate environments where ``LicenseRef-Proprietary`` is not +appropriate or is insufficiently specific, but relying on mainstream Python +build tooling and the ``License-Expression`` metadata field is still +desirable. + +This has the downsides, however, of not catching misspellings of the +canonically defined license identifiers and thus producing license metadata +that is not a valid match for what the author intended, as well as users +potentially thinking they have to prepend ``LicenseRef`` to valid +license identifiers, a point on which there seems to have been some confusion. +Furthermore, this encourages the proliferation of bespoke license identifiers, +which defeats the purpose of enabling clear, unambiguous and well +understood license metadata for which this PEP was created. + +Indeed, projects in niche cases that need specific, proprietary custom licenses +could always simply specify ``LicenseRef-Proprietary``, and then +include the actual license files needed to unambiguously identify the license +regardless (if not using SPDX license identifiers) under the ``License-File`` +fields. Requiring standards-conforming tools to allow custom license +identifiers does not seem very useful, since standard tools will not recognize +bespoke ones or know how to treat them. 
By contrast, bespoke tools, which +would be required in any case to understand and act on custom identifiers, +are explicitly allowed, with good reason (thus the ``SHOULD`` keyword) +to not require that license identifiers conform to those listed here. +Therefore, this specification still allows such use in private corporate +environments or specific ecosystems, while avoiding the disadvantages of +imposing them on all mainstream packaging tools. + +As an alternative, a literal ``LicenseRef-Custom`` identifier could be +defined, which would more explicitly indicate that the license cannot be +expressed with defined identifiers and the license text should be referenced +for details, without carrying the negative and potentially inappropriate +implications of ``LicenseRef-Proprietary``. This would avoid the main +mentioned downsides (misspellings, confusion, license proliferation) of +the above approach of allowing an arbitrary ``LicenseRef``, while +addressing several of the potential theoretical scenarios cited for it. + +On the other hand, as SPDX aims to (and generally does) encompass all +FSF-recognized "Free" and OSI-approved "Open Source" licenses, +and those sources are kept closely in sync and are now relatively stable, +anything outside those bounds would generally be covered by +``LicenseRef-Proprietary``, thus making ``LicenseRef-Custom`` less specific +in that regard, and somewhat redundant with it. Furthermore, it may mislead +authors of projects with complex/multiple licenses into thinking that they +should use it over specifying a license expression. + +At present, the PEP retains the existing approach over either of these, given +the use cases and benefits were judged to be sufficiently marginal based +on the current understanding of the packaging landscape. 
For both these +proposals, however, if more concrete use cases emerge, this can certainly +be reconsidered, either for this current PEP or a future one (before or +in tandem with actually removing the legacy unstructured ``License`` +metadata field). Not defining this now enables allowing it later +(or still now, with custom packaging tools), without affecting backward +compatibility, while the same is not so if they are allowed now and later +determined to be unnecessary or too problematic in practice. + + +Appendix 1. License Expression Examples +======================================= + +Basic example +------------- + +The Setuptools project itself, as of `version 59.1.1 <#setuptools5911_>`_, +does not use the ``License`` field in its own project source metadata. +Further, it no longer explicitly specifies ``license_file``/``license_files`` +as it did previously, since Setuptools relies on its own automatic +inclusion of license-related files matching common patterns, +such as the ``LICENSE`` file it uses. + +It includes the following license-related metadata in its ``setup.cfg``:: + + [metadata] + classifiers = + License :: OSI Approved :: MIT License -We could of course have a data field with multiple possible value types (it’s a -string, it’s a list, it’s a mapping!) but this could be a source of confusion. -This is what has been done for instance in npm (historically) and in Rubygems -(still today) and as result you need to test the type of the metadata field -before using it in code and users are confused about when to use a list or a -string. +The simplest migration to this PEP would consist of using this instead:: + [metadata] + license_expression = MIT -3. Mapping licenses to specific source files and/or directories of source files - (or vice versa). 
+Or, in a PEP 621 ``pyproject.toml``:: -File-level notices are not considered as part of the scope of this PEP and the -existing ``SPDX-License-Identifier`` [#spdxids]_ convention can be used and -may not need further specification as a PEP. + [project] + license-expression = "MIT" +The output core metadata for the distribution packages would then be:: -Appendix 1. License Expression example -====================================== + License-Expression: MIT + License-File: LICENSE -The current version of ``setuptools`` metadata [#setuptools5030]_ does not use -the ``License`` field. It uses instead this license-related information in -``setup.cfg``:: +The ``LICENSE`` file would be stored at ``/setuptools-{version}/LICENSE`` +in the sdist and ``/setuptools-{version}.dist-info/license_files/LICENSE`` +in the wheel, and unpacked from there into the site directory (e.g. +``site-packages``) on installation; ``/`` is the root of the respective archive +and ``{version}`` the version of the Setuptools release in the core metadata. 
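The sdist and wheel locations described above follow a simple pattern, which the sketch below spells out. The helper name ``license_archive_paths`` is hypothetical, and the sketch assumes the project name is already in the form used for the archive directory names (no name normalization is attempted).

```python
from pathlib import PurePosixPath


def license_archive_paths(name, version, license_file):
    """Return (sdist_path, wheel_path) for one ``License-File`` value.

    ``license_file`` is the project-relative path as it appears in the
    core metadata; both results are relative to the archive root.
    """
    sdist_path = PurePosixPath(f"{name}-{version}") / license_file
    wheel_path = (
        PurePosixPath(f"{name}-{version}.dist-info")
        / "license_files"
        / license_file
    )
    return str(sdist_path), str(wheel_path)
```

Because the source-relative path is appended unchanged, license files in subdirectories keep their directory structure under ``license_files``.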
- license_file = LICENSE - classifiers = - License :: OSI Approved :: MIT License -The simplest migration to this PEP would consist of using this instead:: - - license = MIT - license_files = - LICENSE +Advanced example +---------------- -Another possibility would be to include the licenses of the third-party packages +Suppose Setuptools were to include the licenses of the third-party projects that are vendored in the ``setuptools/_vendor/`` and ``pkg_resources/_vendor`` -directories:: +directories; specifically:: - appdirs==1.4.3 - packaging==20.4 + packaging==21.2 pyparsing==2.2.1 ordered-set==3.1.1 + more_itertools==8.8.0 -These license expressions for these packages are:: +The license expressions for these projects are:: - appdirs: MIT packaging: Apache-2.0 OR BSD-2-Clause pyparsing: MIT ordered-set: MIT + more_itertools: MIT + +A comprehensive license expression covering both Setuptools +proper and its vendored dependencies would contain these metadata, +combining all the license expressions into one. Such an expression might be:: + + MIT AND (Apache-2.0 OR BSD-2-Clause) -Therefore, a comprehensive license expression covering both ``setuptools`` proper -and its vendored packages could contain these metadata, combining all the -license expressions in one expression:: +In addition, per the requirements of the licenses, the relevant license files +must be included in the package. Suppose the ``LICENSE`` file contains the text +of the MIT license and the copyrights used by Setuptools, ``pyparsing``, +``more_itertools`` and ``ordered-set``; and the ``LICENSE*`` files in the +``setuptools/_vendor/packaging/`` directory contain the Apache 2.0 and +2-clause BSD license text, and the Packaging copyright statement and +`license choice notice <#packaginglicense_>`_. 
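The combined expression ``MIT AND (Apache-2.0 OR BSD-2-Clause)`` given above can be derived mechanically from the per-project expressions. The following is only a naive string-based sketch: ``combine_license_expressions`` is a hypothetical helper, it does not parse or validate SPDX syntax, and it assumes each input is already a valid expression. The parentheses matter because ``AND`` binds more tightly than ``OR`` in SPDX license expressions.

```python
def combine_license_expressions(expressions):
    """Join per-project license expressions with AND, dropping duplicates."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(expressions))
    if len(unique) == 1:
        return unique[0]
    # Parenthesize OR-expressions so that AND does not change their meaning.
    return " AND ".join(
        f"({expr})" if " OR " in expr else expr for expr in unique
    )
```

For example, passing the expressions for Setuptools itself (``MIT``) followed by those of ``packaging``, ``pyparsing``, ``ordered-set`` and ``more_itertools`` collapses the repeated ``MIT`` entries into one.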
- license = MIT AND (Apache-2.0 OR BSD-2-Clause) +Specifically, we assume the license files are located at the following +paths in the project source tree (relative to the project root and +``pyproject.toml``):: + + LICENSE + setuptools/_vendor/packaging/LICENSE + setuptools/_vendor/packaging/LICENSE.APACHE + setuptools/_vendor/packaging/LICENSE.BSD + +Putting it all together, our ``setup.cfg`` would be:: + + [metadata] + license_expression = MIT AND (Apache-2.0 OR BSD-2-Clause) license_files = - LICENSE.MIT - LICENSE.packaging + LICENSE + setuptools/_vendor/packaging/LICENSE + setuptools/_vendor/packaging/LICENSE.APACHE + setuptools/_vendor/packaging/LICENSE.BSD -Here we would assume that the ``LICENSE.MIT`` file contains the text of the MIT -license and the copyrights used by ``setuptools``, ``appdirs``, ``pyparsing`` and -``ordered-set``, and that the ``LICENSE.packaging`` file contains the texts of the -Apache and BSD license, its copyrights and its license choice notice [#packlic]_. +In a PEP 621 ``pyproject.toml``, with license files specified explicitly +via the ``paths`` subkey, this would look like:: + [project] + license-expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" + license-files.paths = [ + "LICENSE", + "setuptools/_vendor/packaging/LICENSE", + "setuptools/_vendor/packaging/LICENSE.APACHE", + "setuptools/_vendor/packaging/LICENSE.BSD", + ] -Appendix 2. 
Surveying how we document licenses today in Python -============================================================== +Or alternatively, matched via glob patterns, this could be:: -There are multiple ways used or recommended to document Python package -licenses today: + [project] + license-expression = "MIT AND (Apache-2.0 OR BSD-2-Clause)" + license-files.globs = [ + "LICENSE*", + "setuptools/_vendor/packaging/LICENSE*", + ] +With either approach, the output core metadata in the distribution +would be:: -In Core metadata ----------------- + License-Expression: MIT AND (Apache-2.0 OR BSD-2-Clause) + License-File: LICENSE + License-File: setuptools/_vendor/packaging/LICENSE + License-File: setuptools/_vendor/packaging/LICENSE.APACHE + License-File: setuptools/_vendor/packaging/LICENSE.BSD + +In the resulting sdist, with ``/`` as the root of the archive and ``{version}`` +the version of the Setuptools release specified in the core metadata, +the license files would be located at the paths:: + + /setuptools-{version}/LICENSE + /setuptools-{version}/setuptools/_vendor/packaging/LICENSE + /setuptools-{version}/setuptools/_vendor/packaging/LICENSE.APACHE + /setuptools-{version}/setuptools/_vendor/packaging/LICENSE.BSD + +In the built wheel, with ``/`` being the root of the archive and +``{version}`` as above, the license files would be stored at:: + + /setuptools-{version}.dist-info/license_files/LICENSE + /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE + /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.APACHE + /setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.BSD + +Finally, in the installed project, with ``site-packages`` being the site dir +and ``{version}`` as above, the license files would be installed to:: + + site-packages/setuptools-{version}.dist-info/license_files/LICENSE + 
site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE + site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.APACHE + site-packages/setuptools-{version}.dist-info/license_files/setuptools/_vendor/packaging/LICENSE.BSD + + +Conversion example +------------------ + +Suppose we were to return to our simple Setuptools case. +Per the specification, given it only has the following license classifier:: + + Classifier: License :: OSI Approved :: MIT License + +And no value for the ``License`` field, or equivalently, if it had a +value of:: + + License: MIT + +Then the suggested value for the ``License-Expression`` field would be:: + + License-Expression: MIT + +For the more complex case, assuming it was currently expressed as multiple +license classifiers, no automatic conversion could be performed due to the +inherent ambiguity, and the user would be prompted on how to handle the +situation themselves. + + +Expression examples +------------------- + +Some additional examples of valid ``License-Expression`` values:: + + License-Expression: MIT + + License-Expression: BSD-3-Clause + + License-Expression: MIT AND (Apache-2.0 OR BSD-2-Clause) + + License-Expression: MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) + + License-Expression: GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause + + License-Expression: LicenseRef-Public-Domain OR CC0-1.0 OR Unlicense + + License-Expression: LicenseRef-Proprietary + + +Appendix 2. License Documentation in Python +=========================================== + +There are multiple ways used or recommended to document Python project +licenses today. The most common are listed below. + + +Core metadata +------------- There are two overlapping core metadata fields to document a license: the -license-related ``Classifier`` strings [#classif]_ prefixed with ``License ::`` and -the ``License`` field as free text [#licfield]_. 
+license ``Classifier`` `strings <#classifiers_>`_ prefixed with ``License ::`` +and the ``License`` `field <#licensefield_>`_ as free text. +The core metadata ``License`` field documentation is currently:: -The core metadata documentation ``License`` field documentation is currently:: + License + ======= - License (optional) - :::::::::::::::::: + .. versionadded:: 1.0 Text indicating the license covering the distribution where the license is not a selection from the "License" Trove classifiers. See - "Classifier" below. This field may also be used to specify a + :ref:`"Classifier" ` below. + This field may also be used to specify a particular version of a license which is named via the ``Classifier`` field, or to indicate a variation or exception to such a license. @@ -550,339 +2329,375 @@ The core metadata documentation ``License`` field documentation is currently:: License: GPL version 3, excluding DRM provisions Even though there are two fields, it is at times difficult to convey anything -but simpler licensing. For instance some classifiers lack accuracy (GPL -without a version) and when you have multiple License-related classifiers it is -not clear if this is a choice or all these apply and which ones. Furthermore, -the list of available license-related classifiers is often out-of-date. +but simpler licensing. For instance, some classifiers lack precision +(GPL without a version) and when multiple license classifiers are +listed, it is not clear if both licenses must apply, or the user may choose +between them. Furthermore, the list of available license classifiers +is rather limited and out-of-date. -In the PyPA ``sampleproject`` ------------------------------ +Setuptools and Wheel +-------------------- -The latest PyPA ``sampleproject`` recommends only to use classifiers in -``setup.py`` and does not list the ``license`` field in its example -``setup.py`` [#samplesetup]_. 
+Beyond a license code or qualifier, license text files are documented and +included in a built package either implicitly or explicitly, +and this is another possible source of confusion: + +- In the `Setuptools <#setuptoolssdist_>`_ and `Wheel <#wheels_>`_ projects, + license files are automatically added to the distribution (at their source + location in a source distribution/sdist, and in the ``.dist-info`` + directory of a built wheel) if they match one of a number of common license + file name patterns (``LICEN[CS]E*``, ``COPYING*``, ``NOTICE*`` and + ``AUTHORS*``). Alternatively, a package author can specify a list of license + file paths to include in the built wheel under the ``license_files`` key in + the ``[metadata]`` section of the project's ``setup.cfg``, or as an argument + to the ``setuptools.setup()`` function. At present, following the Wheel + project's lead, Setuptools flattens the collected license files into the + metadata directory, clobbering files with the same name, and dumps license + files directly into the top-level ``.dist-info`` directory, but there is a + `desire to resolve both these issues <#setuptoolsfiles_>`_, + contingent on this PEP being accepted. + +- Both tools also support an older, singular ``license_file`` parameter that + allows specifying only one license file to add to the distribution, which + has been deprecated for some time but still sees `some use <#pipsetup_>`_. + +- Following the publication of an earlier draft of this PEP, Setuptools + `added support <#setuptoolspep639_>`_ for ``License-File`` in distribution + metadata as described in this specification. This allows other tools + consuming the resulting metadata to unambiguously locate the license file(s) + for a given package. 
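The automatic matching and the flattening problem described above can be sketched in a few lines of Python. This is an illustration of the behavior as described in the text, not the actual Setuptools or Wheel code; the function names ``find_license_files`` and ``flatten`` and the sample file names are hypothetical.

```python
import posixpath
from fnmatch import fnmatchcase

# Default name patterns described above for Setuptools and Wheel.
DEFAULT_PATTERNS = ("LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*")


def find_license_files(filenames, patterns=DEFAULT_PATTERNS):
    """Return the file names that match any license file pattern."""
    return [
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


def flatten(paths):
    """Map each base name to its source path, as a flat directory would.

    Later paths silently replace earlier ones with the same base name,
    which is the clobbering issue mentioned above.
    """
    return {posixpath.basename(path): path for path in paths}


top_level = ["LICENSE", "LICENCE.txt", "COPYING.LESSER", "README.rst", "AUTHORS.md"]
matched = find_license_files(top_level)
print(matched)

flat = flatten(["LICENSE", "setuptools/_vendor/packaging/LICENSE"])
print(flat)
```

Preserving the source-relative path under a dedicated subdirectory, as in the wheel layout shown in the examples, avoids the collision because no two source files share a full path.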
+ + +PyPA Packaging Guide and Sample Project +--------------------------------------- +Both the `PyPA beginner packaging tutorial <#packagingtuttxt_>`_ and its more +comprehensive `packaging guide <#packagingguidetxt_>`_ state that it is +important that every package include a license file. They point to the +``LICENSE.txt`` in the official PyPA sample project as an example, which is +`explicitly listed <#samplesetupcfg_>`_ under the ``license_files`` key in +its ``setup.cfg``, following existing practice formally specified by this PEP. -The License Files in wheels and setuptools ------------------------------------------- +Both the `beginner packaging tutorial <#packagingtutkey_>`_ and the +`sample project <#samplesetuppy_>`_ only use classifiers to declare a +package's license, and do not include or mention the ``License`` field. +The `full packaging guide <#licensefield_>`_ does mention this field, but +states that authors should use the license classifiers instead, unless the +project uses a non-standard license (which the guide discourages). -Beyond a license code or qualifier, license text files are documented and -included in a built package either implicitly or explicitly and this is another -possible source of confusion: - -- In wheels [#wheels]_ license files are automatically added to the ``.dist-info`` - directory if they match one of a few common license file name patterns (such - as LICENSE*, COPYING*). Alternatively a package author can specify a list of - license file paths to include in the built wheel using in the - ``license_files`` field in the ``[metadata]`` section of the project's - ``setup.cfg``. Previously this was a (singular) ``license_file`` file attribute - that is now deprecated but is still in common use. See [#pipsetup]_ for - instance. - -- In ``setuptools`` [#setuptoolssdist]_, a ``license_file`` attribute is used to add - a single license file to a source distribution. 
This singular version is - still honored by ``wheels`` for backward compatibility. - -- Using a LICENSE.txt file is encouraged in the packaging guide [#packaging]_ - paired with a ``MANIFEST.in`` entry to ensure that the license file is included - in a built source distribution (sdist). - -Note: the License-File field proposed in this PEP already exists in ``wheel`` and -``setuptools`` with the same behaviour as explained above. This PEP is only -recognizing and documenting the existing practice as used in ``wheel`` (with the -``license_file`` and ``license_files`` ``setup.cfg`` ``[metadata]`` entries) and in -``setuptools`` ``license_file`` ``setup()`` argument. - - -In Python code files --------------------- -(Note: Documenting licenses in source code is not in the scope of this PEP) +Python source code files +------------------------ -Beside using comments and/or ``SPDX-License-Identifier`` conventions, the license -is sometimes documented in Python code files using "dunder" variables typically -named after one of the lower cased Core Metadata fields such as ``__license__`` -[#pycode]_. +**Note:** Documenting licenses in source code is not in the scope of this PEP. -This convention (dunder global variables) is recognized by the built-in ``help()`` -function and the standard ``pydoc`` module. The dunder variable(s) will show up in -the ``help()`` DATA section for a module. +Besides using comments and/or ``SPDX-License-Identifier`` conventions, the +license is `sometimes <#pycode_>`_ documented in Python code files using +a "dunder" module-level constant, typically named ``__license__``. +This convention, while perhaps somewhat antiquated, is recognized by the +built-in ``help()`` function and the standard ``pydoc`` module. +The dunder variable will show up in the ``help()`` DATA section for a module. 
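A short Python sketch of this convention follows. It builds a throwaway module object (the name ``example_module`` is hypothetical), sets ``__license__`` on it, and renders the same plain-text documentation that ``help()`` displays, using ``pydoc.render_doc``.

```python
import pydoc
import types

# Build a throwaway module object; "example_module" is a made-up name.
module = types.ModuleType("example_module", "A module with license info.")
module.__license__ = "MIT"

# Render the plain-text documentation that help() would display.
text = pydoc.render_doc(module, renderer=pydoc.plaintext)
print(text)
```

The rendered text lists ``__license__`` under the DATA heading alongside any other module-level constants; nothing about the value is validated or interpreted.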
-In some other Python packaging tools ------------------------------------- -- Conda package manifest [#conda]_ has support for ``license`` and ``license_file`` - fields as well as a ``license_family`` license grouping field. +Other Python packaging tools +---------------------------- -- ``flit`` [#flit]_ recommends to use classifiers instead of License (as per the - current metadata spec). +- `Conda package manifests <#conda_>`_ have support for ``license`` and + ``license_file`` fields, and automatically include license files + following similar naming patterns as the Wheel and Setuptools projects. -- ``pbr`` [#pbr]_ uses similar data as setuptools but always stored setup.cfg. +- `Flit <#flit_>`_ recommends using classifiers instead of the ``License`` + field (per the current PyPA packaging guide). -- ``poetry`` [#poetry]_ specifies the use of the ``license`` field in +- `PBR <#pbr_>`_ uses similar data as Setuptools, but always stored in + ``setup.cfg``. + +- `Poetry <#poetry_>`_ specifies the use of the ``license`` field in ``pyproject.toml`` with SPDX license identifiers. -Appendix 3. Surveying how other package formats document licenses -================================================================= +Appendix 3. License Documentation in Other Projects +=================================================== Here is a survey of how things are done elsewhere. -License in Linux distribution packages ---------------------------------------- -Note: in most cases the license texts of the most common licenses are included -globally once in a shared documentation directory (e.g. /usr/share/doc). +Linux distribution packages +--------------------------- + +**Note:** in most cases, the texts of the most common licenses are included +globally in a shared documentation directory (e.g. ``/usr/share/doc``). -- Debian document package licenses with machine readable copyright files - [#dep5]_. 
This specification defines its own license expression syntax that is - very similar to the SDPX syntax and use its own list of license identifiers - for common licenses (also closely related to SPDX identifiers). +- Debian documents package licenses with + `machine readable copyright files <#dep5_>`_. + It defines its own license expression syntax and list of identifiers for + common licenses, both of which are closely related to those of SPDX. -- Fedora packages [#fedora]_ specify how to include ``License Texts`` - [#fedoratext]_ and how use a ``License`` field [#fedoralic]_ that must be filled - with an appropriate license Short License identifier(s) from an extensive list - of "Good Licenses" identifiers [#fedoralist]_. Fedora also defines its own - license expression syntax very similar to the SDPX syntax. +- `Fedora packages <#fedora_>`_ specify how to include + `License Texts <#fedoratext_>`_ and use a + `License field <#fedoralicense_>`_ that must be filled + with appropriate short license identifier(s) from an extensive list + of `"Good Licenses" <#fedoralist_>`_. Fedora also defines its own + license expression syntax, similar to that of SPDX. -- openSUSE packages [#opensuse]_ use SPDX license expressions with - SPDX license identifiers and a list of extra license identifiers - [#opensuselist]_. +- `openSUSE packages <#opensuse_>`_ use SPDX license expressions with + SPDX license IDs and a + `list of additional license identifiers <#opensuselist_>`_. -- Gentoo ebuild uses a ``LICENSE`` variable [#gentoo]_. This field is specified - in GLEP-0023 [#glep23]_ and in the Gentoo development manual [#gentoodev]_. - Gentoo also defines a license expression syntax and a list of allowed - licenses. The expression syntax is rather different from SPDX. +- `Gentoo ebuild <#gentoo_>`_ uses a ``LICENSE`` variable. + This field is specified in `GLEP-0023 <#glep23_>`_ and in the + `Gentoo development manual <#gentoodev_>`_. 
+ Gentoo also defines a list of allowed licenses and a license expression + syntax, which is rather different from SPDX. -- FreeBSD package Makefile [#freebsd]_ provides ``LICENSE`` and +- The `FreeBSD package Makefile <#freebsd_>`_ provides ``LICENSE`` and ``LICENSE_FILE`` fields with a list of custom license symbols. For - non-standard licenses, FreeBSD recommend to use ``LICENSE=UNKNOWN`` and add - ``LICENSE_NAME`` and ``LICENSE_TEXT`` fields, as well as sophisticated + non-standard licenses, FreeBSD recommends using ``LICENSE=UNKNOWN`` and + adding ``LICENSE_NAME`` and ``LICENSE_TEXT`` fields, as well as sophisticated ``LICENSE_PERMS`` to qualify the license permissions and ``LICENSE_GROUPS`` - to document a license grouping. The ``LICENSE_COMB`` allows to document more + to document a license grouping. The ``LICENSE_COMB`` allows documenting more than one license and how they apply together, forming a custom license expression syntax. FreeBSD also recommends the use of ``SPDX-License-Identifier`` in source code files. -- Archlinux PKGBUILD [#archinux]_ define its own license identifiers - [#archlinuxlist]_. The value ``'unknown'`` can be used if the license is not - defined. +- `Arch Linux PKGBUILD <#archinux_>`_ defines its + `own license identifiers <#archlinuxlist_>`_. + The value ``'unknown'`` can be used if the license is not defined. -- OpenWRT ipk packages [#openwrt]_ use the ``PKG_LICENSE`` and +- `OpenWRT ipk packages <#openwrt_>`_ use the ``PKG_LICENSE`` and ``PKG_LICENSE_FILES`` variables and recommend the use of SPDX License identifiers. -- NixOS uses SPDX identifiers [#nixos]_ and some extra license identifiers in - its license field. +- `NixOS uses SPDX identifiers <#nixos_>`_ and some extra license IDs + in its license field. -- GNU Guix (based on NixOS) has a single License field, uses its own license - symbols list [#guix]_ and specifies to use one license or a list of licenses - [#guixlic]_. 
+- GNU Guix (based on NixOS) has a single License field, uses its own + `license symbols list <#guix_>`_ and specifies how to use one license or a + `list of them <#guixlicense_>`_. -- Alpine Linux packages [#alpine]_ recommend using SPDX identifiers in the +- `Alpine Linux packages <#alpine_>`_ recommend using SPDX identifiers in the license field. -License in Language and Application packages --------------------------------------------- +Language and application packages +--------------------------------- -- In Java, Maven POM [#maven]_ defines a ``licenses`` XML tag with a list of license - items each with a name, URL, comments and "distribution" type. This is not - mandatory and the content of each field is not specified. +- In Java, `Maven POM <#maven_>`_ defines a ``licenses`` XML tag with a list + of licenses, each with a name, URL, comments and "distribution" type. + This is not mandatory, and the content of each field is not specified. -- JavaScript npm package.json [#npm]_ use a single license field with SPDX - license expression or the ``UNLICENSED`` id if no license is specified. - A license file can be referenced as an alternative using "SEE LICENSE IN - " in the single ``license`` field. +- The `JavaScript NPM package.json <#npm_>`_ uses a single license field with + an SPDX license expression, or the ``UNLICENSED`` ID if none is specified. + A license file can be referenced as an alternative using + ``SEE LICENSE IN `` in the single ``license`` field. -- Rubygems gemspec [#gem]_ specifies either a singular license string or a list - of license strings. The relationship between multiple licenses in a list is - not specified. They recommend using SPDX license identifiers. +- `Rubygems gemspec <#gem_>`_ specifies either a single license string or a + list of license strings. The relationship between multiple licenses in a + list is not specified. They recommend using SPDX license identifiers. 
-- CPAN Perl modules [#perl]_ use a single license field which is either a single - string or a list of strings. The relationship between the licenses in a list - is not specified. There is a list of custom license identifiers plus +- `CPAN Perl modules <#perl_>`_ use a single license field, which is either a + single string or a list of strings. The relationship between the licenses in + a list is not specified. There is a list of custom license identifiers plus these generic identifiers: ``open_source``, ``restricted``, ``unrestricted``, ``unknown``. -- Rust Cargo [#cargo]_ specifies the use of an SPDX license expression (v2.1) in - the ``license`` field. It also supports an alternative expression syntax using - slash-separated SPDX license identifiers. There is also a ``license_file`` - field. The crates.io package registry [#cratesio]_ requires that either - ``license`` or ``license_file`` fields are set when you upload a package. - -- PHP Composer composer.json [#composer]_ uses a ``license`` field with an SPDX - license id or "proprietary". The ``license`` field is either a single string - that can use something which resembles the SPDX license expression syntax with - "and" and "or" keywords; or is a list of strings if there is a choice of - licenses (aka. a "disjunctive" choice of license). - -- NuGet packages [#nuget]_ were using only a simple license URL and are now - specifying to use an SPDX License expression and/or the path to a license +- `Rust Cargo <#cargo_>`_ specifies the use of an SPDX license expression + (v2.1) in the ``license`` field. It also supports an alternative expression + syntax using slash-separated SPDX license identifiers, and there is also a + ``license_file`` field. The `crates.io package registry <#cratesio_>`_ + requires that either ``license`` or ``license_file`` fields are set when + uploading a package. + +- `PHP composer.json <#composer_>`_ uses a ``license`` field with + an SPDX license ID or ``proprietary``. 
The ``license`` field is either a + single string resembling the SPDX license expression syntax with + ``and`` and ``or`` keywords; or is a list of strings if there is a + (disjunctive) choice of licenses. + +- `NuGet packages <#nuget_>`_ previously used only a simple license URL, but + now specify using an SPDX license expression and/or the path to a license file within the package. The NuGet.org repository states that they only - accepts license expressions that are `approved by the Open Source Initiative - or the Free Software Foundation.` + accept license expressions that are "approved by the Open Source Initiative + or the Free Software Foundation." - Go language modules ``go.mod`` have no provision for any metadata beyond dependencies. Licensing information is left for code authors and other community package managers to document. -- Dart/Flutter spec [#flutter]_ recommends to use a single ``LICENSE`` file - that should contain all the license texts each separated by a line with 80 - hyphens. +- The `Dart/Flutter spec <#flutter_>`_ recommends using a single ``LICENSE`` + file that should contain all the license texts, each separated by a line + with 80 hyphens. -- JavaScript Bower [#bower]_ ``license`` field is either a single string or a list - of strings using either SPDX license identifiers, or a path or a URL to a - license file. +- The `JavaScript Bower <#bower_>`_ ``license`` field is either a single string + or list of strings using either SPDX license identifiers, or a path/URL + to a license file. -- Cocoapods podspec [#cocoapod]_ ``license`` field is either a single string or a - mapping with attributes of type, file and text keys. This is mandatory unless - there is a LICENSE or LICENCE file provided. +- The `Cocoapods podspec <#cocoapod_>`_ ``license`` field is either a single + string, or a mapping with ``type``, ``file`` and ``text`` keys. + This is mandatory unless there is a ``LICENSE``/``LICENCE`` file provided. 
-- Haskell Cabal [#cabal]_ accepts an SPDX license expression since version 2.2. - The version of the SPDX license list used is a function of the ``cabal`` version. - The specification also provides a mapping between pre-SPDX Legacy license - Identifiers and SPDX identifiers. Cabal also specifies a ``license-file(s)`` - field that lists license files that will be installed with the package. +- `Haskell Cabal <#cabal_>`_ accepts an SPDX license expression since + version 2.2. The version of the SPDX license list used is a function of + the Cabal version. The specification also provides a mapping between + legacy (pre-SPDX) and SPDX license identifiers. Cabal also specifies a + ``license-file(s)`` field that lists license files to be installed with + the package. -- Erlang/Elixir mix/hex package [#mix]_ specifies a ``licenses`` field as a - required list of license strings and recommends to use SPDX license +- `Erlang/Elixir mix/hex package <#mix_>`_ specifies a ``licenses`` field as a + required list of license strings, and recommends using SPDX license identifiers. -- D lang dub package [#dub]_ defines its own list of license identifiers and - its own license expression syntax and both are similar to the SPDX conventions. +- `D Language dub packages <#dub_>`_ define their own list of license + identifiers and license expression syntax, similar to the SPDX standard. -- R Package DESCRIPTION [#cran]_ defines its own sophisticated license - expression syntax and list of licenses identifiers. R has a unique way to - support specifiers for license versions such as ``LGPL (>= 2.0, < 3)`` in its - license expression syntax. +- The `R Package DESCRIPTION <#cran_>`_ defines its own sophisticated license + expression syntax and list of license identifiers. R has a unique way of + supporting specifiers for license versions (such as ``LGPL (>= 2.0, < 3)``) + in its license expression syntax. 
-Conventions used by other ecosystems ------------------------------------- +Other ecosystems +---------------- -- ``SPDX-License-Identifier`` [#spdxids]_ is a simple convention to document the - license inside a file. +- The ``SPDX-License-Identifier`` `header <#spdxid_>`_ is a simple + convention to document the license inside a file. -- The Free Software Foundation (FSF) promotes the use of SPDX license identifiers - for clarity in the GPL and other versioned free software licenses [#gnu]_ - [#fsf]_. +- The `Free Software Foundation (FSF) <#fsf_>`_ promotes the use of + SPDX license identifiers for clarity in the `GPL <#gnu_>`_ and other + versioned free software licenses. -- The Free Software Foundation Europe (FSFE) REUSE project [#reuse]_ promotes - using ``SPDX-License-Identifier``. +- The Free Software Foundation Europe (FSFE) `REUSE project <#reuse_>`_ + promotes using ``SPDX-License-Identifier``. -- The Linux kernel uses ``SPDX-License-Identifier`` and parts of the FSFE REUSE - conventions to document its licenses [#linux]_. +- The `Linux kernel <#linux_>`_ uses ``SPDX-License-Identifier`` + and parts of the FSFE REUSE conventions to document its licenses. -- U-Boot spearheaded using ``SPDX-License-Identifier`` in code and now follows the - Linux ways [#uboot]_. +- `U-Boot <#uboot_>`_ spearheaded using ``SPDX-License-Identifier`` in code + and now follows the Linux approach. -- The Apache Software Foundation projects use RDF DOAP [#apache]_ with a single - license field pointing to SPDX license identifiers. +- The Apache Software Foundation projects use `RDF DOAP <#apache_>`_ with + a single license field pointing to SPDX license identifiers. -- The Eclipse Foundation promotes using ``SPDX-license-Identifiers`` [#eclipse]_ +- The `Eclipse Foundation <#eclipse_>`_ promotes using + ``SPDX-License-Identifier``. -- The ClearlyDefined project [#cd]_ promotes using SPDX license identifiers and - expressions to improve license clarity. 
+- The `ClearlyDefined project <#clearlydefined_>`_ promotes using SPDX
+  license identifiers and expressions to improve license clarity.

-- The Android Open Source Project [#android]_ use ``MODULE_LICENSE_XXX`` empty
-  tag files where ``XXX`` is a license code such as BSD, APACHE, GPL, etc. And
-  side by side with this ``MODULE_LICENSE`` file there is a ``NOTICE`` file
-  that contains license and notices texts.
+- The `Android Open Source Project <#android_>`_ uses ``MODULE_LICENSE_XXX``
+  empty tag files, where ``XXX`` is a license code such as ``BSD``, ``APACHE``,
+  ``GPL``, etc. It also uses a ``NOTICE`` file that contains license and
+  notice texts.


 References
 ==========

-This document specifies version 2.2 of the metadata format.
-
-- Version 1.0 is specified in PEP 241.
-- Version 1.1 is specified in PEP 314.
-- Version 1.2 is specified in PEP 345.
-- Version 2.0, while not formally accepted, was specified in PEP 426.
-- Version 2.1 is specified in PEP 566.
-
-.. [#cms] https://packaging.python.org/specifications/core-metadata
-.. [#cdstats] https://clearlydefined.io/stats
-.. [#cd] https://clearlydefined.io
-.. [#osi] http://opensource.org
-.. [#classif] https://pypi.org/classifiers
-.. [#spdxlist] https://spdx.org/licenses
-.. [#spdx] https://spdx.org
-.. [#spdx22] https://spdx.github.io/spdx-spec/appendix-IV-SPDX-license-expressions/
-.. [#wheels] https://github.com/pypa/wheel/blob/b8b21a5720df98703716d3cd981d8886393228fa/docs/user_guide.rst#including-license-files-in-the-generated-wheel-file
-.. [#reuse] https://reuse.software/
-.. [#licexp] https://github.com/nexB/license-expression/
-.. [#spdxpy] https://github.com/spdx/tools-python/
-.. [#scancodetk] https://github.com/nexB/scancode-toolkit
-.. [#licfield] https://packaging.python.org/guides/distributing-packages-using-setuptools/?highlight=MANIFEST.in#license
-.. [#samplesetup] https://github.com/pypa/sampleproject/blob/52966defd6a61e97295b0bb82cd3474ac3e11c7a/setup.py#L98
-.. [#pipsetup] https://github.com/pypa/pip/blob/476606425a08c66b9c9d326994ff5cf3f770926a/setup.cfg#L40
-.. [#setuptoolssdist] https://github.com/pypa/setuptools/blob/97e8ad4f5ff7793729e9c8be38e0901e3ad8d09e/setuptools/command/sdist.py#L202
-.. [#packaging] https://packaging.python.org/guides/distributing-packages-using-setuptools/?highlight=MANIFEST.in#license-txt
-.. [#pycode] https://github.com/search?l=Python&q=%22__license__%22&type=Code
-.. [#setuptools5030] https://github.com/pypa/setuptools/blob/v50.3.0/setup.cfg#L17
-.. [#packlic] https://github.com/pypa/packaging/blob/19.1/LICENSE
-.. [#conda] https://docs.conda.io/projects/conda-build/en/latest/resources/define-metadata.html#about-section
-.. [#flit] https://github.com/takluyver/flit
-.. [#poetry] https://poetry.eustace.io/docs/pyproject/#license
-.. [#pbr] https://docs.openstack.org/pbr/latest/user/features.html
-.. [#dep5] https://dep-team.pages.debian.net/deps/dep5/
-.. [#fedora] https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/
-.. [#fedoratext] https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_license_text
-.. [#fedoralic] https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_valid_license_short_names
-.. [#fedoralist] https://fedoraproject.org/wiki/Licensing:Main?rd=Licensing#Good_Licenses
-.. [#opensuse] https://en.opensuse.org/openSUSE:Packaging_guidelines#Licensing
-.. [#opensuselist] https://docs.google.com/spreadsheets/d/14AdaJ6cmU0kvQ4ulq9pWpjdZL5tkR03exRSYJmPGdfs/pub
-.. [#gentoo] https://devmanual.gentoo.org/ebuild-writing/variables/index.html#license
-.. [#glep23] https://www.gentoo.org/glep/glep-0023.html
-.. [#gentoodev] https://devmanual.gentoo.org/general-concepts/licenses/index.html
-.. [#freebsd] https://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/licenses.html
-.. [#archinux] https://wiki.archlinux.org/index.php/PKGBUILD#license
-.. [#archlinuxlist] https://wiki.archlinux.org/index.php/PKGBUILD#license
-.. [#openwrt] https://openwrt.org/docs/guide-developer/packages#buildpackage_variables
-.. [#nixos] https://github.com/NixOS/nixpkgs/blob/master/lib/licenses.nix
-.. [#guix] http://git.savannah.gnu.org/cgit/guix.git/tree/guix/licenses.scm
-.. [#guixlic] https://guix.gnu.org/manual/en/html_node/package-Reference.html#index-license_002c-of-packages
-.. [#alpine] https://wiki.alpinelinux.org/wiki/Creating_an_Alpine_package#license
-.. [#maven] https://maven.apache.org/pom.html#Licenses
-.. [#npm] https://docs.npmjs.com/files/package.json#license
-.. [#gem] https://guides.rubygems.org/specification-reference/#license=
-.. [#perl] https://metacpan.org/pod/CPAN::Meta::Spec#license
-.. [#cargo] https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata
-.. [#cratesio] https://doc.rust-lang.org/cargo/reference/registries.html#publish
-.. [#composer] https://getcomposer.org/doc/04-schema.md#license
-.. [#nuget] https://docs.microsoft.com/en-us/nuget/reference/nuspec#licenseurl
-.. [#flutter] https://flutter.dev/docs/development/packages-and-plugins/developing-packages#adding-licenses-to-the-license-file
-.. [#bower] https://github.com/bower/spec/blob/master/json.md#license
-.. [#cocoapod] https://guides.cocoapods.org/syntax/podspec.html#license
-.. [#cabal] https://cabal.readthedocs.io/en/latest/developing-packages.html#pkg-field-license
-.. [#mix] https://hex.pm/docs/publish
-.. [#dub] https://dub.pm/package-format-json.html#licenses
-.. [#cran] https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Licensing
-.. [#spdxids] https://spdx.org/using-spdx-license-identifier
-.. [#gnu] https://www.gnu.org/licenses/identify-licenses-clearly.html
-.. [#fsf] https://www.fsf.org/blogs/rms/rms-article-for-claritys-sake-please-dont-say-licensed-under-gnu-gpl-2
-.. [#linux] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/license-rules.rst
-.. [#uboot] https://www.denx.de/wiki/U-Boot/Licensing
-.. [#apache] https://svn.apache.org/repos/asf/allura/doap_Allura.rdf
-.. [#eclipse] https://www.eclipse.org/legal/epl-2.0/faq.php
-.. [#android] https://github.com/aosp-mirror/platform_external_tcpdump/blob/master/MODULE_LICENSE_BSD
-.. [#cc0] https://creativecommons.org/publicdomain/zero/1.0/
-.. [#unlic] https://unlicense.org/
-
-
-Copyright
-=========
-
-This document is placed in the public domain or under the CC0-1.0-Universal
-license [#cc0]_, whichever is more permissive.
-
-
-Acknowledgements
-================
+.. _#alpine: https://wiki.alpinelinux.org/wiki/Creating_an_Alpine_package#license
+.. _#android: https://github.com/aosp-mirror/platform_external_tcpdump/blob/android-platform-12.0.0_r1/MODULE_LICENSE_BSD
+.. _#apache: https://svn.apache.org/repos/asf/allura/doap_Allura.rdf
+.. _#archinux: https://wiki.archlinux.org/title/PKGBUILD#license
+.. _#archlinuxlist: https://archlinux.org/packages/core/any/licenses/files/
+.. _#badclassifiers: https://github.com/pypa/trove-classifiers/issues/17#issuecomment-385027197
+.. _#bower: https://github.com/bower/spec/blob/b00c4403e22e3f6177c410ed3391b9259687e461/json.md#license
+.. _#cabal: https://cabal.readthedocs.io/en/3.6/cabal-package.html?highlight=license#pkg-field-license
+.. _#cargo: https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata
+.. _#cc0: https://creativecommons.org/publicdomain/zero/1.0/
+.. _#cdstats: https://clearlydefined.io/stats
+.. _#choosealicense: https://choosealicense.com/
+.. _#choosealicenselist: https://choosealicense.com/licenses/
+.. _#chooseamitlicense: https://choosealicense.com/licenses/mit/
+.. _#classifierissue: https://github.com/pypa/trove-classifiers/issues/17
+.. _#classifiers: https://pypi.org/classifiers
+.. _#classifiersrepo: https://github.com/pypa/trove-classifiers
+.. _#clearlydefined: https://clearlydefined.io
+.. _#cocoapod: https://guides.cocoapods.org/syntax/podspec.html#license
+.. _#composer: https://getcomposer.org/doc/04-schema.md#license
+.. _#conda: https://docs.conda.io/projects/conda-build/en/stable/resources/define-metadata.html#about-section
+.. _#coremetadataspec: https://packaging.python.org/specifications/core-metadata
+.. _#cran: https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Licensing
+.. _#cratesio: https://doc.rust-lang.org/cargo/reference/registries.html#publish
+.. _#dep5: https://dep-team.pages.debian.net/deps/dep5/
+.. _#dontchoosealicense: https://choosealicense.com/no-permission/
+.. _#dub: https://dub.pm/package-format-json.html#licenses
+.. _#eclipse: https://www.eclipse.org/legal/epl-2.0/faq.php
+.. _#fedora: https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/
+.. _#fedoralicense: https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_valid_license_short_names
+.. _#fedoralist: https://fedoraproject.org/wiki/Licensing:Main?rd=Licensing#Good_Licenses
+.. _#fedoratext: https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_license_text
+.. _#flit: https://flit.readthedocs.io/en/stable/pyproject_toml.html
+.. _#flutter: https://flutter.dev/docs/development/packages-and-plugins/developing-packages#adding-licenses-to-the-license-file
+.. _#freebsd: https://docs.freebsd.org/en/books/porters-handbook/makefiles/#licenses
+.. _#fsf: https://www.fsf.org/blogs/rms/rms-article-for-claritys-sake-please-dont-say-licensed-under-gnu-gpl-2
+.. _#gem: https://guides.rubygems.org/specification-reference/#license=
+.. _#gentoo: https://devmanual.gentoo.org/ebuild-writing/variables/index.html#license
+.. _#gentoodev: https://devmanual.gentoo.org/general-concepts/licenses/index.html
+.. _#glep23: https://www.gentoo.org/glep/glep-0023.html
+.. _#globmodule: https://docs.python.org/3/library/glob.html
+.. _#gnu: https://www.gnu.org/licenses/identify-licenses-clearly.html
+.. _#guix: https://git.savannah.gnu.org/cgit/guix.git/tree/guix/licenses.scm?h=v1.3.0
+.. _#guixlicense: https://guix.gnu.org/manual/en/html_node/package-Reference.html#index-license_002c-of-packages
+.. _#installedspec: https://packaging.python.org/specifications/recording-installed-packages/
+.. _#interopissue: https://github.com/pypa/interoperability-peps/issues/46
+.. _#licenseexplib: https://github.com/nexB/license-expression/
+.. _#licensefield: https://packaging.python.org/guides/distributing-packages-using-setuptools/#license
+.. _#linux: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/license-rules.rst
+.. _#maven: https://maven.apache.org/pom.html#Licenses
+.. _#mitlicense: https://opensource.org/licenses/MIT
+.. _#mix: https://hex.pm/docs/publish
+.. _#nixos: https://github.com/NixOS/nixpkgs/blob/21.05/lib/licenses.nix
+.. _#npm: https://docs.npmjs.com/cli/v8/configuring-npm/package-json#license
+.. _#nuget: https://docs.microsoft.com/en-us/nuget/reference/nuspec#licenseurl
+.. _#numpyissue: https://github.com/numpy/numpy/issues/8689
+.. _#opensuse: https://en.opensuse.org/openSUSE:Packaging_guidelines#Licensing
+.. _#opensuselist: https://docs.google.com/spreadsheets/d/14AdaJ6cmU0kvQ4ulq9pWpjdZL5tkR03exRSYJmPGdfs/pub
+.. _#openwrt: https://openwrt.org/docs/guide-developer/packages#buildpackage_variables
+.. _#osi: https://opensource.org
+.. _#packagingguidetxt: https://packaging.python.org/guides/distributing-packages-using-setuptools/#license-txt
+.. _#packagingissue: https://github.com/pypa/packaging-problems/issues/41
+.. _#packaginglicense: https://github.com/pypa/packaging/blob/21.2/LICENSE
+.. _#packagingtutkey: https://packaging.python.org/tutorials/packaging-projects/#configuring-metadata
+.. _#packagingtuttxt: https://packaging.python.org/tutorials/packaging-projects/#creating-a-license
+.. _#pbr: https://docs.openstack.org/pbr/latest/user/features.html
+.. _#pep621spec: https://packaging.python.org/specifications/declaring-project-metadata/
+.. _#pepissue: https://github.com/pombredanne/spdx-pypi-pep/issues/1
+.. _#perl: https://metacpan.org/pod/CPAN::Meta::Spec#license
+.. _#pipsetup: https://github.com/pypa/pip/blob/21.3.1/setup.cfg#L114
+.. _#poetry: https://python-poetry.org/docs/pyproject/#license
+.. _#pycode: https://github.com/search?l=Python&q=%22__license__%22&type=Code
+.. _#pypi: https://pypi.org/
+.. _#pypugglossary: https://packaging.python.org/glossary/
+.. _#pytorch: https://pypi.org/project/torch/
+.. _#reuse: https://reuse.software/
+.. _#reusediscussion: https://github.com/pombredanne/spdx-pypi-pep/issues/7
+.. _#samplesetupcfg: https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.cfg
+.. _#samplesetuppy: https://github.com/pypa/sampleproject/blob/3a836905fbd687af334db16b16c37cf51dcbc99c/setup.py#L98
+.. _#scancodetk: https://github.com/nexB/scancode-toolkit
+.. _#scipyissue: https://github.com/scipy/scipy/issues/7093
+.. _#sdistspec: https://packaging.python.org/specifications/source-distribution-format/
+.. _#setuptools5911: https://github.com/pypa/setuptools/blob/v59.1.1/setup.cfg
+.. _#setuptoolsfiles: https://github.com/pypa/setuptools/issues/2739
+.. _#setuptoolspep639: https://github.com/pypa/setuptools/pull/2645
+.. _#setuptoolssdist: https://github.com/pypa/setuptools/pull/1767
+.. _#spdx: https://spdx.dev/
+.. _#spdxid: https://spdx.dev/ids/
+.. _#spdxlist: https://spdx.org/licenses/
+.. _#spdxpression: https://spdx.github.io/spdx-spec/SPDX-license-expressions/
+.. _#spdxpy: https://github.com/spdx/tools-python/
+.. _#spdxtutorial: https://github.com/david-a-wheeler/spdx-tutorial
+.. _#spdxversion: https://github.com/pombredanne/spdx-pypi-pep/issues/6
+.. _#uboot: https://www.denx.de/wiki/U-Boot/Licensing
+.. _#unlicense: https://unlicense.org/
+.. _#wheelfiles: https://github.com/pypa/wheel/issues/138
+.. _#wheelproject: https://wheel.readthedocs.io/en/stable/
+.. _#wheels: https://github.com/pypa/wheel/blob/0.37.0/docs/user_guide.rst#including-license-files-in-the-generated-wheel-file
+.. _#wheelspec: https://packaging.python.org/specifications/binary-distribution-format/
+
+
+Acknowledgments
+===============

 - Nick Coghlan
 - Kevin P. Fleming
@@ -894,6 +2709,13 @@ Acknowledgements
 - Luis Villa


+Copyright
+=========
+
+This document is placed in the public domain or under the
+`CC0-1.0-Universal license <#cc0_>`_, whichever is more permissive.
+
+
 ..
    Local Variables: