Build backend: Revamp include/exclude
When building the source distribution, we always need to include `pyproject.toml` and the module. When building the wheel, we always include the module, but nothing else at the top level. Since we only allow a single module per wheel, there are no wheel-specific includes. This means we have source includes, source excludes, and wheel excludes, but no wheel includes: the wheel contents are defined by the module root, plus the metadata files and data directories, which are handled separately.

Extra source dist includes are currently unused (they can't end up in the wheel yet), but it makes sense to model them here: they will be needed for any sort of procedural build step.

This results in the following fields being relevant for inclusion and exclusion:

* project.readme: PEP 621
* project.license-files: PEP 639
* module_root: `Path`
* source_include: `Vec<Glob>`
* source_exclude: `Vec<Glob>`
* wheel_exclude: `Vec<Glob>`
* data: `Map<KnownDataName, PathBuf>`

An opinionated choice is that the wheel excludes always contain the source excludes. Otherwise, you could have a path A in the source tree that gets included when building the wheel directly from the source tree, but not when going through the source dist as an intermediary, because A is in the source excludes but not in the wheel excludes. This has been a source of errors previously.
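
A minimal sketch of that rule, assuming the excludes are plain pattern strings. The helper name `effective_wheel_excludes` is hypothetical and only illustrates the idea; it is not taken from the uv code base:

```rust
/// Hypothetical helper (not uv's API): the excludes that apply to a wheel build
/// are the source excludes plus the wheel-specific excludes.
fn effective_wheel_excludes(source_exclude: &[String], wheel_exclude: &[String]) -> Vec<String> {
    // Start from the source excludes, so that anything dropped from the source
    // distribution is also dropped when building a wheel from the source tree.
    let mut excludes = source_exclude.to_vec();
    for pattern in wheel_exclude {
        // Skip patterns that are already present, e.g. shared defaults.
        if !excludes.contains(pattern) {
            excludes.push(pattern.clone());
        }
    }
    excludes
}
```

With this, source tree -> wheel and source tree -> source distribution -> wheel filter out the same paths, because the direct wheel build can never see fewer excludes than the source distribution step did.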

In the process, I fixed a bug where we would skip directories and only include the files, and where we were missing license files due to absolute globs.
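
To illustrate why directories matter during a directory walk, here is a simplified sketch. It is not the `GlobDirFilter` implementation; `literal_prefix` and `may_contain_match` are hypothetical names, and the check is a conservative approximation that only looks at the leading path components without glob metacharacters:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: the leading path components of a glob that contain no
/// glob metacharacters, e.g. `assets` for `assets/**/sample.csv`.
fn literal_prefix(pattern: &str) -> PathBuf {
    pattern
        .split('/')
        .take_while(|part| !part.contains(|c: char| matches!(c, '*' | '?' | '[')))
        .collect()
}

/// Whether walking into `dir` could still lead to a file matching `pattern`.
/// This over-approximates: it may return `true` for directories without matches,
/// but it never rules out a directory that contains one.
fn may_contain_match(dir: &Path, pattern: &str) -> bool {
    let prefix = literal_prefix(pattern);
    // Either `dir` is on the way to the literal prefix, or it is already below it.
    prefix.starts_with(dir) || dir.starts_with(&prefix)
}
```

A directory such as `assets` does not itself match `assets/**/sample.csv`, but it has to be kept so that the files below it can be reached.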
konstin committed Nov 29, 2024
1 parent b9b37a9 commit c777459
Showing 11 changed files with 272 additions and 207 deletions.
308 changes: 146 additions & 162 deletions crates/uv-build-backend/src/lib.rs

Large diffs are not rendered by default.

131 changes: 97 additions & 34 deletions crates/uv-build-backend/src/metadata.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
use crate::Error;
use globset::Glob;
use itertools::Itertools;
use serde::Deserialize;
use std::collections::{BTreeMap, Bound};
@@ -17,6 +16,9 @@ use uv_warnings::warn_user_once;
use version_ranges::Ranges;
use walkdir::WalkDir;

/// By default, we ignore generated Python files.
const DEFAULT_EXCLUDES: &[&str] = &["__pycache__", "*.pyc", "*.pyo"];

#[derive(Debug, Error)]
pub enum ValidationError {
/// The spec isn't clear about what the values in that field would be, and we only support the
@@ -86,8 +88,8 @@ impl PyProjectToml {
self.project.license_files.as_deref()
}

pub(crate) fn wheel_settings(&self) -> Option<&WheelSettings> {
self.tool.as_ref()?.uv.as_ref()?.wheel.as_ref()
pub(crate) fn settings(&self) -> Option<&BuildBackendSettings> {
self.tool.as_ref()?.uv.as_ref()?.build_backend.as_ref()
}

/// Warn if the `[build-system]` table looks suspicious.
@@ -335,23 +337,12 @@ impl PyProjectToml {
field: license_glob.to_string(),
source: err,
})?;
let absolute_glob = PathBuf::from(globset::escape(
root.simplified().to_string_lossy().as_ref(),
))
.join(pep639_glob.to_string())
.to_string_lossy()
.to_string();
license_globs_parsed.push(Glob::new(&absolute_glob).map_err(|err| {
Error::GlobSet {
field: "project.license-files".to_string(),
err,
}
})?);
license_globs_parsed.push(pep639_glob);
}
let license_globs =
GlobDirFilter::from_globs(&license_globs_parsed).map_err(|err| {
Error::GlobSetTooLarge {
field: "tool.uv.source-dist.include".to_string(),
field: "tool.uv.build-backend.source-include".to_string(),
source: err,
}
})?;
@@ -365,7 +356,7 @@ impl PyProjectToml {
)
}) {
let entry = entry.map_err(|err| Error::WalkDir {
root: PathBuf::from("."),
root: root.to_path_buf(),
err,
})?;
let relative = entry
@@ -376,13 +367,18 @@ impl PyProjectToml {
trace!("Not a license files match: `{}`", relative.user_display());
continue;
}
if !entry.file_type().is_file() {
trace!(
"Not a file in license files match: `{}`",
relative.user_display()
);
continue;
}

debug!("License files match: `{}`", relative.user_display());
let license_file = relative.to_string_lossy().to_string();

if !license_files.contains(&license_file) {
license_files.push(license_file);
}
license_files.push(license_file);
}

// The glob order may be unstable
@@ -707,33 +703,100 @@ pub(crate) struct Tool {
#[derive(Deserialize, Debug, Clone)]
#[serde(rename_all = "kebab-case")]
pub(crate) struct ToolUv {
/// Configuration for building source dists with the uv build backend
#[allow(dead_code)]
source_dist: Option<serde::de::IgnoredAny>,
/// Configuration for building wheels with the uv build backend
wheel: Option<WheelSettings>,
/// Configuration for building source distributions and wheels with the uv build backend
build_backend: Option<BuildBackendSettings>,
}

/// The `tool.uv.wheel` section with wheel build configuration.
/// To select which files to include in the source distribution, we first add the includes, then
/// remove the excludes from that.
///
/// When building the source distribution, the following files and directories are included:
/// * `pyproject.toml`
/// * The module under `tool.uv.build-backend.module-root`, by default
/// `src/<project_name_with_underscores>/**`.
/// * `project.license-files` and `project.readme`.
/// * All directories under `tool.uv.build-backend.data`.
/// * All patterns from `tool.uv.build-backend.source-include`.
///
/// From these, we remove the `tool.uv.build-backend.source-exclude` matches.
///
/// When building the wheel, the following files and directories are included:
/// * The module under `tool.uv.build-backend.module-root`, by default
/// `src/<project_name_with_underscores>/**`.
/// * `project.license-files` and `project.readme`, as part of the project metadata.
/// * Each directory under `tool.uv.build-backend.data`, as data directories.
///
/// From these, we remove the `tool.uv.build-backend.source-exclude` and
/// `tool.uv.build-backend.wheel-exclude` matches. The source dist excludes are applied to avoid
/// source tree -> wheel source including more files than
/// source tree -> source distribution -> wheel.
///
/// There are no specific wheel includes. There must only be one top level module, and all data
/// files must either be under the module root or in a data directory. Most packages store small
/// data in the module root alongside the source code.
#[derive(Deserialize, Debug, Clone)]
#[serde(rename_all = "kebab-case")]
pub(crate) struct WheelSettings {
#[serde(default, rename_all = "kebab-case")]
pub(crate) struct BuildBackendSettings {
/// The directory that contains the module directory, usually `src`, or an empty path when
/// using the flat layout over the src layout.
pub(crate) module_root: Option<PathBuf>,
pub(crate) module_root: PathBuf,

/// Glob expressions for files and directories to additionally include in the source
/// distribution.
///
/// `pyproject.toml` and the contents of the module directory are always included.
///
/// Includes are anchored, which means that `pyproject.toml` includes only
/// `<project root>/pyproject.toml`. Use for example `assets/**/sample.csv` to include all
/// `sample.csv` files in `<project root>/assets` or any child directory. To recursively include
/// all files under a directory, use a `/**` suffix, e.g. `src/**`. For performance and
/// reproducibility, avoid unanchored matches such as `**/sample.csv`.
///
/// The glob syntax is the reduced portable glob from
/// [PEP 639](https://peps.python.org/pep-0639/#add-license-FILES-key).
pub(crate) source_include: Vec<String>,

/// Glob expressions which files and directories to exclude from the previous source
/// distribution includes.
/// Glob expressions for files and directories to exclude from the source distribution.
///
/// Default: `__pycache__`, `*.pyc`, and `*.pyo`.
///
/// Excludes are not anchored, which means that `__pycache__` excludes all directories named
/// `__pycache__` and its children anywhere. To anchor a directory, use a `/` prefix, e.g.,
/// `/dist` will exclude only `<project root>/dist`.
///
/// The glob syntax is the reduced portable glob from
/// [PEP 639](https://peps.python.org/pep-0639/#add-license-FILES-key).
pub(crate) source_exclude: Vec<String>,

/// Glob expressions for files and directories to exclude from the wheel.
///
/// Default: `__pycache__`, `*.pyc`, and `*.pyo`.
///
/// Excludes are not anchored, which means that `__pycache__` excludes all directories named
/// `__pycache__` and its children anywhere. To anchor a directory, use a `/` prefix, e.g.,
/// `/dist` will exclude only `<project root>/dist`.
///
/// The glob syntax is the reduced portable glob from
/// [PEP 639](https://peps.python.org/pep-0639/#add-license-FILES-key).
pub(crate) exclude: Option<Vec<String>>,
pub(crate) wheel_exclude: Vec<String>,

/// Data includes for wheels.
pub(crate) data: Option<WheelDataIncludes>,
///
/// The directories included here are also included in the source distribution. They are copied
/// to the right wheel subdirectory on build.
pub(crate) data: WheelDataIncludes,
}

impl Default for BuildBackendSettings {
fn default() -> Self {
Self {
module_root: PathBuf::from("src"),
source_include: Vec::new(),
source_exclude: DEFAULT_EXCLUDES.iter().map(ToString::to_string).collect(),
wheel_exclude: DEFAULT_EXCLUDES.iter().map(ToString::to_string).collect(),
data: WheelDataIncludes::default(),
}
}
}

/// Data includes for wheels.
@@ -754,7 +817,7 @@ pub(crate) struct WheelSettings {
/// uses these two options.
#[derive(Default, Deserialize, Debug, Clone)]
// `deny_unknown_fields` to catch typos such as `header` vs `headers`.
#[serde(rename_all = "kebab-case", deny_unknown_fields)]
#[serde(default, rename_all = "kebab-case", deny_unknown_fields)]
pub(crate) struct WheelDataIncludes {
purelib: Option<String>,
platlib: Option<String>,
11 changes: 10 additions & 1 deletion crates/uv-globfilter/src/glob_dir_filter.rs
@@ -74,8 +74,10 @@ impl GlobDirFilter {
}

/// Whether the path (file or directory) matches any of the globs.
///
/// We include a directory if we are potentially including files it contains.
pub fn match_path(&self, path: &Path) -> bool {
self.glob_set.is_match(path)
self.match_directory(path) || self.glob_set.is_match(path)
}

/// Check whether a directory or any of its children can be matched by any of the globs.
@@ -261,9 +263,16 @@ mod tests {
assert_eq!(
matches,
[
"",
"path1",
"path1/dir1",
"path2",
"path2/dir2",
"path3",
"path3/dir3",
"path3/dir3/subdir",
"path3/dir3/subdir/a.txt",
"path4",
"path4/dir4",
"path4/dir4/subdir",
"path4/dir4/subdir/a.txt",
7 changes: 2 additions & 5 deletions crates/uv-settings/src/settings.rs
@@ -1646,9 +1646,7 @@ pub struct OptionsWire {

// Build backend
#[allow(dead_code)]
source_dist: Option<serde::de::IgnoredAny>,
#[allow(dead_code)]
wheel: Option<serde::de::IgnoredAny>,
build_backend: Option<serde::de::IgnoredAny>,
}

impl From<OptionsWire> for Options {
@@ -1707,8 +1705,7 @@ impl From<OptionsWire> for Options {
managed,
package,
// Used by the build backend
source_dist: _,
wheel: _,
build_backend: _,
} = value;

Self {
2 changes: 0 additions & 2 deletions crates/uv/src/commands/build_backend.rs
@@ -5,14 +5,12 @@ use anyhow::{Context, Result};
use std::env;
use std::io::Write;
use std::path::Path;
use uv_build_backend::SourceDistSettings;

/// PEP 517 hook to build a source distribution.
pub(crate) fn build_sdist(sdist_directory: &Path) -> Result<ExitStatus> {
let filename = uv_build_backend::build_source_dist(
&env::current_dir()?,
sdist_directory,
SourceDistSettings::default(),
uv_version::version(),
)?;
// Tell the build frontend about the name of the artifact we built
2 changes: 1 addition & 1 deletion crates/uv/tests/it/pip_install.rs
@@ -191,7 +191,7 @@ fn invalid_pyproject_toml_option_unknown_field() -> Result<()> {
|
2 | unknown = "field"
| ^^^^^^^
unknown field `unknown`, expected one of `native-tls`, `offline`, `no-cache`, `cache-dir`, `preview`, `python-preference`, `python-downloads`, `concurrent-downloads`, `concurrent-builds`, `concurrent-installs`, `index`, `index-url`, `extra-index-url`, `no-index`, `find-links`, `index-strategy`, `keyring-provider`, `allow-insecure-host`, `resolution`, `prerelease`, `dependency-metadata`, `config-settings`, `no-build-isolation`, `no-build-isolation-package`, `exclude-newer`, `link-mode`, `compile-bytecode`, `no-sources`, `upgrade`, `upgrade-package`, `reinstall`, `reinstall-package`, `no-build`, `no-build-package`, `no-binary`, `no-binary-package`, `python-install-mirror`, `pypy-install-mirror`, `publish-url`, `trusted-publishing`, `pip`, `cache-keys`, `override-dependencies`, `constraint-dependencies`, `environments`, `conflicts`, `workspace`, `sources`, `managed`, `package`, `default-groups`, `dev-dependencies`, `source-dist`, `wheel`
unknown field `unknown`, expected one of `native-tls`, `offline`, `no-cache`, `cache-dir`, `preview`, `python-preference`, `python-downloads`, `concurrent-downloads`, `concurrent-builds`, `concurrent-installs`, `index`, `index-url`, `extra-index-url`, `no-index`, `find-links`, `index-strategy`, `keyring-provider`, `allow-insecure-host`, `resolution`, `prerelease`, `dependency-metadata`, `config-settings`, `no-build-isolation`, `no-build-isolation-package`, `exclude-newer`, `link-mode`, `compile-bytecode`, `no-sources`, `upgrade`, `upgrade-package`, `reinstall`, `reinstall-package`, `no-build`, `no-build-package`, `no-binary`, `no-binary-package`, `python-install-mirror`, `pypy-install-mirror`, `publish-url`, `trusted-publishing`, `pip`, `cache-keys`, `override-dependencies`, `constraint-dependencies`, `environments`, `conflicts`, `workspace`, `sources`, `managed`, `package`, `default-groups`, `dev-dependencies`, `build-backend`
Resolved in [TIME]
Audited in [TIME]
2 changes: 1 addition & 1 deletion crates/uv/tests/it/show_settings.rs
@@ -3443,7 +3443,7 @@ fn resolve_config_file() -> anyhow::Result<()> {
|
1 | [project]
| ^^^^^^^
unknown field `project`, expected one of `native-tls`, `offline`, `no-cache`, `cache-dir`, `preview`, `python-preference`, `python-downloads`, `concurrent-downloads`, `concurrent-builds`, `concurrent-installs`, `index`, `index-url`, `extra-index-url`, `no-index`, `find-links`, `index-strategy`, `keyring-provider`, `allow-insecure-host`, `resolution`, `prerelease`, `dependency-metadata`, `config-settings`, `no-build-isolation`, `no-build-isolation-package`, `exclude-newer`, `link-mode`, `compile-bytecode`, `no-sources`, `upgrade`, `upgrade-package`, `reinstall`, `reinstall-package`, `no-build`, `no-build-package`, `no-binary`, `no-binary-package`, `python-install-mirror`, `pypy-install-mirror`, `publish-url`, `trusted-publishing`, `pip`, `cache-keys`, `override-dependencies`, `constraint-dependencies`, `environments`, `conflicts`, `workspace`, `sources`, `managed`, `package`, `default-groups`, `dev-dependencies`, `source-dist`, `wheel`
unknown field `project`, expected one of `native-tls`, `offline`, `no-cache`, `cache-dir`, `preview`, `python-preference`, `python-downloads`, `concurrent-downloads`, `concurrent-builds`, `concurrent-installs`, `index`, `index-url`, `extra-index-url`, `no-index`, `find-links`, `index-strategy`, `keyring-provider`, `allow-insecure-host`, `resolution`, `prerelease`, `dependency-metadata`, `config-settings`, `no-build-isolation`, `no-build-isolation-package`, `exclude-newer`, `link-mode`, `compile-bytecode`, `no-sources`, `upgrade`, `upgrade-package`, `reinstall`, `reinstall-package`, `no-build`, `no-build-package`, `no-binary`, `no-binary-package`, `python-install-mirror`, `pypy-install-mirror`, `publish-url`, `trusted-publishing`, `pip`, `cache-keys`, `override-dependencies`, `constraint-dependencies`, `environments`, `conflicts`, `workspace`, `sources`, `managed`, `package`, `default-groups`, `dev-dependencies`, `build-backend`
"###
);

1 change: 1 addition & 0 deletions scripts/packages/built-by-uv/data-dir/build-script.py
@@ -0,0 +1 @@
print("Build script (currently unused)")
10 changes: 9 additions & 1 deletion scripts/packages/built-by-uv/pyproject.toml
@@ -7,7 +7,15 @@ requires-python = ">=3.12"
dependencies = ["anyio>=4,<5"]
license-files = ["LICENSE*", "third-party-licenses/*"]

[tool.uv.wheel.data]
[tool.uv.build-backend]
# A file we need for the source dist -> wheel step, but not in the wheel itself (currently unused)
source-include = ["data/build-script.py"]
# A temporary or generated file we want to ignore
source-exclude = ["/src/built_by_uv/not-packaged.txt", "__pycache__", "*.pyc", "*.pyo"]
# Headers are build-only
wheel-exclude = ["build-*.h", "__pycache__", "*.pyc", "*.pyo"]

[tool.uv.build-backend.data]
scripts = "scripts"
data = "assets"
headers = "header"
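
The `source-exclude` list above mixes an anchored pattern (`/src/built_by_uv/not-packaged.txt`) with unanchored ones (`__pycache__`). One way to turn both into globs relative to the project root is sketched below. This is an assumption about how such normalization could work, not confirmed uv code, and `anchor_exclude` is a hypothetical name:

```rust
/// Hypothetical helper: rewrite an exclude pattern as a glob relative to the
/// project root. A leading `/` anchors the pattern at the root; any other
/// pattern is allowed to match at any depth.
fn anchor_exclude(pattern: &str) -> String {
    match pattern.strip_prefix('/') {
        // `/dist` -> `dist`: only `<project root>/dist`.
        Some(anchored) => anchored.to_string(),
        // `__pycache__` -> `**/__pycache__`: any directory level.
        None => format!("**/{pattern}"),
    }
}
```

Under this sketch, `build-*.h` from `wheel-exclude` would become `**/build-*.h` and so match `src/built_by_uv/build-only.h`.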
4 changes: 4 additions & 0 deletions scripts/packages/built-by-uv/src/built_by_uv/build-only.h
@@ -0,0 +1,4 @@
// There is no build step yet, but we're already modelling the basis for it by allowing files only in the source dist,
// but not in the wheel.

#include <pybind11/pybind11.h>
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
This file should only exist locally.
