Tracking issue for RFC 2196, "metabuild: semantic build scripts for Cargo" #14903
Comments
I wanted to register a concern on this RFC but I didn't realize how quickly it was approved. I know that the goal of moving forward here is to make the manner in which Cargo processes native dependencies more declarative and easier for other build systems to process. I 100% approve of that goal. I'd love to see a future where the difference between a dependency implemented in C and one implemented in Rust is essentially insignificant to the end user. The RFC states:
As a nightly-only means of experimenting toward finding a long-term solution to native dependencies, I am totally behind this RFC. In contrast, I feel a lot of concern about providing "metabuild" in this form as a stable feature because of the other ways this feature can be used. I find the idea of declaratively listing crates in
Without making any sort of "slippery slope" analogy, I want to share a frustrating experience I had with a Ruby on Rails project because of the multiple layers of opaque "declarative" build/exec processing that have developed in that ecosystem. The It took me quite a while to figure out how Eventually, I discovered that the 2 seconds were because I was using In other words, when I ran
To recap, I want to draw a clear distinction between building native dependencies and arbitrary build-time processing. I think it's completely correct for the first to be handled declaratively, even implicitly. But when it comes to executing arbitrary code at build time to do anything at all, I think it is important that it be obvious and discoverable what additional behavior is being run at build time. The build script solves this by having literal source code you can read. But having to spelunk into other repositories (if there are even repositories linked from crates.io for your metabuild dependencies) is a real step back in this regard. |
@withoutboats First, I do want to emphasize that the goal is to experiment in the Cargo ecosystem, not to immediately stabilize it. That was what ultimately led to moving forward with the RFC: the desire to enable that experimentation and development. I do understand the concern about builds becoming more opaque. On the other hand, if you see a metabuild key pulling in So I do want to see every component of I don't think this obscures the build pipeline or makes it less discoverable, any more than having other functionality factored out into crates obscures the code using those crates. |
To make builds truly reproducible and remove the sorts of issues @withoutboats experienced, one needs a reproducible build chain that is completely independent of the binaries on the host. That requires versioning of even the smallest build dependency - the version of an autoconf m4 macro (not the generated configure) or a hardcoded reference to For a more general build system, features make it even harder. Take the DPDK project or rdma-core library - they have so many ways of building them that there's no sane way to abstract them in a way that supports more than a very narrow subset of uses. |
@raphaelcohn I don't see the connection between reproducibility and the issue I was talking about - discoverability. |
"dotfiles in my home directory". Something which is not reproducible is not easily discoverable. |
Is anyone working on this? I have some free time and am willing to help. |
@ehuss not that I know of; it'd be great to see some action here! cc @joshtriplett |
This is now available on nightly (documentation here). Some things that probably should be decided before stabilizing:
|
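For readers who have not tried the nightly feature, here is a rough sketch of what a metabuild helper crate might look like. It assumes the nightly behavior in which Cargo generates a wrapper build script that calls a public `metabuild()` function from each crate listed under `metabuild`. The `directives` helper, the `build-input.txt` path, and the `GENERATED_DIR` variable are hypothetical names used only for illustration.

```rust
// lib.rs of a hypothetical metabuild helper crate (a sketch, not the
// reference implementation of any real crate).
use std::env;

/// Builds the `cargo:` directives this helper would print.
/// Kept separate from `metabuild()` so the logic can be unit tested.
pub fn directives(out_dir: &str) -> Vec<String> {
    vec![
        // Hypothetical input file; rerun the script only when it changes.
        "cargo:rerun-if-changed=build-input.txt".to_string(),
        // Hypothetical variable exposed to the crate being built.
        format!("cargo:rustc-env=GENERATED_DIR={out_dir}"),
    ]
}

/// Entry point the Cargo-generated wrapper is assumed to call.
/// It does what a hand-written `build.rs` `main` would otherwise do.
pub fn metabuild() {
    // Cargo sets OUT_DIR for build scripts; fall back to "" outside Cargo.
    let out_dir = env::var("OUT_DIR").unwrap_or_default();
    for line in directives(&out_dir) {
        println!("{line}");
    }
}
```

Splitting the directive logic from the entry point is one way to address the unit-testing concern raised later in the thread, since the helper is an ordinary library crate.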
Has anybody tried using metabuild? Does anybody have any feedback for us? |
One result for I really want something like metabuild, but build.rs scripts aren't viable for a lot of my needs - and by extension neither is metabuild. What I need is more along the lines of a I'm not sure if this feedback is in-scope for metabuild itself, but I'll lay out some example scenarios, none of which seem well covered. I've been handling them with non-portable windows .cmd scripts up until now, which is terrible. First, the TL;DR version:
And then some of the concrete examples:
|
@MaulingMonkey I think this indeed goes well beyond the scope of the metabuild. I think what you want here are cargo workflows, or cargo tasks. As far as I know, they didn't get past the vague ideas stage.
|
I'd like to give some initial feedback on this: I wasn't aware of this feature until now, when I was about to propose almost the same thing. I think it may lack visibility in the community, which is also why there are so few experiments. Now that I know, I intend to implement It will be trivial for me because I already use a pattern similar to what is proposed here - basically just call Considering that I already do something that's almost identical to what's proposed here (and that being the very reason why I wanted to propose roughly the same thing), I believe my feedback is worth taking into account even without actually already using
There was a concern in the RFC PR that parsing the Cargo manifest is too complicated. From my experience it was completely fine. I do agree that serde increases compile time, but I needed it anyway and I believe this would be better solved by being able to cache built metabuild binaries. It should even be possible to cache them across Rust versions (but there are crates that detect versions, more on that later). If there's a desire to decrease the friction of this, maybe instead of calling
Regarding discoverability, I don't see a difference between declaring a dependency and having a
A quick experiment now showed that this feature, as-is, is not backwards-compatible - older versions of Cargo will reject the
There's one more reason I wanted to propose the same feature: separating library-build scripts from codegen scripts. A real-world example is
However, this is broken in practice because if you override the build script for I also guess that adding a dependency to both
A way to kill three (!) birds with one stone is to, instead of adding the key to
There was an argument that build scripts cannot be made declarative because of many quirks. Interestingly, it may be an argument for this feature if categorization is implemented: one would get the library from system packages and only run codegen, which can be declarative. Maybe
In the future I could imagine being able to pre-install binaries from trusted sources (or compile them once myself) and then instruct
Finally, and this is likely orthogonal, I'd like to have a way of specifying that a certain codegen script generates code that uses a specific library as a dependency - this is the case in |
Report from testing:
Other than that it seems to work fine. :) |
I've found one problem with dependency custom names.
cargo-features = ["metabuild"]
[package]
metabuild = ["foo-bar"]
[build-dependencies.foo-bar]
path = "../build"
package = "my-build"
So this way we will not get any error/warning and |
A relevant PEP: https://peps.python.org/pep-0725/ |
@matklad I don't see how, it talks about specifying dependencies, not running them. |
Since this RFC was created...
We now have bindeps / artifact deps, and I feel that pulling in metabuild dependencies as artifact deps, rather than rlibs, would be better as it allows a shared build script to be compiled once. Yes, the rlib gets compiled once, but we then still have to have the wrapper pull it in and that can be significant, depending on the profile settings.
You can now
We have a request for being able to run unit tests (#9942). If build scripts were dedicated packages in a workspace, then they would get that "for free". Granted, this then requires publishing the build script package.
There is more scrutiny of supply chains and auditing of code that gets run during builds (e.g. #5720 is talked about frequently in the community). Being able to audit only a common set of shared build scripts, without also auditing wrapper scripts in user programs, would be a big help.
There is a continued emphasis on build times and I would be concerned about this needing a TOML parser, serde, and enough of the manifest schema to make this work. That could very well impede this proposal being adopted in core crates. Yes, we could do intermediate solutions like pulling out |
A counter proposal to the current metabuild design, broken out into milestones
Step 1: metabuild polyfill
Extend
[package]
build = ["build-foo.rs", "build-bar.rs"]
This would allow some experimentation with this approach without fully defining the interface
Note: no auto-discovery of multiple build scripts is being considered because the end-goal is to shift the focus to metabuild
Step 2: metabuild polyfill arguments
Extend
[[package.build]]
path = "build-foo.rs"
args = ["--foo=bar"]
env = { FOO = "bar" }
The focus is on providing a low-level mechanism that users can do what they want with without much compile-time overhead. Ideally, args would be parsed with a CLI parser which can pull in some bulk. Small parsers like
This would allow more build scripts to experiment with this approach
Step 3: metabuild
Blocked on artifact dependencies
Add a
[[package.build]]
dependency = "foo"
args = ["--foo=bar"]
env = { FOO = "bar" }
If the full table is not needed, maybe have a
Evaluation
Benefits
Downsides
|
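To make Step 2 of the proposal above more concrete, here is a minimal sketch of how a build script might consume `args = ["--foo=bar"]` without pulling in a CLI parser crate. The `--key=value` shape and the `parse_args` helper are assumptions for illustration; the proposal does not fix an argument syntax.

```rust
use std::collections::BTreeMap;

/// Parses `--key=value` arguments by hand so the build script needs no
/// CLI-parser dependency. Anything else is reported as an error, so a
/// mistyped argument fails loudly instead of being silently ignored.
pub fn parse_args<I>(args: I) -> Result<BTreeMap<String, String>, String>
where
    I: IntoIterator<Item = String>,
{
    let mut parsed = BTreeMap::new();
    for arg in args {
        let rest = arg
            .strip_prefix("--")
            .ok_or_else(|| format!("unexpected argument: {arg}"))?;
        let (key, value) = rest
            .split_once('=')
            .ok_or_else(|| format!("expected --key=value, got: {arg}"))?;
        parsed.insert(key.to_string(), value.to_string());
    }
    Ok(parsed)
}

// In a hypothetical `build-foo.rs`, `main` could then do:
//     let flags = parse_args(std::env::args().skip(1)).expect("bad build args");
//     let foo_env = std::env::var("FOO").ok(); // from `env = { FOO = "bar" }`
```

Hand-rolled parsing like this keeps the dependency count at zero, which matters for the build-time concerns raised in the thread, at the cost of less helpful error messages than a full parser would give.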
I love the concept of this. Steps 1 and 3 together seem like a great design. For step 2, I'm wondering if we need both env and args, particularly if cargo already passes all the environment variables it normally does to a build script. Could we drop args and the fully general env support, and instead define a narrower config mechanism that we pass through via either the command line or the environment (one or the other, not both)? For instance:
[[package.build]]
dependency = "foo"
config = { foo = "bar" }
We could either pass this through the environment as
We could, alternatively, do the same thing for passing in the contents of (some subset of) the
On balance I think we should include it under |
I had both As for
I was hoping by going with For benefits to a higher level config, I saw little. I didn't see basic validation getting users much since the build script will validate anyways. However, in thinking more on it, what we can get is
The main cost I thought of was users dealing with the translation of the config to env variables (prefixes, case conversion). However, if we are to get those benefits I named, we also need to define a set of parameters to a build script. We'd need a name for the table that has a clear role separate from |
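The translation cost mentioned above (prefixes, case conversion) can be sketched as follows. This is only an illustration: the `EXAMPLE_BUILD_CONFIG_` prefix and both function names are hypothetical, and the thread does not settle on an actual naming scheme.

```rust
/// Hypothetical prefix; not a name Cargo defines.
const PREFIX: &str = "EXAMPLE_BUILD_CONFIG_";

/// Maps a config key such as `foo-bar` to an environment variable name:
/// add the prefix, uppercase ASCII letters, and turn every other
/// non-alphanumeric character into `_`.
pub fn config_env_name(key: &str) -> String {
    let mut name = String::from(PREFIX);
    for ch in key.chars() {
        if ch.is_ascii_alphanumeric() {
            name.push(ch.to_ascii_uppercase());
        } else {
            name.push('_');
        }
    }
    name
}

/// Converts a flat `config = { ... }` table into (name, value) pairs
/// that could be set in the build script's environment.
pub fn config_to_env(config: &[(&str, &str)]) -> Vec<(String, String)> {
    config
        .iter()
        .map(|&(key, value)| (config_env_name(key), value.to_string()))
        .collect()
}
```

Note that this mapping is lossy: `foo-bar` and `foo_bar` collide on the same variable name, which is one of the details any real scheme would have to decide.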
I would push back on using env variables to pass configs: They are a global namespace, they make debugging failed commands much more difficult, and the experience if you mistype a name somewhere isn't good: often the command succeeds but silently does the wrong thing. I know you want to do some light validation in cargo itself, but it seems desirable to push that down into the script for richer error reporting, with the added benefit that it makes debugging outside of cargo easier. We already have env variables that are global to the build/package, but for individual commands it makes less sense IMO. As a start I would suggest that configs can be scalars that are translated into command line arguments using standard |
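As a rough illustration of the scalar-to-argument idea in the comment above, the sketch below turns a flat config table into command-line arguments. The `Scalar` type, the `to_args` function, and the `--key=value` / bare `--flag` convention are all assumptions; the comment does not spell out which convention it has in mind.

```rust
/// Scalar config values, as a hypothetical subset of what a manifest
/// table could hold.
pub enum Scalar {
    Bool(bool),
    Int(i64),
    Str(String),
}

/// Translates config entries into arguments, preserving their order.
/// Assumed convention: `true` becomes a bare `--key`, `false` is
/// omitted, and everything else becomes `--key=value`.
pub fn to_args(config: &[(&str, Scalar)]) -> Vec<String> {
    let mut args = Vec::new();
    for (key, value) in config {
        match value {
            Scalar::Bool(true) => args.push(format!("--{key}")),
            Scalar::Bool(false) => {}
            Scalar::Int(n) => args.push(format!("--{key}={n}")),
            Scalar::Str(s) => args.push(format!("--{key}={s}")),
        }
    }
    args
}
```

Because the result is an ordinary argument list, a failing script can be rerun by hand with the same arguments, which is the debugging benefit the comment argues for.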
If we have basic validation, I don't see how env variables are any worse than CLI arguments. I would expect the validation from Cargo to be nearly as rich as what an optimum CLI can provide and can provide the feedback much earlier (before kicking off any builds). The "delegate to a CLI" approach is used with lib I suspect build times (and annoyingly "dependency counts") can make or break this feature and using a CLI could break this. I'd like to see this used for nearly all build scripts and a lot of people, especially more core crates, are sensitive to these types of things. In practice, it would help if they all used the same CLI parser but people rarely look at build time costs of dependencies in aggregate ("X pulls in |
Oh, yes please, metabuild using artifact dependencies makes so much sense! I think it'd even make sense to have a common cache of artifact dependencies since presumably tools don't need to change that often. Regarding configuration I think env is fine and whoever needs something more structured can use One thing though, I think the tools should be categorized so that some can be skipped. E.g. codegen tools are required and would always run, link-info-providers would only run during linking and only if the information wasn't provided in other way (today this disables the entire build script which nukes bindgen), doc generators possibly shouldn't run by default. |
In the short term, this proposal would allow skipping through features. In the longer term, by changing each build script into a table, we could add more capabilities like this in the future. I'd rather not commit to it now so we can focus on these more concrete steps and get it merged rather than getting sidetracked with researching new build script concepts. |
The issue with features is that if one uses them for enabling/disabling link info, they have to stay to avoid breaking things. I'd very much like it if at least the link info thing was possible from the beginning because there are a ton of crates doing FFI and also using bindgen. |
Yes, it wouldn't work in a case like that, but it would for things like doc gen. |
This is a tracking issue for the RFC "metabuild: semantic build scripts for Cargo" (rust-lang/rfcs#2196).
Steps:
Unresolved questions:
None