Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packages can inherit fields from their root workspace #8664

Closed
wants to merge 18 commits into from

Conversation

yaymukund
Copy link
Contributor

@yaymukund yaymukund commented Aug 30, 2020

Tracking issue: #8415
RFC: rust-lang/rfcs#2906

This is not finished yet, but I think there's enough to review.
9/9: I've removed the draft status. There are still some items left, but I don't think they will change the high level design.
9/13: I think I am blocked on the remaining tasks.

  • Additional fields in [workspace]
    • Tests for additional fields
  • Support { workspace = true } for non-dependency fields
    • impl Serialize and Deserialize
    • resolve fields with WorkspaceRootConfig before they are referenced
    • add tests
    • (add some more tests 😅 )
    • resolve license-file, readme relative to the workspace
    • copy readme and license file into package root when packaging (Blocked, see inline comment)
  • Support { workspace = true } for dependencies
    • add tests for dependencies
  • Update Cargo.lock with [workspace.dependencies] (even if unreferenced) (Blocked, see Note 2 below)
  • Infer version directives
    • Tests
  • Nicer errors when parse_manifest() fails - normalize paths, smarter context
  • Tests to be added:
    • Path dependency's version/publish are defined in its workspace.
    • Copying readme and/or license file when packaging
      • Fail when a file with the same name already exists in the root.

Notes

  1. Thanks for the great writeup that made this approachable even for someone new to the repository like me, and to @yaahc for helping me think through the architecture ^_^
  2. read_manifest() needs to find the root workspace, which in turn needs to read_manifest(). To break this circular dependency, I added a ManifestCache which allows find_workspace_root() to do its job without parsing it all the way through. This is the biggest change in the PR.
  3. When parsing a manifest, we turn it into a DefinedTomlManifest which is used by the rest of the app. This is where the bulk of the interesting logic lives (see DefinedTomlManifest::from_toml_manifest())

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 30, 2020
src/cargo/util/toml/mod.rs Outdated Show resolved Hide resolved
@yaymukund yaymukund marked this pull request as draft August 30, 2020 07:57
tests/testsuite/build.rs Outdated Show resolved Hide resolved
@yaymukund yaymukund force-pushed the dedup-cargo-workspaces branch 2 times, most recently from 45b8db9 to b582eba Compare September 2, 2020 18:19
src/cargo/util/toml/mod.rs Outdated Show resolved Hide resolved
Defined(T),
}

impl<'de, T> de::Deserialize<'de> for MaybeWorkspace<T>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to move all the impl Deserialize and Serialize to its own file - maybe src/cargo/util/toml/serde.rs? I think it would help make mod.rs more skimmable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me! Might be best to defer that to a second PR though

@yaymukund yaymukund force-pushed the dedup-cargo-workspaces branch from b582eba to 0170d00 Compare September 7, 2020 15:45
@yaymukund yaymukund force-pushed the dedup-cargo-workspaces branch from 0170d00 to 9bddf28 Compare September 8, 2020 02:32
@yaymukund
Copy link
Contributor Author

This is now ready for a review.

(Posting this in case changing GH draft status didn't trigger a notification.)

@alexcrichton
Copy link
Member

Thanks for this again @yaymukund! I'll set aside some time tomorrow to review this

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok here's an initial round of comments, thanks so much again for helping to take this on! My main thoughts can largely be summarized with:

  • This is introducing a number of duplicated types in TOML parsing where one has workspace stuff in it and the other doesn't. This ends up having a lot of extra boilerplate to convert back-and-forth, but I was wondering if it would be possible to avoid doing that? Ideally we would only have one type which is perhaps mutated in-place after being deserialized to remove all the internal workspace references.

  • The frequency of parsing manifests and finding the workspace root is a little worrisome to me. It looks like find_workspace_root is called many more times than previously, and I'm a little worried about the performance. I largely just want to make sure that we're not parsing toml twice nor are we traversing the filesystem more than necessary. I think that the toml parsing cache should be sufficient for this, but a cache may also need to be introduced for finding the workspace root. This could actually be more efficient than it is today where you could, for all levels of a path, insert a cache entry saying where the root was found (or that a root wasn't found), so sub-crates wouldn't have to do as many filesystem operations.

    Another thing that I noticed is that each manifest, when parsed, independently finds the workspace root and parses it. Could this instead be refactored a bit? Ideally we'd find the workspace root only once and then pass that as a parameter to parsing so each member doesn't have to independently find the root, and that also ensures that we only parse the root once.

I've only skimmed the tests lightly for now, but I'm hoping that the internals can be simplified a bit, but I'm curious what you think!

@@ -0,0 +1,102 @@
use std::collections::{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah mod.rs is pretty huge as-is, so I think it's fine for this to be its own file.

}
}

pub fn parse_manifest<'a>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some comments to what this is doing? I didn't immediately understand what you meant by there was a circular dependency in manifest parsing in the PR description. I originally thought this was a performance-related cache but it sounds like this is required for correctness?

Copy link
Contributor Author

@yaymukund yaymukund Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a more succinct comment describing this but here is an explanation of the correctness issue:

do_read_manifest() used to be self-contained:

do_read_manifest()

Now, do_read_manifest() potentially needs the workspace root in order to process MaybeWorkspace values:

do_read_manifest() -> find_root()

find_root() in turn tries to read the root workspace. A manifest can be its own root, so it forms a cycle:

do_read_manifest() -> find_root() -> do_read_manifest() (-> find_root() -> do_read_manifest() -> ...)

To break this cycle, I added the ManifestCache which lets you shallow parse (parse_manifest) - i.e. deserializes & gives you access to the toml - without calling all of do_read_manifest().

do_read_manifest() -> find_workspace_root() -> parse_manifest()

This breaks the cycle while also ensuring we never parse the same manifest twice.

use crate::util::{short_hash, Config, Filesystem};

pub enum EitherManifest {
Real(Manifest),
Virtual(VirtualManifest),
}

impl EitherManifest {
pub fn workspace(&self) -> Rc<WorkspaceConfig> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I see a number of methods here returning a bare Rc<T>, but if possible I think it would be best to return &Rc<T> since it defers the clone to the point at which it's actually necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really a perf concern since the reference counting is almost nothing in Cargo, mostly just a stylistic thing.

}

ws.load_current()?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this was added?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, ws.current() & ws.current_mut() fail.

We were previously relying on ws.find_root() implicitly loading the current package into ws.packages. Since find_workspace_root does a shallow parse, we now have to do it explicitly.

.workspace_config(manifest_path.parent().unwrap(), config)
}

{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW nowadays this extra block isn't necessary

Defined(T),
}

impl<'de, T> de::Deserialize<'de> for MaybeWorkspace<T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me! Might be best to defer that to a second PR though

#[derive(Serialize, Clone, Debug)]
#[serde(untagged)]
pub enum MaybeWorkspace<T> {
Workspace,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this leverage serde instead of writing a custom Deserialize by defining this as Workspace { workspace: bool }?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or something like Workspace(TomlWorkspaceField)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could change it to one of those forms, but we also need the Deserializer for its error message because the default "untagged enum" error messages are opaque. (Remember? 😋 😅 )

There are a couple open issues about this:

serde-rs/serde#1544
serde-rs/serde#773
serde-rs/serde#1883

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW with a custom Deserialize implementation this should be able to customize error messages as well, I think we have one or two implementations that do that already?


#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum MaybeWorkspaceBadge {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this possible to be MaybeWorkspacee<BTreeMap<...>>?

Copy link
Contributor Author

@yaymukund yaymukund Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum also exists for the error messaging when deserialize fails. How would we implement MaybeWorkspace's visit_map() to give an adequate error message for MaybeWorkspace<BTreeMap<_>> fields?

(Some(MaybeWorkspace::Defined(value)), _) => Ok(Some(value)),
(Some(MaybeWorkspace::Workspace), Some(ws)) => f(ws)
.clone()
.ok_or_else(|| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this is an error? If a value isn't defined in a package or in a workspace, then the final value is None, right?

Copy link
Contributor Author

@yaymukund yaymukund Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC says (emphasis mine):

Cargo will now accept a table definition of package.$key which defines the package.$key.workspace key as a boolean. For example you can specify:

[package]
name = "foo"
license = { workspace = true }

This directive indicates that the license of foo is the same as workspace.license. If workspace.license isn't defined then this generates an error.

This code corresponds to the section I bolded, unless I misunderstood.

https://github.com/rust-lang/rfcs/blob/master/text/2906-cargo-workspace-deduplicate.md#placeholder-values

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see nah that makse sense, thanks!

}

#[derive(Clone, Debug)]
struct DefinedTomlPackage {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to avoid duplicate type definitions like TomlProject and DefinedTomlPackage? I was hoping that we could get by where after some point in Cargo it's just statically known that the type here doesn't have workspace = true anywhere, and otherwise the "real" deserialized forms of things like Manifest and Summary in Cargo don't have to deal with the toml details at all.

Copy link
Contributor Author

@yaymukund yaymukund Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to avoid duplicate type definitions like TomlProject and DefinedTomlPackage?

  • TomlManifest, TomlProject: These are needed for their #[derive(Deserialize)].
  • DefinedTomlManifest, DefinedTomlPackage: These provide the static guarantees since the types don't have any MaybeWorkspace.

If we're willing to relax the second requirement, I could do something like:

struct TomlManifest {
    version: MaybeWorkspace<Version>,
}

impl TomlManifest {
    // parse manifest if required & define all `MaybeWorkspace::Workspace` fields.
    fn define_workspace_fields(&mut self) {
        self.version = MaybeWorkspace::Defined(...);
    }

    fn unwrap_version(&self) -> semver::Version {
        // "unwrap()" a `MaybeWorkspace`, panicking if it's { workspace = true }
        self.version.unwrap();
    }
}

This still requires us to always call define_workspace_fields() before we try to unwrap_version() - so it's not as strong of a guarantee - but it would allow us to eliminate the duplication between TomlManifest and TomlProject. Is that what you had in mind?

I was hoping that we could get by where after some point in Cargo it's just statically known that the type here doesn't have workspace = true anywhere

DefinedTomlManifest was my attempt at doing this. It's returned by prepare_for_publish() and stored in Manifest. Aside from parse_manifest() - which needs to be shallow to avoid a circular dependency* - the rest of Cargo doesn't have access to MaybeWorkspace<_> fields.

* see this comment for details

Copy link
Contributor Author

@yaymukund yaymukund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks v much for the review! I tried to address some of your higher level concerns & will work on the smaller fixes starting next week.

This is introducing a number of duplicated types in TOML parsing where one has workspace stuff in it and the other doesn't.

I have posted more details in this inline comment, so maybe we can discuss there.

I think that the toml parsing cache should be sufficient for this, but a cache may also need to be introduced for finding the workspace root. This could actually be more efficient than it is today where you could, for all levels of a path, insert a cache entry saying where the root was found (or that a root wasn't found), so sub-crates wouldn't have to do as many filesystem operations.

Another thing that I noticed is that each manifest, when parsed, independently finds the workspace root and parses it. Could this instead be refactored a bit? Ideally we'd find the workspace root only once and then pass that as a parameter to parsing so each member doesn't have to independently find the root, and that also ensures that we only parse the root once.

I really like both of these ideas. The first should be relatively straightforward. The second might be more involved, but I'll have a look.

}

ws.load_current()?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, ws.current() & ws.current_mut() fail.

We were previously relying on ws.find_root() implicitly loading the current package into ws.packages. Since find_workspace_root does a shallow parse, we now have to do it explicitly.

#[derive(Serialize, Clone, Debug)]
#[serde(untagged)]
pub enum MaybeWorkspace<T> {
Workspace,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could change it to one of those forms, but we also need the Deserializer for its error message because the default "untagged enum" error messages are opaque. (Remember? 😋 😅 )

There are a couple open issues about this:

serde-rs/serde#1544
serde-rs/serde#773
serde-rs/serde#1883


#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum MaybeWorkspaceBadge {
Copy link
Contributor Author

@yaymukund yaymukund Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum also exists for the error messaging when deserialize fails. How would we implement MaybeWorkspace's visit_map() to give an adequate error message for MaybeWorkspace<BTreeMap<_>> fields?

(Some(MaybeWorkspace::Defined(value)), _) => Ok(Some(value)),
(Some(MaybeWorkspace::Workspace), Some(ws)) => f(ws)
.clone()
.ok_or_else(|| {
Copy link
Contributor Author

@yaymukund yaymukund Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC says (emphasis mine):

Cargo will now accept a table definition of package.$key which defines the package.$key.workspace key as a boolean. For example you can specify:

[package]
name = "foo"
license = { workspace = true }

This directive indicates that the license of foo is the same as workspace.license. If workspace.license isn't defined then this generates an error.

This code corresponds to the section I bolded, unless I misunderstood.

https://github.com/rust-lang/rfcs/blob/master/text/2906-cargo-workspace-deduplicate.md#placeholder-values

}

#[derive(Clone, Debug)]
struct DefinedTomlPackage {
Copy link
Contributor Author

@yaymukund yaymukund Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to avoid duplicate type definitions like TomlProject and DefinedTomlPackage?

  • TomlManifest, TomlProject: These are needed for their #[derive(Deserialize)].
  • DefinedTomlManifest, DefinedTomlPackage: These provide the static guarantees since the types don't have any MaybeWorkspace.

If we're willing to relax the second requirement, I could do something like:

struct TomlManifest {
    version: MaybeWorkspace<Version>,
}

impl TomlManifest {
    // parse manifest if required & define all `MaybeWorkspace::Workspace` fields.
    fn define_workspace_fields(&mut self) {
        self.version = MaybeWorkspace::Defined(...);
    }

    fn unwrap_version(&self) -> semver::Version {
        // "unwrap()" a `MaybeWorkspace`, panicking if it's { workspace = true }
        self.version.unwrap();
    }
}

This still requires us to always call define_workspace_fields() before we try to unwrap_version() - so it's not as strong of a guarantee - but it would allow us to eliminate the duplication between TomlManifest and TomlProject. Is that what you had in mind?

I was hoping that we could get by where after some point in Cargo it's just statically known that the type here doesn't have workspace = true anywhere

DefinedTomlManifest was my attempt at doing this. It's returned by prepare_for_publish() and stored in Manifest. Aside from parse_manifest() - which needs to be shallow to avoid a circular dependency* - the rest of Cargo doesn't have access to MaybeWorkspace<_> fields.

* see this comment for details

@@ -134,7 +134,7 @@ fn verify_dependencies(
registry_src: SourceId,
) -> CargoResult<()> {
for dep in pkg.dependencies().iter() {
if dep.source_id().is_path() || dep.source_id().is_git() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, good catch, I shall fix.

}
}

pub fn parse_manifest<'a>(
Copy link
Contributor Author

@yaymukund yaymukund Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a more succinct comment describing this but here is an explanation of the correctness issue:

do_read_manifest() used to be self-contained:

do_read_manifest()

Now, do_read_manifest() potentially needs the workspace root in order to process MaybeWorkspace values:

do_read_manifest() -> find_root()

find_root() in turn tries to read the root workspace. A manifest can be its own root, so it forms a cycle:

do_read_manifest() -> find_root() -> do_read_manifest() (-> find_root() -> do_read_manifest() -> ...)

To break this cycle, I added the ManifestCache which lets you shallow parse (parse_manifest) - i.e. deserializes & gives you access to the toml - without calling all of do_read_manifest().

do_read_manifest() -> find_workspace_root() -> parse_manifest()

This breaks the cycle while also ensuring we never parse the same manifest twice.

@bors
Copy link
Contributor

bors commented Sep 23, 2020

☔ The latest upstream changes (presumably #8165) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

@alexcrichton
Copy link
Member

Sorry again for the slow response :(

In any case though responding to this thread because it was probably the meatiest thing, I wonder if we could hit a few birds with one stone. Could we perhaps change the return value of the toml parsing of manifests to not be a Manifest but something like an IntermediateManifest? That would have almost everything "elaborated" except for metadata fields and some dependency stuff, and then you would fuse an IntermediateManifest with a WorkspaceConfig to get a real Manifest. That way parsing wouldn't need to find the root manifest in a workspace, we I don't think would need a cache since whatever is parsing manifests knows how to pair with a workspace, and if we do it right I think we can avoid a parallel set of types (something that already exists today may suffice as an IntermediateManifest).

For example we could take the raw TomlManifest and return that, and then it would have methods to create a first-class Manifest given workspace configuration.

@yaymukund
Copy link
Contributor Author

I'll double back and try your new approach probably sometime this weekend.

@yaymukund
Copy link
Contributor Author

Sorry, my life got suddenly busy but I'll put in some hours tomorrow to see how far I can get.

@ParkMyCar
Copy link

@yaymukund let me know if there is anyway I can help with this, more than happy to 😊

@yaymukund
Copy link
Contributor Author

Hello, I apologize that this is taking a minute. I think the requested changes would necessitate rewriting much of the branch, and I haven't found an uninterrupted length of time to commit to that.

@ParkMyCar Thanks for your offer! I can take a look over the weekend and try to split off some meaningful subset of work? Or if you're itching to get started, feel free to message me before that (my github username at gmail or twitter dm).

@ParkMyCar
Copy link

@yaymukund No problem at all, life comes before open source 🙂

There seems to be a lot of comments/context for this task that I still need to read. I'll spend some time catching up on that and this weekend or later we can figure out how to split up the work, sound good?

@yaymukund
Copy link
Contributor Author

yaymukund commented Jan 18, 2021

After taking another look at this, I've realized I don't have the time to continue working on it. Feel free to pick this up if you're interested, @ParkMyCar . I think you'd basically need to start from scratch with Alex's revised approach, but I think many parts (e.g. the serde logic) could be reused if you so choose. I've tried to document my decisions as I went along in the form of PR comments, but am also happy to answer any questions to save future contributors from duplicating my efforts.

@ParkMyCar
Copy link

@yaymukund Sounds good, I'll get started on this sometime this week or next weekend. Thanks for getting us this far and leaving detailed comments 😊

@yaymukund
Copy link
Contributor Author

Just a note: serde-rs/serde#1883 has been resolved, so you should be able to customize the error message without having to do all the work of impl Deserialize. This may also allow us to improve existing deserializing code in master.

@XAMPPRocky
Copy link
Member

@ParkMyCar Hey, I was wondering if you were still interested in contributing this feature to cargo? 🙂

bors added a commit that referenced this pull request Mar 20, 2021
Use serde's error message option to avoid implementing `Deserialize`.

This is a cleanup based on serde-rs/serde#1883, which fell out of #8664

It looks like this changes the resulting error messages. I'll leave it up to you to decide if the tradeoff makes sense here. Maybe the correct answer here is to make `serde`'s error messages include more details (e.g. `invalid type: int`).
@rahulanand16nov
Copy link

How's the outlook on merging this PR?

@Muscraft
Copy link
Member

Muscraft commented Jun 1, 2021

I'm trying to take a look at it and see if I can make the changes talked about

@Muscraft
Copy link
Member

I'm hoping to open a new PR in the next few days with the changes above to try and get them merged, I will update with a link once its ready

@Muscraft
Copy link
Member

Muscraft commented Jul 5, 2021

Sorry for the delay ran into a few problems with grabbing the right files for publishing and I'm trying to clean that up

@Muscraft
Copy link
Member

New pull request is here!

@joshtriplett
Copy link
Member

For now, I'm going to close this in favor of #9684 ; if the situation changes we can always reopen this.

bors added a commit that referenced this pull request Mar 25, 2022
Part 1 of RFC2906 - Packages can inherit fields from their root workspace

Tracking issue: #8415
RFC: rust-lang/rfcs#2906

All changes were heavily inspired by #8664 and #9684

A big thanks to both `@yaymukund` and `@ParkMyCar.` I pulled a lot of inspiration from their branches.

I would also like to thank `@alexcrichton` for all the feedback on the first attempt. It has brought this into a much better state.

All changes have been made according to the RFC as well as `@alexcrichton's` [comment](#8664 (comment)).

This is part 1 of many, as described by [this comment](#9684 (comment)), [this comment](#9684 (review)) and redefined  [by this one](#10497 (comment)).

This PR focuses on inheriting in root package, including:
- Add `MaybeWorkspace<T>` to allow for `{ workspace = true }`
- Add a new variant to `TomlDependency` to allow inheriting dependencies from a workspace
- Add `InheritableFields` so that we have somewhere to get a value from when we `resolve` a `MaybeWorkspace<T>`
  - `license_file` and `readme` are in `InheritableFields` in this part but are not `MaybeWorkspace` for reasons [described here](#10497 (comment))
- Add a method to `resolve` a `MaybeWorkspace<T>` into a `T` that can fail if we have nowhere to pull from or the workspace does not define that field
- Disallow inheriting from a different `Cargo.toml`
  - This means that you can only inherit from inside the same `Cargo.toml`, i.e. it has both a `[workspace]` and a `[package]`
  - Forcing this requirement allows us to test the implementations of `MaybeWorkspace<T>` and the new `TomlDependency` variant without having to add searching for a workspace root and the complexity with it

Remaining implementation work for the RFC
- Support inheriting in any workspace member
- Correctly inherit fields that are relative paths
  - Including adding support for inheriting `license_file`,  `readme`, and path-dependencies
-  Path dependencies infer version directive
- Lock workspace dependencies and warn when unused
- Optimizations, as needed
- Evaluate any new fields for being inheritable (e.g. `rust-version`)
- Evaluate making users of `{ workspace = true }` define the workspace they pull from in `[package]`

Areas of concern:
- `TomlDependency` Deserialization is a mess, and I could not figure out how to do it in a cleaner way without significant headaches. I ended up having to do the same thing as last time [which was an issue](#9684 (comment)).
- Resolving on a `MaybeWorkspace` feels extremely verbose currently:
  ```rust
  project.homepage.clone().map(|mw| mw.resolve(&features, "homepage", || get_ws(inheritable)?.homepage())).transpose()?,
  ```
  This way allows for lazy resolution of inheritable fields and finding a workspace (in part 2) so you do not pay a penalty for not using this feature. The other bit of good news is `&features` will go away as it is just feature checking currently.

- This feature requires some level of duplication of code as well as remaking the original `TomlManifest` which adds extra length to the changes.
- This feature also takes a linear process and makes it potentially non-linear which adds lots of complexity (which will get worse in Part 2)

Please let me know if you have any feedback or suggestions!
@yaymukund yaymukund deleted the dedup-cargo-workspaces branch February 27, 2023 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants