Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions about theseus_cargo #852

Closed
crazyboycjr opened this issue Feb 23, 2023 · 6 comments
Closed

Questions about theseus_cargo #852

crazyboycjr opened this issue Feb 23, 2023 · 6 comments
Labels

Comments

@crazyboycjr
Copy link

crazyboycjr commented Feb 23, 2023

This is the general question about how theseus_cargo works. I am facing an almost identical issue mentioned in this doc https://www.theseus-os.com/Theseus/book/building/rust_builds_out_of_tree.html, and I find it quite inspiring. I am currently implementing a similar cargo wrapper that captures cargo's output commands and replaces the arguments (such as --extern) and rerun the recreated command. I'm thinking maybe there is a better way to handle dependency replacement (or more likely it is not a problem in Theseus).

In theseus_cargo, there is a prebuilt_crates_set that maps a crate name to a crate name with a hash suffix. In reality (maybe just in my more general case), there could be multiple crates under input_dir_path sharing the same crate name but with different hash suffixes. They may be compiled from different semantic versions of a package or different features (not sure if this could be the case in the presence of feature unification). Now given a rustc command, how do I know whether I should replace one of its --extern argument with one of the multiple crates that has the same crate-name but different suffixes under input_dir_path?

(You could skip reading the following example if you already understand the problem.) Let's say I have a original command (environment variables are omitted)

rustc --crate-name indexmap --edition=2021 /home/cjr/.
cargo/registry/src/jackfan.us.kg-1ecc6299db9ec823/indexmap-1.9.1/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 \
    -C metadata=7a5cc09d7b2b7e0a -C extra-filename=-7a5cc09d7b2b7e0a \
    --out-dir /home/cjr/Developing/phoenix/target/debug/deps \
    -L dependency=/home/cjr/Developing/phoenix/target/debug/deps \
    --extern hashbrown=/home/cjr/Developing/phoenix/target/debug/deps/libhashbrown-e520009dc7a7a6ee.rmeta \
    --cap-lints warn -Zunstable-options -Zbinary-dep-depinfo -Z binary-dep-depinfo --cfg has_std

After the passing through the cargo wrapper, this compile command turns into

"rustc" "--crate-name" "indexmap" "--edition" "2021" "/home/cjr/.cargo/registry/src/jackfan.us.kg-1ecc6299db9ec823/indexmap-1.9.1/src/lib.rs" "--crate-type" "lib" "--emit" "dep-info,metadata,link" "-C" "embed-bitcode=no" "-C" "debuginfo=2" \
    "-C" "metadata=7a5cc09d7b2b7e0a" "-C" "extra-filename=-7a5cc09d7b2b7e0a" \
    "--out-dir" "/home/cjr/Developing/phoenix/target/debug/deps" \
    "-L" "dependency=/home/cjr/Developing/phoenix/target/debug/deps" \
    "--extern" "hashbrown=/home/cjr/Developing/phoenix/target/release/deps/libhashbrown-b539cb8f16cbe55b.rmeta" \
    "--cap-lints" "warn" "-Z" "unstable-options" "-Z" "binary-dep-depinfo" "-Z" "binary-dep-depinfo" "--cfg" "has_std" \
    "-L" "/home/cjr/Developing/phoenix/target/release/deps"

In short, the dependency hashbrown for indexmap has been changed to libhashbrown-b539cb8f16cbe55b.rmeta/rlib under target/release. However, under target/release, I have more than one versions of libhashbrown (0.11.2, 0.12.3, 0.13.1).

My current takeway is that I need someway to check whether the sementic version and features of an extern argument is compatible with the one in prebuilt_crates_set. This check obviously further complicates this wrapper but seems doable.

I don't think the situation is handled in Theseus's cargo probably because it's not happening in your case. @kevinaboos Could you give a hint on whether it is a real issue or how theseus avoid this situation? Thanks a lot!

@kevinaboos
Copy link
Member

kevinaboos commented Feb 23, 2023

Great question! theseus_cargo is certainly a hacky approach, and I'm not even sure if I would recommend using it 😅 . You definitely identified a real issue, but I have been diligent to ensure that we only use one version of each dependency, for that very reason. Off the top of my head, I don't know of a way to determine a mapping from a specific version of a library to an outputted file+hash for that library version, but I presume some folks in the cargo Zulip channel may be able to help.

If you do find a solution, definitely let us know!

@crazyboycjr
Copy link
Author

Thanks for the hint @kevinaboos and quick response! Carefully ensuring that only one version of each dependency is used definitely makes sense to me. That's one direction I can look into.

If all we need is to check the semantic version and feature compatibility (which both are high-level concepts in cargo rather than what rustc understands, leading to the absense of the package version in the rmeta) of two crates. I guess what we need is probably a past history of cargo log, which fortunately displays the package version in the env part of a command and the features after --cfg. Like this

... CARGO_PKG_VERSION=1.2.1 ... rustc src/lib.rs --cfg feature="use_std" ...

Which sounds even more hacky... The unit-graph feature might also be useful and more structured. I hope this might be also useful for you (or maybe in the future).

Anyway all I want to know is the situation in theseus and hopefully getting some comments on how it sounds (how likely it's gonna work) to proceed with the more hacky way which is to maintain a compile history. Feel free to close the issue if you don't have any other comment 😋. Thanks!

@kevinaboos
Copy link
Member

I actually remember exploring unit-graph and build-plan (back before it was slated for removal) but neither one provided sufficient information to recreate a fully-detailed cargo or rustc invocation from scratch.... hence why i had to go with the approach of capturing verbose rustc invocations from cargo.

But I hadn't yet considered whether they would provide information on crate version --> crate output file. That's not a bad idea, hope it works out for you.

One problem with maintaining a compile "history" is that rust flags and many other aspects of a compilation target will affect the metadata/hashes appended to an outputted crate library file, so I don't think that approach is likely to work. I could be wrong though, having not tried it myself.

@crazyboycjr
Copy link
Author

I see. Thanks for the comments!

I guess it should be fine that many aspects would affect the metadata/hashes appeded to a crate name. As long as there's a log of how each crate library file was compiled, I can probably maintain a mapping of each crate-name-hash to its version and features.

I can imagine there will be many potential pitfalls that will make the rustc compiler unhappy by messing with this thing, but I guess l have to give it a try and will let you know the result.

@crazyboycjr
Copy link
Author

Hey @kevinaboos, just want to let you know that I have implemented and tested the feature I mentioned for my project. As a PoC, I think it generally works as expected! You can find the code here: https://github.com/phoenix-dataplane/phoenix/blob/21a9733ffd4d3f813ad46db50da9e3678fc6733b/tools/phoenix_cargo/src/main.rs#L188

I basically made a few changes based on your code: different than theseus_cargo, phoenix_cargo
+ supports dylib and proc_macro
+ supports choosing a crate among multiple builds (with different versions or features) of the same dependency crate.
- does not currenlty handle cross-compiling

I recently also have taken another look at build-plan once again and it makes me curious what was missing from the output (an example output here). It looks quite comprehensive to me. Also, there is this crate: build-plan. I haven't tried it but will take a look soon.

My next step will be to parallelize the compilation and bring incremental compilation back. Is there anything particular you want me to know before proceeding? Thanks!

@crazyboycjr
Copy link
Author

It looks like there are many issue with build-plan according to this discussion: rust-lang/cargo#5579. Then I guess I'll just pass it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants