-
Notifications
You must be signed in to change notification settings - Fork 11.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CLI] Tree shaking for package dependencies #21211
base: main
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
2 Skipped Deployments
|
@@ -102,6 +104,8 @@ pub struct BuildConfig { | |||
/// The chain ID that compilation is with respect to (e.g., required to resolve | |||
/// published dependency IDs from the `Move.lock`). | |||
pub chain_id: Option<String>, | |||
/// Include unpublished dependencies in the build | |||
pub with_unpublished_dependencies: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's cleaner to have it in the BuildConfig rather than in the build
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate? Without more context this choice seems strange, because this flag should not affect the outcome of a build at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having seen the rest of the PR, I don't think this is the right course of action. Probably a less invasive move would be to make it so that building does not consume the ResolvedGraph
, but returns it along with the CompiledPackage
instead. Then you will have enough information as part of publish
/upgrade
to do the tree shaking there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build is called in a lot more places than creating the BuildConfig struct directly (usually it's via default), so thought it's easier to keep it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think the ideal thing would be to put the ResolvedGraph
in the CompiledPackage
(your point about the graph being stored in BuildPlan
is also nudging me in that direction).
self.dependency_ids.published.values().cloned().collect() | ||
} | ||
|
||
/// Tree-shake the package's dependencies to remove any that are not referenced in source code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the main tree shaking part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc comment needs a bit more detail:
- Distinguishing how tree shaking behaves with immediate dependencies vs transitive dependencies.
- Distinguishing between module dependencies and package dependencies (we eliminate immediate package dependencies when there is no root module with an immediate module dependency to it, but from there onwards we follow edges in the package graph).
pub fn get_transitive_dependencies<N, E>(graph: &DiGraphMap<N, E>, start: &N) -> BTreeSet<N> | ||
where | ||
N: Ord + Copy + Eq + std::hash::Hash, | ||
{ | ||
let mut visited = BTreeSet::new(); | ||
let mut dfs = Dfs::new(graph, *start); | ||
|
||
// Skip the start node itself | ||
dfs.next(graph); | ||
|
||
// Visit all reachable nodes | ||
while let Some(n) = dfs.next(graph) { | ||
visited.insert(n); | ||
} | ||
|
||
visited | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure if we want this in external-crates
on the DependencyGraph
type or we should keep it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like it would be useful on DependencyGraph
-- yes! It also means you don't have to add the dependency on petgraph
here.
let (upgrade_policy, compiled_package) = | ||
upgrade_result.map_err(|e| anyhow!("{e}"))?; | ||
|
||
let compiled_modules = | ||
compiled_package.get_package_bytes(with_unpublished_dependencies); | ||
let package_id = compiled_package.published_at.clone()?; | ||
let package_digest = | ||
compiled_package.get_package_digest(with_unpublished_dependencies); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a refactoring of the compile_package
function which used to return a 4-tuple and now returns a CompiledPackage
, which holds the same information.
@stefan-mysten, what's the relationship between this PR and #21168 ? From the descriptions they seem like they would be doing the same thing. |
This implements the tree shaking without the need to go on chain and fetch linkage tables. It supersedes the previous one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core tree shaking change looks good to me (praise be, thank you for sticking with it). I'm not a fan of adding with_unpublished_dependencies
to BuildConfig
though, because it has nothing to do with the build (I think that's clear from all the places you now have to add a value for that field that should have no business caring about unpublished dependencies). I've left some suggestions to address that:
- Don't consume the
ResolvedGraph
during building, that way you avoid cloning it to do the tree shaking. Consider adding it or theDependencyGraph
into theCompiledPackage
so that you have access to it later to do the tree-shaking. - Instead of making tree shaking modify the published dependency ids just after you build the
CompiledPackage
, expose a function that calculates the list of published dependencies, post-tree-shaking. This is great because you avoid having to manage a side-effectful operation, and you no longer need to threadwith_unpublished_dependencies
into the build process (whatever cares about building the linkage table for publish/upgrade will just call this function to get the list of packages, and at that pointwith_unpublished_dependencies
should be readily at hand).
@@ -102,6 +104,8 @@ pub struct BuildConfig { | |||
/// The chain ID that compilation is with respect to (e.g., required to resolve | |||
/// published dependency IDs from the `Move.lock`). | |||
pub chain_id: Option<String>, | |||
/// Include unpublished dependencies in the build | |||
pub with_unpublished_dependencies: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate? Without more context this choice seems strange, because this flag should not affect the outcome of a build at all.
@@ -292,9 +310,9 @@ pub fn build_from_resolution_graph( | |||
} | |||
|
|||
let result = if print_diags_to_stderr { | |||
BuildConfig::compile_package(resolution_graph, &mut std::io::stderr()) | |||
BuildConfig::compile_package(resolution_graph.clone(), &mut std::io::stderr()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resolution graph is not a small structure -- can we take it by reference here for the build instead of cloning it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's passed all the way down to BuildPlan
type (see BuildPlan::create
function), so probably not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another approach would be to have the BuildPlan
be consumed on build
so that you can take the ResolvedGraph
out of it and put it in the CompiledPackage
at that point?
let root_modules = if with_unpublished_deps { | ||
&self | ||
.package | ||
.all_compiled_units_with_source() | ||
.filter(|m| m.unit.address.into_inner() == AccountAddress::ZERO) | ||
.map(|x| x.unit.clone()) | ||
.collect::<Vec<_>>() | ||
} else { | ||
&self | ||
.package | ||
.root_modules() | ||
.map(|x| x.unit.clone()) | ||
.collect::<Vec<_>>() | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know what I'm going to say...
let root_modules = if with_unpublished_deps { | |
&self | |
.package | |
.all_compiled_units_with_source() | |
.filter(|m| m.unit.address.into_inner() == AccountAddress::ZERO) | |
.map(|x| x.unit.clone()) | |
.collect::<Vec<_>>() | |
} else { | |
&self | |
.package | |
.root_modules() | |
.map(|x| x.unit.clone()) | |
.collect::<Vec<_>>() | |
}; | |
let root_modules: Vec<_> = if with_unpublished_deps { | |
&self | |
.package | |
.all_compiled_units_with_source() | |
.filter(|m| m.unit.address.into_inner() == AccountAddress::ZERO) | |
.map(|x| x.unit.clone()) | |
.collect() | |
} else { | |
&self | |
.package | |
.root_modules() | |
.map(|x| x.unit.clone()) | |
.collect() | |
}; |
/// Return the transitive dependencies of a node in a directed graph | ||
pub fn get_transitive_dependencies<N, E>(graph: &DiGraphMap<N, E>, start: &N) -> BTreeSet<N> | ||
where | ||
N: Ord + Copy + Eq + std::hash::Hash, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does the Hash
constraint come from?
self.dependency_ids.published.values().cloned().collect() | ||
} | ||
|
||
/// Tree-shake the package's dependencies to remove any that are not referenced in source code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc comment needs a bit more detail:
- Distinguishing how tree shaking behaves with immediate dependencies vs transitive dependencies.
- Distinguishing between module dependencies and package dependencies (we eliminate immediate package dependencies when there is no root module with an immediate module dependency to it, but from there onwards we follow edges in the package graph).
pub fn get_transitive_dependencies<N, E>(graph: &DiGraphMap<N, E>, start: &N) -> BTreeSet<N> | ||
where | ||
N: Ord + Copy + Eq + std::hash::Hash, | ||
{ | ||
let mut visited = BTreeSet::new(); | ||
let mut dfs = Dfs::new(graph, *start); | ||
|
||
// Skip the start node itself | ||
dfs.next(graph); | ||
|
||
// Visit all reachable nodes | ||
while let Some(n) = dfs.next(graph) { | ||
visited.insert(n); | ||
} | ||
|
||
visited | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like it would be useful on DependencyGraph
-- yes! It also means you don't have to add the dependency on petgraph
here.
crates/sui/src/client_commands.rs
Outdated
compiled_package.get_package_digest(with_unpublished_dependencies); | ||
|
||
// filter out dependencies that are not referenced in the source code. | ||
// find_pkg_dependencies(&client, &mut compiled_package).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove?
crates/sui/tests/cli_tests.rs
Outdated
.path() | ||
.join("tree_shaking") | ||
.join("F_depends_on_A_as_bytecode_dep"); | ||
// let (package_f_id, _) = publish_package( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a test for a case where, e.g. B_depends_on_A_no_code_references
was published without tree-shaking with a dependency on A
at v1, and then C
was published depending on that B
, and A
at v2. Its linkage table should still contain a mapping for A
at v2.
Also if you can, it would be good to split up checks across multiple tests, otherwise if there are multiple failures, you will only see one at a time.
pkgs_to_keep.extend(self.bytecode_deps.iter().map(|(name, _)| *name)); | ||
|
||
// #5 | ||
self.dependency_ids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than having tree-shaking be a mutating operation, expose a function that returns the list of published dependencies which does the tree shaking and then returns the appropriate list.
e7229a7
to
05216c2
Compare
Thanks for the review @amnn. I'll have a stab at it later today! |
Description
This PR introduces tree shaking for package dependencies. The package dependencies that are not referenced in code (directly or via trans deps) are removed.
Test plan
Added tests in
sui/tests/data/tree_shaking
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.
sui move build
andsui client publish/upgrade
will now by default remove dependencies that are not referenced in source code from being published on-chain.