-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[prover-v2] Connecting Prover to v2 code generation toolchain #12534
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #12534 +/- ##
========================================
Coverage 64.0% 64.1%
========================================
Files 818 818
Lines 181627 181930 +303
========================================
+ Hits 116377 116649 +272
- Misses 65250 65281 +31 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should consider moving the 2 flagged changes to move semantics to a separate PR so they can have a distinct commit message explaining what they're about.
@@ -35,6 +39,8 @@ pub struct FunctionGenerator<'a> { | |||
locals: Vec<Type>, | |||
/// A map from branching labels to information about them. | |||
label_info: BTreeMap<Label, LabelInfo>, | |||
/// A mao from code offset to spec blocks associated with them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mao -> map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
name: "v2", | ||
flags: &[], | ||
inclusion_mode: InclusionMode::Implicit, | ||
enable_in_ci: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting enable_in_ci
to true
seems questionable. Before it was disabled with comment "Do not enable in CI until we have more data about stability". Do we have that stability data now?
/// the difference between targets and dependencies for verification. | ||
/// Those tools can temporarily set this to true. | ||
pub fn treat_everything_as_target(&self, on: bool) { | ||
*self.everything_is_target.borrow_mut() = on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please file an issue to do something to clean this up eventually (much later). For example, we may want to shake the tree and only include the dependences which are reachable from the original targets. Or we might want to distinguish the different levels (to be compiled, to be generated as code, to be included as needed). The mutable nature here is not ideal, as it could be dangerous in the long term. But it's fine for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are already minimizing the tree in the module_builder somewhere. For the other issue see above.
}, | ||
AssignKind::Store => { | ||
AssignKind::Copy | AssignKind::Store => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*** This (adding an edge for AssignKind::Copy
) seems to be a significant change for borrow analysis. Might be better to give it its own commit/PR so it can have a reason attached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is why I opened #12533.
@@ -1012,7 +1012,7 @@ impl<'env, 'state> LifetimeAnalysisStep<'env, 'state> { | |||
fn check_borrow_safety(&mut self, temps_vec: &[TempIndex]) { | |||
// First check direct duplicates | |||
for (i, temp) in temps_vec.iter().enumerate() { | |||
if temps_vec[i + 1..].contains(temp) { | |||
if self.ty(*temp).is_mutable_reference() && temps_vec[i + 1..].contains(temp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*** This also is a pretty big deal for borrow analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was a serious bug. The one-and-only the prover integration found so far.
error_writer: &mut impl WriteColor, | ||
mut options: Options, | ||
) -> anyhow::Result<GlobalEnv> { | ||
options.whole_program = true; // will set `treat_everything_as_target` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Everything as target" is probably not is wanted in the general analysis case. Our various compiler warning checks need to know the difference between "warn about this code" and "don't warn about this code". Currently they are using "target" to decide whether to warn about code, but for certain targeted analyses you would want to do different things for "real target" and "target because it's a dependency". That may mean we just need a new flag for "generate warnings/errors about this code".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You usually expect that the prover is run an code which already passes type checks. But if not, we can't really differentiate whether the type check error is in the set targets or not, so we better produce errors in the dependencies as well. I think a really sustainable solution for the problem is that the package system somehow tracks this for us and/or we have some on-disk storage of compiled packages. Keeping for now like this until we have a better solution, perhaps after package system refactoring.
Bytecode::SpecBlock(_, spec) => { | ||
// All Temporaries used in the spec need to be pinned. Notice that | ||
// any bound variables inside the spec are LocalVar, so we can just | ||
// unconditionally collect them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
them --> all Temporary vars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have anything to do with your change, or was it just not regenerated recently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not look at md
which is generated but at the Move source of the same name. There is a bug linked there already and a TODO (#12501).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one too, does it have any relevance, or was it just stale before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment on the .move source, this here is generated.
This was a test which was borderline regards timeout, and clearly times out with v2, I deactivated for CI using the pragma as here, as we do in many cases.
@@ -846,4 +865,12 @@ impl<'env> ModuleContext<'env> { | |||
} | |||
result | |||
} | |||
|
|||
/// Converts to a name with location as expected by the SourceMap format. | |||
pub(crate) fn source_name(&self, name: impl AsRef<Symbol>, loc: impl AsRef<Loc>) -> SourceName { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we expecting some other things that will be convertable to &Symbol
that might show up here? I realize AsRef<Loc>
was already fairly common in the code, and AsRef<str>
is very useful in Rust, but do we anticipate other implementations that would warrant slightly more complex code here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its really just about references, and we should have used it more frequently. With this you are able to pass parameters without *
or &
. We often need to do *
because in matches, we get &Symbol
. Similar for locations, we often need to add &
. This reduces the noise a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have some comments, but none of them blocking (but would be good to address/answer). I did not check the test outputs for the move prover, so someone from the prover team might be better equipped to review these.
}; | ||
use std::collections::{BTreeMap, BTreeSet}; | ||
|
||
pub struct FunctionGenerator<'a> { | ||
/// The underlying module generator. | ||
gen: &'a mut ModuleGenerator, | ||
/// The set of temporaries which need to be pinned to locals because references are taken for | ||
/// them. | ||
/// them, or they are used in specs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, this would mean that codegen quality could be affected solely by the presence of a use in specs (which is typically expected to be a no-op for compilation). Might be worth re-considering this as a future task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this was always the case, also in v1.
We could actually make this optional depending on whether we generate code for the prover or not, but the problem with this is that then we wouldn't verify the same bytecode as it is running in production.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not suggesting that we do this now, but another alternative is to transform the specs as we transform the normal code.
@@ -211,8 +240,22 @@ impl<'a> FunctionGenerator<'a> { | |||
fn pinned_temps(ctx: &FunctionContext) -> BTreeSet<TempIndex> { | |||
let mut result = BTreeSet::new(); | |||
for bc in ctx.fun.get_bytecode() { | |||
if let Bytecode::Call(_, _, Operation::BorrowLoc | Operation::Drop, args, _) = bc { | |||
result.insert(args[0]); | |||
match bc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the function doc (which says this is getting temps referenced in borrow instructions, but now it does more).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
/// locals. The spec block is then stored in the spec block table | ||
/// which will be written back to the function spec at the end of | ||
/// translation. In the actual Move bytecode, a `Nop` is inserted | ||
/// at the current code offset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious: why do we need to insert a no-op, instead of not emitting anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a convention the v1 compiler is using. It makes it easier to identify that this is a gluing point for specs and can be used e.g. in program analysis if some extra spec related info is generated for this program point.
let struct_handle = self.struct_index(ctx, loc, struct_env); | ||
let fields = struct_env | ||
.get_fields() | ||
.sorted_by(|a, b| a.get_offset().cmp(&b.get_offset())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From looking at the code for get_fields()
, it looks like they are already sorted based on their offsets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, fixed.
assert!(!self.func_target.get_local_type(*src).is_reference()); | ||
assert!(!self.func_target.get_local_type(*dest).is_reference()); | ||
if self.func_target.get_local_type(*src).is_mutable_reference() { | ||
assert!(self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth stating that this assertion holds because we are supposed to have inserted freezerefs when converting from &mut
to &
everywhere already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I don't even know were this assumption comes from and whether it is needed (this is a duplicate of few lines below). In general I opened #12533 to follow up here.
@@ -511,7 +523,7 @@ pub fn add_move_lang_diagnostics(env: &mut GlobalEnv, diags: Diagnostics) { | |||
} | |||
|
|||
#[allow(deprecated)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if we are still using something here that is deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should remove this deprecation, some guy added it years ago not knowing the prover needs it.
@@ -1,11 +1,5 @@ | |||
============ initial bytecode ================ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have not looked into why the ordering changed - maybe worth a brief look whether this is an acceptable change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also surprised why, but it doesn't hurt, so...
4 │ │ } | ||
5 │ │ } | ||
│ ╰─^ ICE failed bytecode verifier: VMError { | ||
3 │ x == y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have also fixed this bug: #11925. If you think so too, maybe close that bug through this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually not, it is only that error message is more precise because I now generate the full source map.
@@ -1012,7 +1012,7 @@ impl<'env, 'state> LifetimeAnalysisStep<'env, 'state> { | |||
fn check_borrow_safety(&mut self, temps_vec: &[TempIndex]) { | |||
// First check direct duplicates | |||
for (i, temp) in temps_vec.iter().enumerate() { | |||
if temps_vec[i + 1..].contains(temp) { | |||
if self.ty(*temp).is_mutable_reference() && temps_vec[i + 1..].contains(temp) { | |||
self.exclusive_access_direct_dup_error(*temp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a test case corresponding to this (eg, call to a function with multiple dup args that are non-mut references) has not been added, please do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, new test is duplicate_use.move
@@ -13,7 +13,7 @@ Error: Function execution failed with VMError: { | |||
sub_status: None, | |||
location: 0x42::test, | |||
indices: [], | |||
offsets: [(FunctionDefinitionIndex(0), 1)], | |||
offsets: [(FunctionDefinitionIndex(1), 1)], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely okay, but good to double check if this ordering change is known and acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked its semantically the same, so at this point, I just let it go. If it is an issue like some hidden non-determinism we will detect soon anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not look at md
which is generated but at the Move source of the same name. There is a bug linked there already and a TODO (#12501).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment on the .move source, this here is generated.
This was a test which was borderline regards timeout, and clearly times out with v2, I deactivated for CI using the pragma as here, as we do in many cases.
}; | ||
use std::collections::{BTreeMap, BTreeSet}; | ||
|
||
pub struct FunctionGenerator<'a> { | ||
/// The underlying module generator. | ||
gen: &'a mut ModuleGenerator, | ||
/// The set of temporaries which need to be pinned to locals because references are taken for | ||
/// them. | ||
/// them, or they are used in specs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this was always the case, also in v1.
We could actually make this optional depending on whether we generate code for the prover or not, but the problem with this is that then we wouldn't verify the same bytecode as it is running in production.
@@ -35,6 +39,8 @@ pub struct FunctionGenerator<'a> { | |||
locals: Vec<Type>, | |||
/// A map from branching labels to information about them. | |||
label_info: BTreeMap<Label, LabelInfo>, | |||
/// A mao from code offset to spec blocks associated with them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -211,8 +240,22 @@ impl<'a> FunctionGenerator<'a> { | |||
fn pinned_temps(ctx: &FunctionContext) -> BTreeSet<TempIndex> { | |||
let mut result = BTreeSet::new(); | |||
for bc in ctx.fun.get_bytecode() { | |||
if let Bytecode::Call(_, _, Operation::BorrowLoc | Operation::Drop, args, _) = bc { | |||
result.insert(args[0]); | |||
match bc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
assert!(!self.func_target.get_local_type(*src).is_reference()); | ||
assert!(!self.func_target.get_local_type(*dest).is_reference()); | ||
if self.func_target.get_local_type(*src).is_mutable_reference() { | ||
assert!(self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I don't even know were this assumption comes from and whether it is needed (this is a duplicate of few lines below). In general I opened #12533 to follow up here.
}, | ||
AssignKind::Store => { | ||
AssignKind::Copy | AssignKind::Store => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is why I opened #12533.
@@ -519,6 +522,10 @@ impl Bytecode { | |||
| Bytecode::Nop(_) => { | |||
vec![] | |||
}, | |||
Bytecode::SpecBlock(_, _) => { | |||
// Specifications are not contributing to read variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but as long as we do not eliminate the temporaries, lifetime doesn't matter for the prover. We need to take care of pinning them similar as we do with borrows of locals. If one your optimizations needs this pinning, please let me know and we can have a followup PR to fix this.
@@ -511,7 +523,7 @@ pub fn add_move_lang_diagnostics(env: &mut GlobalEnv, diags: Diagnostics) { | |||
} | |||
|
|||
#[allow(deprecated)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should remove this deprecation, some guy added it years ago not knowing the prover needs it.
/// the difference between targets and dependencies for verification. | ||
/// Those tools can temporarily set this to true. | ||
pub fn treat_everything_as_target(&self, on: bool) { | ||
*self.everything_is_target.borrow_mut() = on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are already minimizing the tree in the module_builder somewhere. For the other issue see above.
@wrwg You probably already saw this, but it looks like "Aptos Move Test for Compiler V2" is failing (and it is not required, so easy to see past it). |
Yes, thank you @vineethk . There were a few tests failing after rebasing with various optimizations on. As I guessed, pinning spec locals for variable coalescing and considering for lifetime in the case of dead store elimination needed to be done. If you want, please check the last commit, as it effects some of the code you have written. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments for the latest commit.
}; | ||
use std::collections::{BTreeMap, BTreeSet}; | ||
|
||
pub struct FunctionGenerator<'a> { | ||
/// The underlying module generator. | ||
gen: &'a mut ModuleGenerator, | ||
/// The set of temporaries which need to be pinned to locals because references are taken for | ||
/// them. | ||
/// them, or they are used in specs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not suggesting that we do this now, but another alternative is to transform the specs as we transform the normal code.
for (idx, _) in exp.used_temporaries(self.func_target.global_env()) { | ||
Prop(id, _, exp) if self.track_all_usages => { | ||
for idx in exp.used_temporaries() { | ||
state.livevars.insert(idx, self.livevar_info(id, offset)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be state.insert_or_update
so that we join rather than replace (which is what insert
would do)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -217,7 +217,7 @@ impl FunctionTargetProcessor for AvailCopiesAnalysisProcessor { | |||
return data; | |||
} | |||
let target = FunctionTarget::new(func_env, &data); | |||
let analysis = AvailCopiesAnalysis::new(target.get_borrowed_locals()); | |||
let analysis = AvailCopiesAnalysis::new(target.get_pinned_temps(false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Search and replace borrowed
with pinned
in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
This concludes connecting the Move prover to the bytecode generated by the v2 compiler (modulo some remaining issues for which bugs have been opened). - Generates full SourceMap as the prover needs it for error messages - Generates new `SpecBlock` stackless bytecode instruction for specification blocks, and processes them such that they can be associated by the prover with the bytecode - Plumbs the CLI `--compiler-version v2` and `MOVE_COMPILER_V2` env var through the spaghetti of aptos cli and Move package system such that v2 compiler is called for model building - Adds a new entry point to the compiler `run_move_compiler_for_analysis` which builds a model with the v2 compiler for tools like the prover. Also refactors a few other entrypoints in the compiler. - Adds a new flag to prover driver `--compiler-v2` to for direct prover invocation. Also adds `--aptos` for adding Aptos specific prelude in standalone command line mode. - Fixes a bug in the borrow analyis (but also see #12533) - Adds some `verify_duration_estimate` pragmas to the framework spec as v2 prover times out (this might be a result of more complex code generated which we should fix down the road with optimizations) - Adds a new test feature to the prover unit tests such that all of them run both against v1 and v2. The v2 exp files are different if it comes to backtracs, which is to be expected with different generated code, but have been compared 1:1 to check whether they are equivalent - Fixes a small bug in reference-safety discovered by the prover tests
- Needed to pin temps for variable coalascing. This has been done by making `get_pinned_temps` a shared function between coalascing and function generator - Needed to consider spec temp usages if `track_all_targets` is set in LiveVarAnalysis. This is needed so dead store elimination does not remove assignments used in specs. - Renamed `used_temporaries` to `used_temporaries_with_types` and let the original function work without global env and delivering types, so it can be used in places were no env is available. - Make `MOVE_COMPILER_V2` turn on compiler v2 also for prover runs - Extended `move_pr.sh` with a new option `-2` which makes it run all integration tests v2, as we do on github
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
✅ Forge suite
|
✅ Forge suite
|
This concludes connecting the Move prover to the bytecode generated by the v2 compiler (modulo some remaining issues for which bugs have been opened).
SpecBlock
stackless bytecode instruction for specification blocks, and processes them such that they can be associated by the prover with the bytecode--compiler-version v2
andMOVE_COMPILER_V2
env var through the spaghetti of aptos cli and Move package system such that v2 compiler is called for model buildingrun_move_compiler_for_analysis
which builds a model with the v2 compiler for tools like the prover. Also refactors a few other entrypoints in the compiler.--compiler-v2
to for direct prover invocation. Also adds--aptos
for adding Aptos specific prelude in standalone command line mode.verify_duration_estimate
pragmas to the framework spec as v2 prover times out (this might be a result of more complex code generated which we should fix down the road with optimizations)