Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Provide read-only access to storage during tracing #9244

Closed

Conversation

udoprog
Copy link
Contributor

@udoprog udoprog commented Jul 28, 2018

I'm putting together a debugger based on Parity and need access to storage during tracing to be able to decode solidity variables.

@parity-cla-bot
Copy link

It looks like @udoprog signed our Contributor License Agreement. 👍

Many thanks,

Parity Technologies CLA Bot

@udoprog udoprog force-pushed the tracing-storage-access branch from 9bde809 to afd65e8 Compare July 28, 2018 09:15
@udoprog
Copy link
Contributor Author

udoprog commented Jul 28, 2018

Gonna fix the error when I have a moment. I realized I also have to add another hook for when the execution context of a contract is done that is not drain.

@udoprog udoprog force-pushed the tracing-storage-access branch 3 times, most recently from cb8714f to 03ee0b6 Compare July 29, 2018 06:57
@5chdn 5chdn added this to the 2.1 milestone Jul 29, 2018
@5chdn 5chdn added A0-pleasereview 🤓 Pull request needs code review. M4-core ⛓ Core client code / Rust. labels Jul 29, 2018
@sorpaas
Copy link
Collaborator

sorpaas commented Aug 28, 2018

@udoprog Can you explain more on the use-case here? If you are doing a normal tracing via full ethcore, I think the storage should already be "accessible" through the store_diff parameter. You maintain a hashmap, and move values into the hashmap as store_diff comes in. If you want more fine grained control, you can directly use evm::Interpreter::step and inspect the full vm state at each opcode.

Would this cover your use-case?

@udoprog
Copy link
Contributor Author

udoprog commented Aug 28, 2018

@sorpaas IIUC, store_diff only includes storage that is changed. At least my tests seem to indicate this assuming I did it correctly. This does not include persistent storage across subsequent contract invocations.

Is this correct?

@sorpaas
Copy link
Collaborator

sorpaas commented Aug 28, 2018

Yes. store_diff only includes changed storage. If you want to access other storage slots, would this work for you:

  • Before execution, call Client::state_at to grab another state handle.
  • When you need the data, call State::storage_at to get the original storage value.

@udoprog
Copy link
Contributor Author

udoprog commented Aug 28, 2018

@sorpaas what do you mean with "before execution". Can you show what you mean in code?

Note that StorageAccess as introduced here provides uniform access to both committed and uncommitted storage as a contract is being executed.

I think what you propose would require me to keep a reference to state_db around for when other contracts are called, get a snapshot of storage as it was before the transaction, and overlay any modifications I observe through store_diff. Is that correct?

It might be difficult to share state_db during a trace since I think it's already mutably borrowed to run the transaction. But I would have to check that.

Overall it seems a fair bit harder to get right.

@5chdn 5chdn modified the milestones: 2.1, 2.2 Sep 11, 2018
@5chdn
Copy link
Contributor

5chdn commented Sep 27, 2018

please rebase on master

@udoprog udoprog force-pushed the tracing-storage-access branch from 03ee0b6 to 2144b78 Compare September 27, 2018 11:28
@udoprog
Copy link
Contributor Author

udoprog commented Sep 27, 2018

@5chdn done

@5chdn 5chdn requested review from debris and ordian September 29, 2018 21:10
@5chdn
Copy link
Contributor

5chdn commented Sep 30, 2018

Can you have a look at this @debris @ordian :)

ordian
ordian previously approved these changes Oct 1, 2018
Copy link
Collaborator

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok to me, but I haven't really touch this code, @sorpaas should know better.

@@ -148,7 +148,7 @@ impl<Trace: Writer, Out: Writer> trace::VMTracer for Informant<Trace, Out> {
fn trace_prepare_execute(&mut self, _pc: usize, _instruction: u8, _gas_cost: U256) {
}

fn trace_executed(&mut self, _gas_used: U256, stack_push: &[U256], _mem_diff: Option<(usize, &[u8])>, store_diff: Option<(U256, U256)>) {
fn trace_executed(&mut self, _gas_used: U256, stack_push: &[U256], _mem_diff: Option<(usize, &[u8])>, store_diff: Option<(U256, U256)>, storage: &storage::StorageAccess) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_storage

/// Trace the end of a subtrace.
///
/// After this, no more instructions will be executed for this subtrace.
fn trace_done(&mut self, _storage_access: &StorageAccess) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on naming, but we already have done_subtrace, so maybe trace_finalized?

@@ -100,7 +100,7 @@ impl trace::VMTracer for Informant {
self.gas_cost = gas_cost;
}

fn trace_executed(&mut self, gas_used: U256, stack_push: &[U256], mem_diff: Option<(usize, &[u8])>, store_diff: Option<(U256, U256)>) {
fn trace_executed(&mut self, gas_used: U256, stack_push: &[U256], mem_diff: Option<(usize, &[u8])>, store_diff: Option<(U256, U256)>, storage: &storage::StorageAccess) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_storage

@5chdn 5chdn added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Oct 1, 2018
@sorpaas
Copy link
Collaborator

sorpaas commented Oct 1, 2018

@ordian We don't need this tracer anywhere for ourselves, and it's not exposed in RPC. So this is solely about using ethcore as a library and specific to @udoprog's usecase. I still believe that this can be done by just directly getting a State reference, but probably indeed it adds more convenience for what @udoprog tries to do.

@sorpaas
Copy link
Collaborator

sorpaas commented Oct 1, 2018

@udoprog Sorry I realized that I missed your last message!

I think what you propose would require me to keep a reference to state_db around for when other contracts are called, get a snapshot of storage as it was before the transaction, and overlay any modifications I observe through store_diff. Is that correct?

It might be difficult to share state_db during a trace since I think it's already mutably borrowed to run the transaction. But I would have to check that.

Sharing the reference would indeed be hard. However, if your usecase is debugging and you can bare a little bit of performance, you can clone it! State struct implements Clone, and once cloned, it's completely detached from any modifications from the other state.

@udoprog udoprog force-pushed the tracing-storage-access branch from 2144b78 to 3804196 Compare October 2, 2018 00:06
@udoprog
Copy link
Contributor Author

udoprog commented Oct 2, 2018

@sorpaas OK! I tried cloning, and with the current amount of storage that I'm using in tests it seems to work fine. I haven't tried overlaying the changes, but all the APIs are there, so I don't think that should be a problem.

I'm a bit concerned that I have to do it for every call, and that might not scale long term if I ever build tests with a bit more state. So I'd still be interested to do it without cloning at some point.

@udoprog udoprog force-pushed the tracing-storage-access branch from 3804196 to ffc2dc9 Compare October 2, 2018 01:18
@sorpaas
Copy link
Collaborator

sorpaas commented Oct 2, 2018

@udoprog It's actually not as expensive as you think. State initially are really small, containing only the storage root and other relevant info. Caches of account and storage items are only built up when accessing fields. What's more, when fetching info from the disk, State will check for a global cache, which is shared.

As a result, for your usecase, if you clone State before VM execution, it should scale well.

Using Arc to wrap checkpoints on State would permanently solve the reference/cloning issue you're facing (and it also fixes #9319 (comment)), but we probably don't have a date for that right now. (It's a big refactoring!)

@udoprog
Copy link
Contributor Author

udoprog commented Oct 4, 2018

@sorpaas Not sure I follow. This is how I currently set up storage, so I'm not sharing anything between my VM instances so how would caching work? Cloning has to give me access to an isolated instance of state in which I overlay mutations. At the very least it needs to do bookeeping to separate the underlying state for each instance in case they are mutated right?

I'm trying to do what you've proposed so far, but it's turning it to be a fair bit of extra bookkeeping and synchronization I have to do for my tracers. It would be really helpful to have something like this instead.

I can fix the conflicts, but I'd like to check before if this is even something you want to merge?

@sorpaas
Copy link
Collaborator

sorpaas commented Oct 8, 2018

@udoprog I tried to look into the code but still don't yet understand the use case (probably I haven't looked this hard enough!). It would be great if you can further explain what you tries to do:

  • Do you want to print all changed storage values after each opcode? You don't need State reference for that -- just accumulate all changed values from the trace is enough.
  • If what you want to do is to print all storage values of the executing account -- the issue is that real-world storage can be really big, and for this, we probably need to explore more to support this use case.
  • If, however, what you want to do is to provide a debug interface -- interrupt execution after each opcode, allow the inspector to freely look into any storage slot it wants, then I would recommend to use evm::Interpreter::<U256>::step (you'll still need to get an Externalities from ethcore). This allows the execution to pause at each step, and you can freely inspect memory, storage, stack, or anything you would like!

@udoprog
Copy link
Contributor Author

udoprog commented Oct 8, 2018

@sorpaas I trace the execution of a contract, record each time storage is accessed. If a revert occurs, I lookup as much context as possible at the point it was reverted. That includes the value of most expressions that have been evaluated. If any of these expressions are references into storage, I lookup and decode their value from storage.

Right now I'm trying to rebase on the latest version of ethereum, and a fair bit of the tracing stuff has changed (e.g. subtracers no longer seem to be a thing).

This is a gif showcasing the state dumping in general: https://github.com/PrimaBlock/parables/blob/master/images/expression-dump.gif

That works because the access to storage is an immediate expression that I can decode, but if I for example have a loop I'm recording all storage accesses in order to dump that as part of the context. These storage slots then need to be decoded at a later time, by accessing storage directly.

@5chdn
Copy link
Contributor

5chdn commented Oct 26, 2018

@udoprog sorry, could you rebase this again?

@sorpaas @debris could you give this a final review?

@5chdn 5chdn modified the milestones: 2.2, 2.3 Oct 29, 2018
self.vm_tracer.trace_executed(gas_used, stack_push, mem_diff, store_diff)
let Externalities { ref state, ref mut vm_tracer, .. } = *self;

let access = ExtStorageAccess(*state, &self.origin_info.address);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to change to a design where if NoopTracer is used, we don't create this struct? For example, this can be done by directly passing State reference into tracer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit is pairing the storage access with the address of the contract so that we don't have to do this, making the storage access API more convenient.

What are the worries about from setting up this struct? The overhead should be negligible since we are mostly dealing with pointers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah looks like the overhead is negligible. However, I wonder about other tracer usage for ethcore-as-a-library. People may want to build other tracer implementation that access to other state information (like other account's storage slot, account code, nonce, etc). So I'm just thinking maybe directly passing State would be more generalized and potentially fit more usage?

@udoprog
Copy link
Contributor Author

udoprog commented Oct 30, 2018

@5chdn I'll rebase after it's been reviewed or not if it's rejected if that's fine by you?

I've already rebased a couple of times.

sorpaas
sorpaas previously approved these changes Oct 30, 2018
Copy link
Collaborator

@sorpaas sorpaas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM. Sorry for the delay.

Still have a question but otherwise, as the overhead is really small, I think it's fine to merge this. Because I still don't fully understand internals of how this tracer feature is used, I would still appreciate another review, or any other ideas how we can better support those ethcore-as-a-library use cases.

@sorpaas sorpaas added A0-pleasereview 🤓 Pull request needs code review. and removed A8-looksgood 🦄 Pull request is reviewed well. labels Oct 30, 2018
@udoprog udoprog force-pushed the tracing-storage-access branch from ffc2dc9 to 78e7474 Compare October 30, 2018 22:48
@udoprog
Copy link
Contributor Author

udoprog commented Oct 31, 2018

I don't think this applies cleanly since 1e9aebb

If I'm reading that correctly, calls can now be suspended and resumed (the details of which are not entirely clear to me yet). When this happens a new externality is created and finalized. This messes with the trace_finalized call introduced here since it expected the ext to live for the duration of a single call.

done_subtrace seems to be the correct hook to use, but that's not called through an ext any longer. But it does have access to the state so introducing storage here seems doable. But it's not something that can be suitable achieved through a rebase.

Does this make sense?

@sorpaas sorpaas dismissed their stale review October 31, 2018 03:52

Needs rebase, and minor interface change.

@sorpaas
Copy link
Collaborator

sorpaas commented Oct 31, 2018

@udoprog Yeah. It's possible to get the same interface as of now, but also not sure if it suits you need still -- there's no concept of subtracers any more.

@sorpaas
Copy link
Collaborator

sorpaas commented Oct 31, 2018

done_subtrace should be fine to introduce State there if you're sure that suits your use case!

@sorpaas sorpaas added A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. and removed A0-pleasereview 🤓 Pull request needs code review. labels Oct 31, 2018
@5chdn
Copy link
Contributor

5chdn commented Nov 25, 2018

Are you still working on this @udoprog ?

@5chdn 5chdn added A3-stale 🍃 Pull request did not receive any updates in a long time. No review needed at this stage. Close it. and removed A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. labels Nov 25, 2018
@udoprog
Copy link
Contributor Author

udoprog commented Nov 26, 2018

@5chdn No, unfortunately I currently don't have the time to familiarize myself with the changes done while this was pending. Either someone else can pick it up, or feel free to close it and I'll open a new one if I end up finding the time.

@5chdn
Copy link
Contributor

5chdn commented Nov 27, 2018

Ok, no worries.

@5chdn 5chdn closed this Nov 27, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A3-stale 🍃 Pull request did not receive any updates in a long time. No review needed at this stage. Close it. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants