-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented account state deltas storage and endpoint for getting account state deltas #418
Conversation
… account state deltas chore: migrate to the latest `next` of `miden-base`
# Conflicts: # Cargo.lock # Cargo.toml
d96b222
to
3b01b3d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good Thank you! I left a few comments inline - the main ones about figuring out whether we actually want to have both sides of the range to be inclusive, or if the from
part should be exclusive.
In general, I think using block_num
to define the range works better than using nonces for this - but would be good for @igamigo to confirm if this is fine on the client side.
proto/requests.proto
Outdated
message GetAccountStateDeltaRequest { | ||
account.AccountId account_id = 1; | ||
fixed32 from_block_num = 2; | ||
fixed32 to_block_num = 3; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some comments here - especially to clarify whether from/to ranges are inclusive or exclusive. I'm thinking that from_block_num
should be exclusive but to_block_num
should be inclusive. That is, if you get the delta, the account state should be the same as it was after to_block_num
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to depend on semantics and expectations and there are many different ways to interpret this. As a caller I might expect from..to
to be the deltas between my current state in from
to get to the state in to
i.e.
deltas = state[to] - state[from]
crates/store/src/state.rs
Outdated
/// Returns the state delta between `from_block` and `to_block` for the given account. | ||
pub(crate) async fn get_account_state_delta( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to one of the previous comments: here and in other similar places we should clarify whether from/to values are inclusive or exclusive.
I want to add, that we might want to use nonces in order to enforce limits to endpoint for current implementation. Our current implementation is great in terms of space consumption and it's pretty fast to get needed deltas, but for some accounts it might be needed to get enormous number of deltas in order to sync to the latest state. Obvious solution is to limit number of deltas processed by one single call to the endpoint. For cases, when deltas number is too hight it might be better to just request account's full state. I see two solutions for it:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some sql comments, but looks good!
|
||
PRIMARY KEY (account_id, block_num), | ||
CONSTRAINT fk_block_num FOREIGN KEY (block_num) REFERENCES block_headers(block_num) | ||
) STRICT, WITHOUT ROWID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be careful using WITHOUT ROWID
as it does have performance implications:
WITHOUT ROWID tables work best when individual rows are not too large. A good rule-of-thumb is that the average size of a single row in a WITHOUT ROWID table should be less than about 1/20th the size of a database page. That means that rows should not contain more than about 50 bytes each for a 1KiB page size or about 200 bytes each for 4KiB page size. WITHOUT ROWID tables will work (in the sense that they get the correct answer) for arbitrarily large rows - up to 2GB in size - but traditional rowid tables tend to work faster for large row sizes. This is because rowid tables are implemented as B*-Trees where all content is stored in the leaves of the tree, whereas WITHOUT ROWID tables are implemented using ordinary B-Trees with content stored on both leaves and intermediate nodes. Storing content in intermediate nodes causes each intermediate node entry to take up more space on the page and thus reduces the fan-out, increasing the search cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a good gut feel for sizes yet; but I imagine most non-trivial delta's will be larger than 200 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I will remove "WITHOUT ROWID" for at least tables with integer primary key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thank you! I left just one tiny nit inline.
proto/requests.proto
Outdated
// Returns delta of the account states in the range from `from_block_num` (exclusive) to `to_block_num` (inclusive). | ||
message GetAccountStateDeltaRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit: we should wrap comments after 100 chars.
I'm leaning toward the first approach. In general, I think we could introduce "epoch-based pagination" I described in #404 (comment) to a couple of endpoints and this would be one of them. We should also "stress-test" at some point to see what happens if the response is too big. The default Tonic message size is 4MB. Would be good to find out what happens if we exceed this limit (i.e., what will break and what kind of an error message we'll get). Let's create an issue for this. |
# Conflicts: # CHANGELOG.md
Added issue: #422 |
Resolves #290
We decided to implement delta-based storage for public accounts (full account state for the latest state and deltas for all updates except creation of new account). This is needed for
GetAccountStateDelta
endpoint which is also implemented in this PR.The only question I still have, if it enough to use block numbers for tracking account state updates on client or we need to use account nonces for that.