-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement tracking of top-N online accounts in the ledger #1361
Conversation
0556c84
to
4aae498
Compare
(Force-pushed to rebase against master, which had some conflicting changes.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering if the solution wouldn't be much simpler if we were to extend the accountbase
by adding another optional column onlineaccountnormalizedbalance
and having a single partial index for these ?
I was initially worried that the full account state might be pretty large, compared to just the state needed for my purpose (online balance and the voting key), which is why I materialized a separate table to cache top online accounts. But, I do like the simplicity of just adding a column to So, I'll get rid of the separate table, and revise this PR accordingly. Thank you for the suggestion! |
Instead, add the normalized balance as a separate column to the existing accountbase table, and construct an index on that table.
I pushed a revised version. It ends up being a little shorter, but mostly because |
func accountsAddNormalizedBalance(tx *sql.Tx, proto config.ConsensusParams) error { | ||
var exists bool | ||
err := tx.QueryRow("SELECT 1 FROM pragma_table_info('accountbase') WHERE name='normalizedonlinebalance'").Scan(&exists) | ||
if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the database versioning, you dont really need to verify that the column present/missing, but it wouldn't be incorrect either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out to be convenient because in the accountdb test code, I don't want to be invoking acctupdates code, but I still need some way to add the normalizedonlinebalance column. Without this check, I would need some other way to determine if this column has been added or not.
ledger/accountdb.go
Outdated
var accountsResetExprs = []string{ | ||
`DROP TABLE IF EXISTS acctrounds`, | ||
`DROP TABLE IF EXISTS accounttotals`, | ||
`DROP TABLE IF EXISTS accountbase`, | ||
`DROP TABLE IF EXISTS onlineaccountbase`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line can be dropped, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, yes, good catch.
thanks. I think it looks much better, and more imporrant : will perform much better on a high tx rate. |
Resolve conflict in ledger/accountdb_test.go
In onlineTop, why don’t you cache the most recent modification to any account, i.e.
and use these in order to update the candidates ? or maybe I'm missing something ? i.e. either way you want to get the "latest" state for an address. |
That seems like a reasonable optimization. Implemented and pushed. |
The changes that you've made so far looks good, but we also need to have the corresponding catchpoint tables changes. The As is, there are few missing changes:
The above does looks like lots of changes, but it really does boil down to handful of lines. Beside these, I think that the code looks good. |
btw - regarding your idea to materialize some of the accounts onto a separate table : |
@tsachiherman , thanks for the pointers to where these normalized online balances need to be added for catchpoints. I pushed the changes to support that; it'd be great if you can take a look when you have time, to see if I missed something. One difference from what you were proposing to do lies in how indexes are handled. When a table is renamed, SQLite actually moves the indexes along with the table. So, there's no need to explicitly drop indices. However, we do need to create an index for the |
Thanks; I won't be able to take a look of these changes this week, but will do so right after the release. |
@@ -136,60 +148,95 @@ func writeCatchpointStagingCreatable(ctx context.Context, tx *sql.Tx, addr basic | |||
return nil | |||
} | |||
|
|||
func writeCatchpointStagingBalances(ctx context.Context, tx *sql.Tx, bals []encodedBalanceRecord) error { | |||
insertStmt, err := tx.PrepareContext(ctx, "INSERT INTO catchpointbalances(address, data) VALUES(?, ?)") | |||
func writeCatchpointStagingBalances(ctx context.Context, tx *sql.Tx, bals []encodedBalanceRecord, proto config.ConsensusParams) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty confident that this isn't what you want to do:
Using the genesis config.ConsensusParams here means that accounts that would be using fast catchup would calculate the normalized online balances based on the genesis params and not by the most recent params.
I would suggest you'll:
- add the calculated normalized online balances to the catchpoint file
- "restore" it in the same way, without recalculating anything.
- add the calculated normalized online balances to the hashing of the account, so that the merkletrie would include that.
- update the version of the catchpoint file, so that it would "break" compatibility with older files.
I have yet had the need to perform #4, so I don't know how hard it would be. ( shouldn't be too hard ).
I will be available to help you with this one if you'd like next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proto
is used only for the value of proto.RewardUnit
, which cannot change in a given network, so it's OK to pass the proto for the genesis block or the proto for a recent block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, we should always use the genesis round proto in the call to
err = accountsNewRound(tx, deltas[i], creatableDeltas[i], protos[i+1])
as well.
since we don't want to make the assumption that "it won't change"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like a reasonable idea; thanks! Pushed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me.
This PR extends the ledger's account tracking to be able to efficiently get the top-N online accounts, for any round between the account base round and the latest round.
This will be used in a subsequent PR to periodically commit to a Merkle array of the top-N online accounts in the block header.
The most annoying complication in this PR has to do with our rewards scheme. When you take rewards into account, then it's possible for there to be two accounts, A and B, where the balance of A is larger than B at some round R, and then the balance of A becomes smaller than the balance of B at a later round R'>R, even though no transactions executed on either account A or B.
This PR deals with the above complication by using a notion of a "normalized" balance, which is a hypothetical number of microalgos your account should have had at round 0 to accrue your current balance by the round in which your account was last modified. Comparisons of these normalized balances are stable over time, and thus give us a deterministic way to compute the top-N accounts, so that all nodes agree on what they are.