-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Speed up ATX queries #5791
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #5791 +/- ##
=========================================
- Coverage 80.4% 80.3% -0.1%
=========================================
Files 283 284 +1
Lines 29265 29497 +232
=========================================
+ Hits 23539 23705 +166
- Misses 4134 4172 +38
- Partials 1592 1620 +28 ☔ View full report in Codecov by Sentry. |
bors try |
sql/atxs/atxs.go
Outdated
where a1.pubkey = a.pubkey and a1.epoch < a.epoch and a1.nonce is not null | ||
order by a1.epoch desc | ||
limit 1 | ||
)) as nonce, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is also possible to copy nonce from the previous atx on insertion, it will add very little data but completely avoid this subqueries, and also this bug #5701
// func(stmt *sql.Statement) { | ||
// stmt.BindInt64(1, int64(from.Uint32())) | ||
// stmt.BindInt64(2, int64(to.Uint32())) | ||
// }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess it means that indexes are not useful for this query at all, and reading whole table is faster.
i tried to decouple blobs from the rest sometime ago #5670
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although splitting the blobs did improve performance of multiple ATX queries substantially, in this particular case it's still more efficient not to use this extra where
condition
What is the difference between |
the second is used in api and allows a bit of programmability, the first one is used in warmup.go (which is a subjet of this change) |
tryBuild succeeded: |
There is also I know cc @poszu |
unlike the other two IterateForGrading selects unique piece of information that is not used in those other two. it would be wasteful to always do a join and select that piece of information in other queries. also in this change there is another part that selects nonce which wouldn't make sense to use in IterateForGrading, and it certainly would be problematic to add it to IterateOps. it doesn't sound like a good idea to break down atx into multiple fields and store all of them separately. maybe you can try to keep blobs as they are. and save subset of fields that are relevant for consensus from atx with version 2, in a way similar to current schema. blob is stored in original form so that it can be fetched in sync without re-encoding. if you break down atx into multiple fields it will make it less efficient. the rest of the fields are decoupled and used only for consensus and sometimes for indexing. which is still imperfect and i think better solution would be something like in #5670 |
@dshulyak the goal is to keep only the minimum necessary information in individual columns in the table. The main problem with relying on decoding ATXs when fetching them from the DB is, that we then need to put logic into the SQL package to be able to decode different blobs of ATXs based on their publish epoch. In my opinion this should be the sole responsibility of the For gossip / fetch the handling functions don't need to be aware how the blob data encodes an ATX but just serve / gossip it if needed. As fallback one can use the ATX ID to fetch the binary blob and decode the ATX themselves if needed somewhere, with the caveat that they then need to be aware of possible different encoding schemas of ATXs. Right now we have multiple ways of representing the same data structure:
With some of them performing double duties and - at least for the first 3 - tight coupling between them. For me ideally we only had 2 types:
Where the first contains all the information relevant for consensus and in the best case maps fields 1:1 to the columns in the atx table in the DB and the second is the representation of the ATX as it is sent over the wire and should ONLY be used for that. The first can be used anywhere and be freely extended or changed without having to fear to break backwards compatibility. Queries would always return an I'm only advocating for a leaner surface to the I've seen #5670 but I don't fully understand how this improves performance? Why does it matter if the blob data is in the same or a different table? I'm not advocating against it. It could even help keeping the separation as suggested clear in the code. Would moving it to a different table not cause functions like |
Probably simplifying somewhat, but #5670 should improve SQLite performance b/c with blobs taken out of |
Also, vacuum the database as soon as needed (possibly between migrations)
It makes no sense to include the VRFNonce in every published ATX. For any given identity it usually never changes. If we wouldn't support PoST resizing we would only ever need to include it in the initial ATX (similar to CommitmentATX). But since an existing nonce can become invalid if the user chooses to increase their PoST size we must have an option to update the VRFNonce of an identity. If you want to simplify the lookup of the nonce instead we could make the |
@fasmat ok I'll adjust the patch not to broadcast the nonce, yet we need to include it in the |
I see no issue in adding the value to the DB 🙂, I just want to avoid having to broadcast the nonce with every ATX. |
|
This func is only to be used in tests
Remove dangling map (only needed for few equivocating ATXs)
@@ -0,0 +1,366 @@ | |||
package migrations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should probably be a 0017_xyz.sql
empty migration file (with a comment) to avoid somebody creating it by mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments in the scripts appear to cause our database code to crash (probably will need to fix this separately), so added an empty placeholder file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have comments in .sql
files, see for instance here:
--- sqlite doesn't support just adding a NOT NULL constraint, so we create a new column, | |
--- copy the data, drop the old table, and rename the new table to the old name |
bors merge |
## Motivation Lots of SQL queries to the state db take long time to execute, e.g. cache warmup is taking a lot of time
Pull request successfully merged into develop. Build succeeded: |
## Motivation The `nonce` column in the `atxs` table has been populated since migration 17 (#5791). Therefore the related field in the domain type `types.ActivationTx` can be required. Updating to this version requires going through version 1.5 first.
## Motivation The `nonce` column in the `atxs` table has been populated since migration 17 (#5791). Therefore the related field in the domain type `types.ActivationTx` can be required. Updating to this version requires going through version 1.5 first.
Motivation
Lots of SQL queries to the state db take long time to execute, e.g. cache warmup is taking a lot of time
Description
This PR splits ATX blobs into a separate table (
atx_blobs
).It also adds
nonce
column to each row in theatxs
table, improving the query performance.The database upgrade procedure had to be modified so that each migration is done in a separate transaction, and vacuuming can be done between migrations if requested by the
db-vacuum-state
setting. This may seemingly cause "partial" upgrades when switching to a newer go-spacemesh version, rendering the database unusable with older versions, but given that the local DB is updated separately, the data directory can't be expected to be usable with an older version after such an attempt anyway.Timings on an M3 Max Mac:
atxs
table 11mFixes #5701
Test Plan
Verified on a mainnet node.
TODO