itest: add optional 10k asset test #325
Conversation
Looks good! I like the idea with the two test lists, probably a bit easier to understand (and also to implement) than the build tag variant.
Another motivating reason was that trying to use mutually exclusive build tags to swap out the test list caused some IDE issues with gopls, which could cause problems later on.
Updated with a minimal batch minting stress test, which reliably reproduces the first performance issue mentioned in #314. With a batchSize of 100, minting succeeds, and we can also reproduce #313. TODO list:
Force-pushed from 5fb7f89 to 9597bc0
@guggero: review reminder
Force-pushed from 8ccece5 to 40073b8
Very cool test, great to know we can mint that many assets in a decent time. And also great to have found some performance tweaks along the way with this!
I pushed up a couple of fixes and confirmed the 1k test succeeds locally on my machine both with SQLite and Postgres. The 10k test still takes too long to finish within the 60m test timeout, so I think we first need to get some of the optimizations in. So there are multiple fixes in here that address #361.
tapdb/sqlc/queries/universe.sql (Outdated)
FROM universe_leaves leaves
JOIN genesis_info_view gen
ON leaves.asset_genesis_id = gen.gen_asset_id
WHERE gen.asset_id = @asset_id
RIGHT OUTER JOIN key_group_info_view groups
So I think we can simplify this, and just target the group key directly.
Retracing a bit of history here, the query first looked like this: https://github.com/lightninglabs/taproot-assets/blame/c037025ebb22c54087736589bd31fb570a04b738/tapdb/sqlc/queries/universe.sql#L101-L105
The issue with that is that the asset_id field there, for assets with a group key, is basically whichever asset was inserted first. As a result, if you tried to log the sync of one of those leaves, the call could fail because it might match the wrong leaf.
We then changed it to target the leaf instead (since we know the asset ID of the leaf), and then use that to get the ID pointing back to the root.
However, if we have the group key, then we can just target that directly. So we can have a query that looks similar to the old one:
SELECT id
FROM universe_roots
WHERE group_key = @group_key
So with this, we don't need to modify that other view; we can just combine the queries and pick whichever one doesn't return NULL:
SELECT COALESCE(
(SELECT leaves.universe_root_id AS id
FROM universe_leaves leaves
JOIN genesis_info_view gen
ON leaves.asset_genesis_id = gen.gen_asset_id
WHERE gen.asset_id = @asset_id),
(SELECT id
FROM universe_roots
WHERE group_key = @group_key)
) AS id;
Thanks a lot! I think that did the trick (I went with a slightly different approach since I'm not sure how COALESCE behaves if a subquery returns no results or multiple results). But the idea is the same.
In this commit, we add additional make commands to run optional itests, where specific test cases are still specified with the 'icase' variable. Additional optional itests are added in the same way as default itests. We also remove the old-style build tags that were replaced with Go 1.17.
In this commit, we update the logic used in all itests to mint assets to reduce the number of RPC calls made and total time spent on asserts. We also separate waiting for the planter to change state and checking the daemon state.
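For illustration only, here is a minimal Go sketch of that wait/assert split, assuming lnd's lntest/wait package; the query and assert closures are hypothetical stand-ins for the actual itest helpers:

package itest

import (
	"fmt"
	"time"

	"github.com/lightningnetwork/lnd/lntest/wait"
)

// waitForBatchState polls only the cheap planter-state query until the
// batch reaches the wanted state (or the timeout expires).
func waitForBatchState(queryState func() (string, error), want string,
	timeout time.Duration) error {

	return wait.NoError(func() error {
		state, err := queryState()
		if err != nil {
			return err
		}
		if state != want {
			return fmt.Errorf("batch state %v, want %v", state, want)
		}
		return nil
	}, timeout)
}

// assertMintedBatch waits for the planter first, then runs the more
// expensive daemon-state assertions exactly once instead of on every poll.
func assertMintedBatch(queryState func() (string, error),
	assertDaemon func() error, want string) error {

	if err := waitForBatchState(queryState, want, 2*time.Minute); err != nil {
		return err
	}

	return assertDaemon()
}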
LGTM but not approving b/c original author.
Nice work guys!
Because the String() function just returns the hash of the ID, it is hard to debug whether both the asset ID and group key are set during universe sync. This commit adds a bit more information to certain log lines.
In this commit, we add a test that mints a large batch of collectibles with large metadata, and then checks that the batch mint succeeded. This includes correctly updating the universe server of the minting node, and syncing that universe tree to a second node.
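As a rough sketch of what such a test needs to set up (the request type and field names here are hypothetical, not the daemon's actual RPC types):

package itest

import (
	"bytes"
	"fmt"
)

// mintRequest is a hypothetical stand-in for the mint RPC request type.
type mintRequest struct {
	Name     string
	Metadata []byte
	Amount   uint64
}

// buildCollectibleBatch creates batchSize collectible requests, each
// carrying metaSize bytes of filler metadata (4 kB in the stress test).
func buildCollectibleBatch(batchSize, metaSize int) []*mintRequest {
	meta := bytes.Repeat([]byte("m"), metaSize)

	reqs := make([]*mintRequest, 0, batchSize)
	for i := 0; i < batchSize; i++ {
		reqs = append(reqs, &mintRequest{
			Name:     fmt.Sprintf("collectible-%d", i),
			Metadata: meta,
			Amount:   1, // collectibles are minted with a unit amount
		})
	}

	return reqs
}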
If SQLITE_BUSY is returned, it means waiting for a transaction failed because too many other readers/writers are currently using the DB. We can detect that error, convert it into a serialization error and detect that in the existing retry loop. We also re-try if creating or committing a transaction fails with a serialization error.
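A minimal sketch of that error mapping, assuming the mattn/go-sqlite3 driver (the repo's actual driver and error types may differ):

package tapdb

import (
	"errors"

	"github.com/mattn/go-sqlite3"
)

// errSerialization is a hypothetical sentinel that the existing retry loop
// is assumed to recognize as retryable.
var errSerialization = errors.New("serialization failure, retry transaction")

// mapSQLiteError converts SQLITE_BUSY into the same retryable error that a
// Postgres serialization failure is mapped to, so one retry loop handles
// both backends.
func mapSQLiteError(err error) error {
	var sqliteErr sqlite3.Error
	if errors.As(err, &sqliteErr) && sqliteErr.Code == sqlite3.ErrBusy {
		return errSerialization
	}

	return err
}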
// operations to prepare and validate the keys. But since we're now
// actually starting to write, we want to limit this to a single writer
// at a time.
b.registrationMtx.Lock()
I think we should revert this change. We want to make sure that our DB concurrency control can handle situations like this. For SQLite, there's only a single writer. For Postgres, the new serialization mode should detect conflicts, then abort so it can retry safely.
Without this change we ran into "number of retries exhausted" errors quite quickly, both on SQLite and Postgres. That's mainly because we constantly sync multiple leaves at a time (as many as there are CPUs, in parallel).
Gotcha, perhaps we need to increase either the backoff timeout or the number of attempts (I just kind of chose 10 randomly). We'll want to tune these params, as otherwise we may end up hitting this in other areas that have more involved DB transactions.
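To make the tuning concrete, here is a generic sketch of such a retry loop with the attempt count and backoff pulled out as constants; the names and values are placeholders, not the repo's actual settings:

package tapdb

import (
	"math/rand"
	"time"
)

// Hypothetical tuning knobs for the retry behavior discussed above.
const (
	defaultNumTxRetries = 10
	defaultRetryBackoff = 50 * time.Millisecond
)

// retryTx re-runs txFn until it succeeds, hits a non-retryable error, or
// exhausts the attempts. The backoff grows linearly with a little jitter so
// concurrent writers don't retry in lockstep.
func retryTx(txFn func() error, isRetryable func(error) bool) error {
	var err error
	for attempt := 0; attempt < defaultNumTxRetries; attempt++ {
		if err = txFn(); err == nil || !isRetryable(err) {
			return err
		}

		backoff := time.Duration(attempt+1) * defaultRetryBackoff
		jitter := time.Duration(rand.Intn(20)) * time.Millisecond
		time.Sleep(backoff + jitter)
	}

	return err
}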
Seems like batching these transactions would also work? E.g. committing batches of 5 issuance proofs vs. single proofs.
IIUC in all cases where we hit this issue we also don't need to write these proofs individually to recover from a restart.
Yeah, that'll work too. Right now we just do them one by one, but we can also batch the insertion. We can easily batch when adding to our own universe (that for loop in the planter), and then for RPC we can let it be a repeated field (see the sketch below).
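As a rough illustration of the batching idea (the table and column names are made up, and the real code would go through the repo's own transaction helpers), a single transaction can commit a whole slice of proofs at once:

package tapdb

import (
	"context"
	"database/sql"
)

// insertIssuanceProofs writes a whole batch of proofs in one transaction
// instead of one transaction per proof, so serializable transactions
// contend far less often.
func insertIssuanceProofs(ctx context.Context, db *sql.DB,
	proofs [][]byte) error {

	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer func() { _ = tx.Rollback() }()

	// Postgres-style placeholder; SQLite would use "?" instead.
	stmt, err := tx.PrepareContext(
		ctx, "INSERT INTO universe_proofs (proof) VALUES ($1)",
	)
	if err != nil {
		return err
	}
	defer stmt.Close()

	for _, p := range proofs {
		if _, err := stmt.ExecContext(ctx, p); err != nil {
			return err
		}
	}

	return tx.Commit()
}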
Adds support for optional itests, starting with a stress test of the minter with a batch of 10,000 assets, each with 4kB of metadata.
Fixes #361.