
Refactor relationships between tasks to not rely on reading nonce by nonce from the DB #3269

Closed · Tracked by #3414

tkporter opened this issue Feb 15, 2024 · 2 comments
tkporter commented Feb 15, 2024

At the moment, in a few places we use the DB as shared memory to communicate between tasks. Sometimes this makes sense, like when building the in-memory merkle tree, which requires starting from leaf 0, then leaf 1, leaf 2, etc. But it makes less sense for the MessageProcessor, which doesn't necessarily need to start considering messages from the beginning of time. We index messages in a forward-backward manner so that we learn about new messages immediately upon starting up, but with the current MessageProcessor logic the relayer isn't able to deliver any of them until the backward indexer finishes.

Concretely we should:

  • See if there are any other spots where we use this "start from 0, loop through the DB looking for the next nonce" logic where it doesn't make sense to
  • Change the MessageProcessor, and any other tasks where it makes sense, to not rely on nonces starting from 0

Ideas for how we can change things:

  1. Instead of reading from the DB by nonce starting from 0, for all indexed messages we also add an "insertion index" starting from 0. The very first message ever indexed (regardless of the nonce) is 0, the second 1, etc.
    a. This requires a change to the local DB schema, and likely some migration logic so that we don't need to index all messages from scratch :(
  2. Forward-backward style reading through the DB - this way we prefer newer messages
  3. Maybe there's some way to iterate through RocksDB data in a way that sorts by insertion date?
    a. There's probably a cursed way of doing something like this. RocksDB has sequence numbers that it assigns to inserted data. It's possible to use DB::get_updates_since which gives a DBWALIterator that lets you iterate through all batches of write operations since a provided sequence number.
    • Cursed because we'll be reading through every single write operation, and I have a feeling we're not really meant to be iterating through a WAL lol
    • We'll probably need a different way to iterate through the DB upon startup. Iterating through RocksDB given a prefix is a seemingly well handled use case so we can do this easily
  4. Following #3267 (Hook indexing learns about new messages as they are indexed), we'll have broadcast channels that we'll send each newly indexed message through. We could use this to have our message processing tasks learn about new messages that have just been indexed. Upon startup, for messages that aren't newly indexed, we can iterate through RocksDB by prefix to find all the already-indexed messages
daniel-savu (Contributor) commented:
Update: the iteration strategy of the message processor has been changed to forward-backward in #3775.

I believe the only change still required is to switch the iteration method for the IGP as well, which will be possible once #3267 is rolled out.

@daniel-savu daniel-savu moved this from Next Sprint to Sprint in Hyperlane Tasks May 27, 2024

tkporter commented Jul 1, 2024

I'm gonna close this in favor of #3885
