
Refactor relationships between tasks to not rely on reading nonce by nonce from the DB #3269

Closed · Tracked by #3414

tkporter opened this issue Feb 15, 2024 · 2 comments
tkporter commented Feb 15, 2024

At the moment, in a few places we use the DB as shared memory to communicate between tasks. Sometimes this makes sense, like when building the in-memory merkle tree, which requires starting from leaf 0, then leaf 1, leaf 2, etc. But it makes less sense for the MessageProcessor, which doesn't necessarily need to start considering messages from the beginning of time. We index messages in a forward-backward manner so that we learn about new messages immediately upon starting up, but with the current MessageProcessor logic the relayer isn't able to deliver any of them until the backward indexer finishes.

Concretely we should:

  • See if there are any other spots where we use this "start from 0, loop through the DB looking for the next nonce" logic where it doesn't make sense to
  • Change the MessageProcessor, and any other tasks where it makes sense, to not rely on nonces starting from 0

Ideas for how we can change things:

  1. Instead of reading from the DB by nonce starting from 0, for all indexed messages we also add an "insertion index" starting from 0. The very first message ever indexed (regardless of the nonce) is 0, the second 1, etc.
    a. This requires a change to the local DB schema, and likely some migration logic so that we don't need to index all messages from scratch :(
  2. Forward-backward style reading through the DB - this way we prefer newer messages
  3. Maybe there's some way to iterate through RocksDB data in a way that sorts by insertion date?
    a. There's probably a cursed way of doing something like this. RocksDB has sequence numbers that it assigns to inserted data. It's possible to use DB::get_updates_since which gives a DBWALIterator that lets you iterate through all batches of write operations since a provided sequence number.
    • Cursed because we'll be reading through every single write operation, and I have a feeling we're not really meant to be iterating through a WAL lol
    • We'll probably need a different way to iterate through the DB upon startup. Iterating through RocksDB given a prefix is a seemingly well handled use case so we can do this easily
  4. Following #3267 (Hook indexing learns about new messages as they are indexed), we'll have broadcast channels that we'll send each newly indexed message through. We could use this to have our message processing tasks learn about new messages that have just been indexed. Upon startup, for messages that aren't newly indexed, we can iterate through RocksDB by prefix to find all the already-indexed messages
daniel-savu (Contributor) commented:
Update: the iteration strategy of the message processor has been changed to forward-backward in #3775.

I believe the only change still required is to switch the iteration method for the IGP as well, which will be possible once #3267 is rolled out.

@daniel-savu daniel-savu moved this from Next Sprint to Sprint in Hyperlane Tasks May 27, 2024

tkporter commented Jul 1, 2024

I'm gonna close this in favor of #3885
