Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exp/lighthorizon/services: Fetch ledgers in parallel to them being processed. #4500

Closed

Conversation

Shaptic
Copy link
Contributor

@Shaptic Shaptic commented Aug 2, 2022

What

This modifies the ledger download+processing code to download subsets of the entire checkpoint range in parallel. Preliminary testing suggests that this can massively reduce request latency.

Before:

"GET http://localhost:8080/accounts/GBBJPGXTNNYBQGAIGONBFB6P7SUNO75GJEQ4Y7FVGMB2GPS3DJQVXSEP/transactions?cursor=179115963198013441&limit=10 HTTP/1.1" from 127.0.0.1:38834 - 200 199360B in 55.8414775s

After:

"GET http://localhost:8080/accounts/GBBJPGXTNNYBQGAIGONBFB6P7SUNO75GJEQ4Y7FVGMB2GPS3DJQVXSEP/transactions?cursor=179115963198013441&limit=10 HTTP/1.1" from 127.0.0.1:54188 - 200 199360B in 16.043121887s

(Note that the indices are local and no on-disk cache was used.)

Why

We can reduce latency by doing ledger downloads in parallel to processing. See #4468.

Known limitations

Obviously, neither of these represent acceptable levels of latency if we're talking about parity with classic Horizon. However, it's still a massive improvement over the previous code if we decide that high latency is acceptable for deeply historical requests.

@Shaptic Shaptic self-assigned this Aug 2, 2022
@Shaptic Shaptic changed the title exp/lighthorizon: Lighthorizon parallel fetch exp/lighthorizon/services: Fetch ledgers in parallel to processing. Aug 2, 2022
@Shaptic Shaptic changed the title exp/lighthorizon/services: Fetch ledgers in parallel to processing. exp/lighthorizon/services: Fetch ledgers in parallel to them being processed. Aug 2, 2022
@Shaptic Shaptic force-pushed the lighthorizon_parallelFetch branch from 44b71f7 to b779bf7 Compare August 2, 2022 20:54
On("GetLedger", mock.Anything, uint32(1586112)).Return(expectedLedger1, nil).
On("GetLedger", mock.Anything, uint32(1586113)).Return(expectedLedger2, nil).
On("GetLedger", mock.Anything, uint32(1586114)).Return(expectedLedger3, nil).
On("GetLedger", mock.Anything, mock.Anything).Return(xdr.LedgerCloseMeta{}, nil)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone know how to do a "any context.Context instance" equivalent here? I tried mock.AnythingOfType but that wasn't it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was the syntax, looks like mock.AnythingOfType("context.Context") shoulda worked?

@Shaptic Shaptic requested review from sreuland, 2opremio and a team August 4, 2022 16:00
@@ -10,6 +10,7 @@ import (
type CursorManager interface {
Begin(cursor int64) (int64, error)
Advance() (int64, error)
Skip(count uint) (int64, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not suggestion change, just a question on interface ergo, do you think Advance(skip uint) would be less moving parts for same new functionality?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting idea, I think it's pretty ergo. seeing Advance(0) becomes a little unintuitive, but I'll try it out

Copy link
Contributor

@sreuland sreuland Aug 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, subtle, rather than skip semantics, it could be Advance(increment uint) so Advance(0) would be no-op ?

// returned group have completed. In contrast, this function closes the output
// channel when all work has been submitted (or the context errors).
//
// FIXME: Should this be a part of archive.Archive?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, to refactor this ledger loading optimization to be encapsulated in Archive instead, i.e. this caller method would largely be unchanged, still uses Archive.GetLedger(ctx, nextLedger), the new perf optimizations for loading get relo'd within there ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started working through refactoring this and realized that we actually probably don't need most of the Archive interface anymore after Bartek's work in #4488 which gives us a clean MetaArchive abstraction, which I think was a big point of what we were trying to accomplish with Archive.

I'll try to throw up a PR tomorrow to clean that up.

Copy link
Contributor

@sreuland sreuland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks really good, existing tests exercise this right? I think is worthwhile to follow your FIXME intuition now rather than wait, and refactor the logic into Archive.GetLedger , it may incur different unit test coverage..I think tests on this service mocked out Archive.

@Shaptic Shaptic force-pushed the lighthorizon_parallelFetch branch from b779bf7 to b733b48 Compare August 8, 2022 23:46
@Shaptic
Copy link
Contributor Author

Shaptic commented Aug 17, 2022

Abandoning in lieu of two separate PRs related to the comment about hiding this behind Archive.GetLedger().

@Shaptic Shaptic closed this Aug 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants