store-gateway: retain lazy-loaded index headers between restarts #4762

Closed
3 tasks done
dimitarvdimitrov opened this issue Apr 18, 2023 · 1 comment · Fixed by #5606

Comments

@dimitarvdimitrov
Contributor

dimitarvdimitrov commented Apr 18, 2023

Context

The store-gateway can lazily load the index header of a block the first time that block is requested by a querier. The store-gateway loses these loaded index headers upon restart, and it also unloads any index header that hasn't been queried for an idle period (3h by default). Loading one index header can take between seconds and minutes.

Related to #4763

Problem

When the store-gateway crashes, is rescheduled on a new node, or is rolled out with a new version, it loses the loaded index headers. This means that subsequent queries for the blocks of these index headers will suffer a latency increase while the headers are loaded again.

Proposal

Retain the list of lazily-loaded index headers on disk and load them non-lazily (eagerly) after a restart.

  • We can periodically persist the list of headers (every 1 min) to a file on disk, overwriting the file each time with the current list of block IDs. The list of lazily-loaded index headers is kept here.
    • When persisting the file to disk it might be a good idea to also checksum it so that we can recover from forced shutdowns.
    • It might be useful to also persist the time of the last access to the index header so that we can unload idle index headers. Otherwise, frequent restarts may mean effectively disabling lazy-loading.
  • Upon startup the store-gateway reads the file and loads the index headers that are listed in it.
    • The existing concurrency controls (blocks-storage.bucket-store.meta-sync-concurrency, its tenant and its block equivalents) should probably apply to this loading as well.
  • The store-gateway doesn't pass its readiness check until it has loaded all the previously loaded index headers.

Alternatives

Tasks

@charleskorn
Contributor

If we were to implement this, we'd probably want some kind of protection against repeatedly trying to load the list of headers and then crashing during reading.

For example, imagine a scenario where we've reduced the amount of memory requested for store-gateways, but a store-gateway has a list of headers on disk that can't be loaded in the reduced amount of memory. We'd end up in a situation where we keep trying to load the full list, OOMing, then restarting and trying again.

One solution might be to write a marker to disk before trying to load all the headers in the list, and remove it after successfully loading the list. If the store-gateway sees that the last attempt to load the list failed (because the marker is still there), then it discards the list and starts with no pre-loaded headers.
