Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: teach makeSimpleImportSpans to create restore spans at a target input size #83139

Closed
msbutler opened this issue Jun 21, 2022 · 2 comments
Assignees
Labels
A-disaster-recovery C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-disaster-recovery

Comments

@msbutler
Copy link
Collaborator

msbutler commented Jun 21, 2022

Currently, makeSimpleImportSpans creates a slice of in-order key spans ([]execinfrapb.RestoreSpanEntry), each of which will get a dedicated range for ingestion. Once filled by RESTORE, these ranges tend to be too small, causing the cluster to issue a series of expensive range merges after the RESTORE. Instead, makeSimpleImportSpans should create a span with a hardcoded target size, like say 512 mb, a default range's worth of data.

To minimize post RESTORE merges (and even splits), RESTORE would ideally ingest a single range's worth of data into each range it creates. Without refactoring restore's distsql processors, this implies makeSimpleImportSpans should create each RestoreSpanEntry such that the amount of data ingested in its corresponding range is bounded by 512 mb, the default range size. Unfortunately, makeSimpleImportSpans can only make a reasonable estimate on a span's input data size based on the size of backup files, which may contain multiple MVCC values for a given key, and not a span's ingested data size, which contains only one MVCC value for a given key. Hence, the target size of each requiredSpan is an estimate of its input data size (aka the amount work), which is strictly less than or equal to the span's ingested size. Note that RESTORE would much rather slightly under-fill a range (merge is only triggered if range is <512/2) than slightly overfill a range (likely to trigger a split); hence, creating an upper bound on a restoreSpanEntry based on input size seems reasonable.

To implement, consider how makeSimpleImportSpans currently partitions the restoring key span. makeSimpleImportSpans does one iteration through the inputed slice of spans that we want to the restore (requiredSpans) and greedily merges these spans into restoreSpanEntrys such that the input spans in a restoreSpanEntry are covered by the same set of backup files.

Instead of obeying the "common backup files" invariant, makeSimpleImportSpan should merge enough input spans together until the estimated input size of the restoreSpanEntry reaches the target input size. To track the input size on a given span, makeSimpleImportSpans should keep a running sum of the logical file size of all backup files associated with that span. Note that a backup file's size may get counted several times across requiredSpans, but this probably wouldn't bias the size estimates too significantly because backup files tend to be quite small. Further, over estimating the amount of data in a span entry would only lead to an under-filled range, which isn't that big of a deal, as mentioned above.

Once this is implemented, consider increasing the target size to be C*512mb, where C could be 2 - 4, and allow the SSTBatcher to split when a range fills (#83144).

Jira issue: CRDB-16886

Epic CRDB-10127

@msbutler msbutler added C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-disaster-recovery labels Jun 21, 2022
@msbutler msbutler self-assigned this Jun 21, 2022
@blathers-crl
Copy link

blathers-crl bot commented Jun 21, 2022

cc @cockroachdb/bulk-io

@msbutler
Copy link
Collaborator Author

Fixed by #86496, though by default, the target span size is not used.

Further, David's implementation is slightly different from the implementation outlined in this issue, see comment here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-disaster-recovery C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-disaster-recovery
Projects
No open projects
Archived in project
Development

No branches or pull requests

1 participant