storageccl: pull in fix for reading GCS files #24880
Oops, hold on. Don't review yet.
The linked issue lists a commit attempting to fix occasional problems we see during large GCS reads. They haven't been able to repro it but think this will fix it. Pull in that update so we can run it for a few months for testing.

See: googleapis/google-cloud-go#784

Release note: None
OK, ready. PTAL.
provided you're not planning to backport this. (If you are, we should cherry-pick the bugfix directly on top of the version of google-cloud-go that we were previously using.) But please also open an issue with milestone 2.1 to track updating google-cloud-go to a proper release!

Reviewed 3 of 3 files at r1.
No plans to backport this until the goog folks tag it and we've tested it enough in our nightlies to think it works. I want it in here so we can get a few months of testing to tell them if the bug went away or not.
bors r+
24880: storageccl: pull in fix for reading GCS files r=mjibson a=mjibson

The linked issue lists a commit attempting to fix occasional problems we see during large GCS reads. They haven't been able to repro it but think this will fix it. Pull in that update so we can run it for a few months for testing.

See: googleapis/google-cloud-go#784

Release note: None

24896: engine: find split keys in the first range of a partition r=tschottdorf,a-robinson,bdarnell,danhhz a=benesch

This deserves a roachtest (roachmart should make sure it has the correct number of ranges after inserting data), but I wanted to get this out for review ASAP.

---

MVCCFindSplitKey would previously fail to find any split keys in the first range of a partition. As a result, partitioned tables have been observed with multi-gigabyte ranges. This commit fixes the bug.

Specifically, MVCCFindSplitKey was assuming that the start key of a range within a table was also the row prefix for the first row of data in the range. This does not hold true for the first range of a table or a partition of a table--that range begins at, for example, /Table/51, while the row begins at /Table/51/1/aardvark. The old code had a special case for the first range in a table, but not for the first range in a partition. (It predates partitioning.)

Remove the need for special casing by actually looking in RocksDB to determine the row prefix for the first row of data rather than attempting to derive it from the range start key. This properly handles partitioning and is robust against future changes to range split boundaries.

See the commit within for more details on the approach.

Release note (bug fix): Ranges in partitioned tables now properly split to respect their configured maximum size.

Co-authored-by: Matt Jibson <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
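To illustrate the idea in the 24896 description, here is a minimal Go sketch of finding the first row's prefix by seeking in the stored keys instead of deriving it from the range start key. This is not the actual MVCCFindSplitKey code: the function names `rowPrefix` and `firstRowPrefix`, the sorted string slice standing in for RocksDB, and the assumed key layout `/Table/<id>/<index>/<primary key>/<suffix>` are all hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rowPrefix returns the first four path components of a key, e.g.
// "/Table/51/1/aardvark" for "/Table/51/1/aardvark/0".
// ASSUMPTION: keys look like /Table/<id>/<index>/<pk>/<suffix>; the real
// key encoding is binary and differs from this toy layout.
func rowPrefix(key string) string {
	parts := strings.Split(key, "/") // leading "/" yields an empty first element
	if len(parts) < 5 {
		return key
	}
	return strings.Join(parts[:5], "/")
}

// firstRowPrefix seeks to the first stored key in [rangeStart, rangeEnd)
// and returns that key's row prefix. sortedKeys stands in for the storage
// engine; a binary search plays the role of an iterator seek.
func firstRowPrefix(sortedKeys []string, rangeStart, rangeEnd string) (string, bool) {
	i := sort.SearchStrings(sortedKeys, rangeStart)
	if i == len(sortedKeys) || sortedKeys[i] >= rangeEnd {
		return "", false // no data in the range
	}
	return rowPrefix(sortedKeys[i]), true
}

func main() {
	keys := []string{
		"/Table/51/1/aardvark/0",
		"/Table/51/1/aardvark/1",
		"/Table/51/1/badger/0",
		"/Table/51/1/cat/0",
	}

	// First range of the table: the start key /Table/51 is not a row prefix.
	fmt.Println(firstRowPrefix(keys, "/Table/51", "/Table/52"))

	// First range of a hypothetical partition starting at /Table/51/1/b.
	fmt.Println(firstRowPrefix(keys, "/Table/51/1/b", "/Table/51/1/c"))
}
```

Because the prefix comes from the data itself, the same lookup covers the first range of a table and the first range of a partition without a special case for either.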
Build succeeded