storageccl: pull in fix for reading GCS files #24880

Merged
merged 1 commit into cockroachdb:master from go-cloud-reader
Apr 18, 2018

Conversation

maddyblue
Contributor

The linked issue lists a commit attempting to fix occasional problems
we see during large GCS reads. They haven't been able to repro it
but think this will fix it. Pull in that update so we can run it for
a few months for testing.

See: googleapis/google-cloud-go#784

Release note: None

@maddyblue maddyblue requested review from dt, bobvawter and a team April 17, 2018 19:09
@cockroach-teamcity
Member

This change is Reviewable

@maddyblue
Contributor Author

Oops hold on. Don't review yet.

@maddyblue
Contributor Author

Ok ready. PTAL.

@benesch
Contributor

benesch commented Apr 18, 2018

LGTM, provided you're not planning to backport this. (If you are, we should cherry-pick the bugfix directly on top of the version of google-cloud-go that we were previously using.)

But please also open an issue with milestone 2.1 to track updating google-cloud-go to a proper release!


Reviewed 3 of 3 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.



@maddyblue
Contributor Author

No plans to backport this until the goog folks tag it and we've tested it enough in our nightlies to think it works. I want it in here so we can get a few months of testing to tell them if the bug went away or not.

@maddyblue
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Apr 18, 2018
24880: storageccl: pull in fix for reading GCS files r=mjibson a=mjibson

The linked issue lists a commit attempting to fix occasional problems
we see during large GCS reads. They haven't been able to repro it
but think this will fix it. Pull in that update so we can run it for
a few months for testing.

See: googleapis/google-cloud-go#784

Release note: None

24896: engine: find split keys in the first range of a partition r=tschottdorf,a-robinson,bdarnell,danhhz a=benesch

This deserves a roachtest—roachmart should make sure it has the correct number of ranges after inserting data—but I wanted to get this out for review ASAP.

---

MVCCFindSplitKey would previously fail to find any split keys in the
first range of a partition. As a result, partitioned tables have been
observed with multi-gigabyte ranges. This commit fixes the bug.

Specifically, MVCCFindSplitKey was assuming that the start key of a
range within a table was also the row prefix for the first row of data
in the range. This does not hold true for the first range of a table or
a partition of a table--that range begins at, for example, /Table/51,
while the row begins at /Table/51/1/aardvark. The old code had a special
case for the first range in a table, but not for the first range in a
partition. (It predates partitioning.)

Remove the need for special casing by actually looking in RocksDB to
determine the row prefix for the first row of data rather than
attempting to derive it from the range start key. This properly handles
partitioning and is robust against future changes to range split
boundaries.
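
A toy sketch of the idea described above, not the actual MVCCFindSplitKey code: the engine is modeled as a sorted slice of string keys, and the helper names (`firstKeyInRange`, `rowPrefix`) and the "last path component is the column family" encoding are assumptions made up for illustration. It only shows why the range start key cannot stand in for the first row's prefix and how a seek recovers it.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// firstKeyInRange mimics seeking in a sorted key-value engine: it returns
// the first stored key k with start <= k < end. keys must be sorted.
// (Hypothetical helper; the real code iterates over RocksDB.)
func firstKeyInRange(keys []string, start, end string) (string, bool) {
	i := sort.SearchStrings(keys, start)
	if i == len(keys) || keys[i] >= end {
		return "", false
	}
	return keys[i], true
}

// rowPrefix drops the final path component of a toy key, which stands in
// for a column family suffix. The real key encoding is different.
func rowPrefix(key string) string {
	if i := strings.LastIndex(key, "/"); i > 0 {
		return key[:i]
	}
	return key
}

func main() {
	keys := []string{
		"/Table/51/1/aardvark/0",
		"/Table/51/1/aardvark/1",
		"/Table/51/1/badger/0",
	}
	rangeStart, rangeEnd := "/Table/51", "/Table/52"

	// Old assumption: the range start key is the first row's prefix.
	fmt.Println("assumed prefix:", rangeStart)

	// Approach from the description: look at the data to find the prefix.
	if k, ok := firstKeyInRange(keys, rangeStart, rangeEnd); ok {
		fmt.Println("actual prefix: ", rowPrefix(k))
	}
}
```

Because the prefix comes from the stored data rather than from the range boundary, the same lookup applies whether the range starts at a table boundary, a partition boundary, or mid-table.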

See the commit within for more details on the approach.

Release note (bug fix): Ranges in partitioned tables now properly split
to respect their configured maximum size.

Co-authored-by: Matt Jibson <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
@craig
Contributor

craig bot commented Apr 18, 2018

Build succeeded

@craig craig bot merged commit c975931 into cockroachdb:master Apr 18, 2018
@maddyblue maddyblue deleted the go-cloud-reader branch April 18, 2018 19:06