Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a zero-copy deserializer to gRPC Read #564

Merged
merged 9 commits into from
Jul 28, 2021

Conversation

veblush
Copy link
Contributor

@veblush veblush commented May 19, 2021

This PR add an gRPC read optimization using the zero-copy deserializer and reduces two memory copies involved. This is based on grpc/grpc-java#8102 which will be released from grpc-java 1.39.0 and tested earlier with GoogleCloudPlatform/grpc-gcp-java#77.

@google-cla google-cla bot added the cla: yes label May 19, 2021
@veblush
Copy link
Contributor Author

veblush commented May 20, 2021

Following are the result of benchmark downloading 1GiB file 10 times with gRPC over DirectPath.

Flamegraph with gcsio 2.2.1

Screen Shot 2021-05-19 at 4 58 41 PM

Flamegraph with gcsio 2.2.1 with this PR

Screen Shot 2021-05-19 at 4 58 46 PM

@veblush veblush force-pushed the zero-copy branch 2 times, most recently from ad722d1 to 87f281c Compare June 3, 2021 23:26
@veblush veblush changed the title [DO NOT MERGE] Added a zero-copy deserializer to gRPC Read Added a zero-copy deserializer to gRPC Read Jul 21, 2021
@veblush veblush marked this pull request as ready for review July 21, 2021 15:57
@veblush
Copy link
Contributor Author

veblush commented Jul 21, 2021

@mprashanthsagar Could you please review this PR? This is now ready to review.

@mprashanthsagar
Copy link
Contributor

mprashanthsagar commented Jul 21, 2021

/gcbrun

@codecov
Copy link

codecov bot commented Jul 21, 2021

Codecov Report

Merging #564 (3d80ba1) into master (4ac88ba) will decrease coverage by 0.00%.
The diff coverage is 75.49%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #564      +/-   ##
============================================
- Coverage     80.34%   80.33%   -0.01%     
- Complexity     2074     2088      +14     
============================================
  Files           140      142       +2     
  Lines          9027     9120      +93     
  Branches       1074     1083       +9     
============================================
+ Hits           7253     7327      +74     
- Misses         1333     1348      +15     
- Partials        441      445       +4     
Flag Coverage Δ
hadoop2integrationtest 60.20% <66.66%> (+0.11%) ⬆️
hadoop2unittest 68.60% <70.58%> (+0.01%) ⬆️
hadoop3integrationtest 60.12% <66.66%> (+0.06%) ⬆️
hadoop3unittest 68.64% <70.58%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...e/cloud/hadoop/gcsio/ZeroCopyReadinessChecker.java 65.21% <65.21%> (ø)
...adoop/gcsio/GoogleCloudStorageGrpcReadChannel.java 85.17% <76.47%> (-0.79%) ⬇️
.../cloud/hadoop/gcsio/ZeroCopyMessageMarshaller.java 80.00% <80.00%> (ø)
...ogle/cloud/hadoop/gcsio/PrefixMappedItemCache.java 70.12% <0.00%> (+5.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ac88ba...3d80ba1. Read the comment docs.

@veblush veblush requested review from medb and mprashanthsagar July 23, 2021 16:53
@veblush veblush requested a review from mprashanthsagar July 23, 2021 19:29
Copy link
Contributor

@mprashanthsagar mprashanthsagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add a test case to cover, not using zero-copy deserializer path ?

Rest of the changes LGTM

@veblush
Copy link
Contributor Author

veblush commented Jul 23, 2021

@mprashanthsagar Adding some unit tests for ZeroCopyMessageMarshaller would be possible. But what do you mean by not using zero-copy deserializer path?

Copy link
Contributor

@mprashanthsagar mprashanthsagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests in ReadChannel with the case useZeroCopyMarshaller being false

@veblush
Copy link
Contributor Author

veblush commented Jul 23, 2021

Tests in ReadChannel with the case useZeroCopyMarshaller being false

Added a unit test for ZeroCopyMessageMarshaller for both fast and slow case but it doesn't seem feasible to add tests to GoogleCloudStorageGrpcReadChannelTest practicing this because all tests use mock objects which don't actually do deserialization.

@medb medb changed the title Added a zero-copy deserializer to gRPC Read Add a zero-copy deserializer to gRPC Read Jul 28, 2021
@mprashanthsagar mprashanthsagar merged commit a87b89e into GoogleCloudDataproc:master Jul 28, 2021
mayanks pushed a commit to mayanks/hadoop-connectors that referenced this pull request Aug 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants