-
Notifications
You must be signed in to change notification settings - Fork 9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-11867. Add a high-performance vectored read API. #3904
HADOOP-11867. Add a high-performance vectored read API. #3904
Conversation
Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BufferedFSInputStream.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java pom.xml
This is based on #3499. The plan is to merge this as a base commit in the feature branch and then split work on the final changes ( other optimisations and features) up into new PRs, each with their own JIRA. |
More tests and checkstyle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apart from minor details, this is ready to go into the branch for testing/tuning.
...oop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractVectoredReadTest.java
Outdated
Show resolved
Hide resolved
...oop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractVectoredReadTest.java
Outdated
Show resolved
Hide resolved
...oop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractVectoredReadTest.java
Outdated
Show resolved
Hide resolved
...oop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractVectoredReadTest.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-benchmark/src/main/java/org/apache/hadoop/benchmark/package-info.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
Yetus failing because of one known test failure. https://issues.apache.org/jira/browse/YARN-10788
Also this only checkstyle error is not introduced by my patch It was already there. I think it is time to merge this now. |
+1 to merge into this branch as feature complete, now we follow up with the final integration work.
|
part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur
part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur
part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur
part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur
part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur
part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java pom.xml
part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur
Description of PR
Adding support for multiple ranged read async api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially object stores implementation.
How was this patch tested?
Added benchmarks.
Added UT's
Added new contract tests for new API spec.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?