-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid materializing list of segment files when finding a partition file during shuffle #11903
Conversation
…essure) as well as looping over the files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change, @loquisgon . This greatly simplifies the logic here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This pull request introduces 1 alert when merging 94e419e into d3914c1 - view on LGTM.com new alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find! Are there existing unit tests that cover these 2 branches?
final File[] segmentFiles = partitionDir.listFiles(); | ||
if (segmentFiles == null) { | ||
return Optional.empty(); | ||
final File segmentFile = new File(partitionDir, subTaskId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM is complaining because subTaskId
is passed in from user input. A validation check similar to the one done for supervisorTaskId
on line 375 should suppress the warning from LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new code is semantically the same, with the same branching (i.e. partition found it returns it else it returns Optional.empty() so I think we are good. The Optional.empty() return though is not being hit by any existing unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch 👍
Avoid materializing list of segment files (it can cause OOM/memory pressure) as well as looping over the files.
Description
This fixes a memory pressure/OOM bug seen in the field when the middlemanager may get an "out of memory" error because the list of segments is too large. Besides, looping through the segments files is inefficient and this commit avoids that. It does not need new unit tests, the existing unit tests cover this code already.
This PR has: