Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-26228][MLLIB] OOM issue encountered when computing Gramian matrix #23600

Closed
wants to merge 1 commit into from

Conversation

srowen
Copy link
Member

@srowen srowen commented Jan 21, 2019

What changes were proposed in this pull request?

Avoid memory problems in closure cleaning when handling large Gramians (>= 16K rows/cols) by using null as zeroValue

How was this patch tested?

Existing tests.
Note that it's hard to test the case that triggers this issue as it would require a large amount of memory and run a while. I confirmed locally that a 16K x 16K Gramian failed with tons of driver memory before, and didn't fail upfront after this change.

…s (>= 16K rows/cols) by using null as zeroValue
@srowen srowen self-assigned this Jan 21, 2019
@srowen srowen requested a review from mengxr January 21, 2019 01:33
@SparkQA
Copy link

SparkQA commented Jan 21, 2019

Test build #101457 has finished for PR 23600 at commit eef9ea0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen srowen closed this in 6dcad38 Jan 23, 2019
srowen added a commit that referenced this pull request Jan 23, 2019
Avoid memory problems in closure cleaning when handling large Gramians (>= 16K rows/cols) by using null as zeroValue

Existing tests.
Note that it's hard to test the case that triggers this issue as it would require a large amount of memory and run a while. I confirmed locally that a 16K x 16K Gramian failed with tons of driver memory before, and didn't fail upfront after this change.

Closes #23600 from srowen/SPARK-26228.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 6dcad38)
Signed-off-by: Sean Owen <[email protected]>
srowen added a commit that referenced this pull request Jan 23, 2019
Avoid memory problems in closure cleaning when handling large Gramians (>= 16K rows/cols) by using null as zeroValue

Existing tests.
Note that it's hard to test the case that triggers this issue as it would require a large amount of memory and run a while. I confirmed locally that a 16K x 16K Gramian failed with tons of driver memory before, and didn't fail upfront after this change.

Closes #23600 from srowen/SPARK-26228.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 6dcad38)
Signed-off-by: Sean Owen <[email protected]>
@srowen
Copy link
Member Author

srowen commented Jan 23, 2019

Merged to master/2.4/2.3

@srowen srowen deleted the SPARK-26228 branch February 14, 2019 21:52
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
## What changes were proposed in this pull request?

Avoid memory problems in closure cleaning when handling large Gramians (>= 16K rows/cols) by using null as zeroValue

## How was this patch tested?

Existing tests.
Note that it's hard to test the case that triggers this issue as it would require a large amount of memory and run a while. I confirmed locally that a 16K x 16K Gramian failed with tons of driver memory before, and didn't fail upfront after this change.

Closes apache#23600 from srowen/SPARK-26228.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 23, 2019
Avoid memory problems in closure cleaning when handling large Gramians (>= 16K rows/cols) by using null as zeroValue

Existing tests.
Note that it's hard to test the case that triggers this issue as it would require a large amount of memory and run a while. I confirmed locally that a 16K x 16K Gramian failed with tons of driver memory before, and didn't fail upfront after this change.

Closes apache#23600 from srowen/SPARK-26228.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 6dcad38)
Signed-off-by: Sean Owen <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Aug 1, 2019
Avoid memory problems in closure cleaning when handling large Gramians (>= 16K rows/cols) by using null as zeroValue

Existing tests.
Note that it's hard to test the case that triggers this issue as it would require a large amount of memory and run a while. I confirmed locally that a 16K x 16K Gramian failed with tons of driver memory before, and didn't fail upfront after this change.

Closes apache#23600 from srowen/SPARK-26228.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 6dcad38)
Signed-off-by: Sean Owen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants