forked from apache/spark
Commit
[SPARK-28607][CORE][SHUFFLE] Don't store partition lengths twice
The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression: the partition lengths ended up being tracked in two places.

This change modifies the API slightly to avoid the redundant tracking. The shuffle writer plugin implementation is now responsible for tracking the partition lengths and for propagating them back up to the higher-level shuffle writer as part of the commitAllPartitions API.

Tested with existing unit tests.

Closes apache#25341 from mccheah/dont-redundantly-store-part-lengths.

Authored-by: mcheah <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
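The ownership change described in the commit message can be pictured with a small sketch. This is not Spark's code: `LengthTrackingSketch`, `InMemoryMapOutputWriter`, and `writePartition` are hypothetical stand-ins, and only the idea (the writer counts bytes per partition and hands the counts back on commit) is taken from the message above.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical stand-in for illustration only; not Spark's ShuffleMapOutputWriter.
public class LengthTrackingSketch {

  /** Simplified writer that is the single owner of the per-partition byte counts. */
  static final class InMemoryMapOutputWriter {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    private final long[] partitionLengths;

    InMemoryMapOutputWriter(int numPartitions) {
      this.partitionLengths = new long[numPartitions];
    }

    /** Appends bytes for one partition and records how many were written. */
    void writePartition(int partitionId, byte[] bytes) {
      sink.write(bytes, 0, bytes.length);
      partitionLengths[partitionId] += bytes.length;
    }

    /** Returns the lengths on commit, so the caller needs no second array. */
    long[] commitAllPartitions() {
      return partitionLengths;
    }
  }

  /** Writes to partitions 0 and 2 and returns the lengths reported on commit. */
  static long[] runExample() {
    InMemoryMapOutputWriter writer = new InMemoryMapOutputWriter(3);
    writer.writePartition(0, new byte[] {1, 2, 3});
    writer.writePartition(2, new byte[] {4, 5});
    writer.writePartition(0, new byte[] {6});
    return writer.commitAllPartitions();
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(runExample())); // prints [4, 0, 2]
  }
}
```

In this sketch the caller reads the lengths only from the commit result, which is the point of the change: one array, owned by the writer.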
Showing 8 changed files with 110 additions and 89 deletions.
35 changes: 35 additions & 0 deletions
core/src/main/java/org/apache/spark/shuffle/api/MapOutputWriterCommitMessage.java
@@ -0,0 +1,35 @@
package org.apache.spark.shuffle.api;

import java.util.Optional;

import org.apache.spark.annotation.Experimental;
import org.apache.spark.storage.BlockManagerId;

@Experimental
public final class MapOutputWriterCommitMessage {

  private final long[] partitionLengths;
  private final Optional<BlockManagerId> location;

  private MapOutputWriterCommitMessage(long[] partitionLengths, Optional<BlockManagerId> location) {
    this.partitionLengths = partitionLengths;
    this.location = location;
  }

  public static MapOutputWriterCommitMessage of(long[] partitionLengths) {
    return new MapOutputWriterCommitMessage(partitionLengths, Optional.empty());
  }

  public static MapOutputWriterCommitMessage of(
      long[] partitionLengths, java.util.Optional<BlockManagerId> location) {
    return new MapOutputWriterCommitMessage(partitionLengths, location);
  }

  public long[] getPartitionLengths() {
    return partitionLengths;
  }

  public Optional<BlockManagerId> getLocation() {
    return location;
  }
}
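For readers without a Spark checkout, the following is a rough sketch of how a caller might consume such a commit message. It is an assumption-laden stand-in, not the class from this commit: `CommitMessageSketch`, `CommitMessage`, and `totalBytes` are hypothetical names, and a plain `String` replaces `BlockManagerId` so the example compiles on its own.

```java
import java.util.Optional;

// Hypothetical, simplified mirror of the class in the diff; not Spark's API.
public class CommitMessageSketch {

  /** Stand-in for the commit message; String replaces BlockManagerId. */
  static final class CommitMessage {
    private final long[] partitionLengths;
    private final Optional<String> location;

    private CommitMessage(long[] partitionLengths, Optional<String> location) {
      this.partitionLengths = partitionLengths;
      this.location = location;
    }

    /** Factory for output with no explicit location. */
    static CommitMessage of(long[] partitionLengths) {
      return new CommitMessage(partitionLengths, Optional.empty());
    }

    /** Factory for output with a caller-supplied location. */
    static CommitMessage of(long[] partitionLengths, Optional<String> location) {
      return new CommitMessage(partitionLengths, location);
    }

    long[] getPartitionLengths() {
      return partitionLengths;
    }

    Optional<String> getLocation() {
      return location;
    }
  }

  /** Sums the partition lengths carried by a commit message. */
  static long totalBytes(CommitMessage message) {
    long total = 0L;
    for (long length : message.getPartitionLengths()) {
      total += length;
    }
    return total;
  }

  public static void main(String[] args) {
    CommitMessage local = CommitMessage.of(new long[] {4L, 0L, 2L});
    CommitMessage remote = CommitMessage.of(new long[] {10L}, Optional.of("executor-1"));
    System.out.println(totalBytes(local));                // prints 6
    System.out.println(local.getLocation().isPresent());  // prints false
    System.out.println(remote.getLocation().get());       // prints executor-1
  }
}
```

As in the diff, the getter returns the array it was given rather than a copy, so the writer and the caller share one array and neither should mutate it after commit.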