[SPARK-25081][Core]Nested spill in ShuffleExternalSorter should not access released memory page #22062

Status: Closed · wants to merge 1 commit
ShuffleInMemorySorter.java

@@ -65,7 +65,7 @@ public int compare(PackedRecordPointer left, PackedRecordPointer right) {
   */
  private int usableCapacity = 0;

-  private int initialSize;
+  private final int initialSize;

  ShuffleInMemorySorter(MemoryConsumer consumer, int initialSize, boolean useRadixSort) {
    this.consumer = consumer;
@@ -94,12 +94,20 @@ public int numRecords() {
  }

  public void reset() {
    // Reset `pos` here so that `spill` triggered by the `allocateArray` call below will be a no-op.
    pos = 0;
Contributor:
For my understanding: this is enough to fix the actual issue here right?

Member (author):
We also need to set `usableCapacity` to 0. Otherwise,

    if (!inMemSorter.hasSpaceForAnotherRecord()) {

will not rethrow SparkOutOfMemoryError, and ShuffleExternalSorter will keep running and finally touch `array`.

Setting `array` to null is just for safety, so that any incorrect use will fail with an NPE.
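
For context, a hedged sketch of the checks this reply refers to. The method bodies below are assumed shapes reconstructed from the discussion and the stack traces in this thread, not quoted verbatim from the Spark source:

    // In ShuffleInMemorySorter: once `pos` and `usableCapacity` are both reset to 0,
    // a nested spill sees no buffered records and callers see no free slot, so the
    // pending SparkOutOfMemoryError is rethrown instead of being silently swallowed.
    public int numRecords() {
      return pos;
    }

    public boolean hasSpaceForAnotherRecord() {
      return pos < usableCapacity;
    }

    // In ShuffleExternalSorter: a spill triggered re-entrantly while reset() is
    // re-allocating the pointer array becomes a no-op because numRecords() == 0.
    @Override
    public long spill(long size, MemoryConsumer trigger) throws IOException {
      if (trigger != this || inMemSorter == null || inMemSorter.numRecords() == 0) {
        return 0L;
      }
      long writtenBytes = 0L;
      // ... otherwise, write the buffered records to disk, free the spilled pages,
      // and report the number of bytes released (details elided in this sketch) ...
      return writtenBytes;
    }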

Contributor (@cxzl25, Nov 18, 2020):

inMemSorter.reset() may throw an OOM from consumer.allocateArray(), at which point array is null. ShuffleExternalSorter#cleanupResources then calls inMemSorter.getMemoryUsage(), which throws an NPE. UnsafeExternalSorter#inMemSorter has the same problem.

Should I submit a PR to fix this? @zsxwing @hvanhovell

Executor log:

20/11/17 19:48:00,675 [Executor task launch worker for task 1721] ERROR UnsafeShuffleWriter: In addition to a failure during writing, we failed during cleanup.
java.lang.NullPointerException
	at org.apache.spark.shuffle.sort.ShuffleInMemorySorter.getMemoryUsage(ShuffleInMemorySorter.java:133)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.getMemoryUsage(ShuffleExternalSorter.java:269)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.updatePeakMemoryUsed(ShuffleExternalSorter.java:273)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.freeMemory(ShuffleExternalSorter.java:288)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.cleanupResources(ShuffleExternalSorter.java:304)
	at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:174)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
	at org.apache.spark.scheduler.Task.run(Task.scala:116)
	at org.apache.spark.executor.Executor$TaskRunner.org$apache$spark$executor$Executor$TaskRunner$$runInternal(Executor.scala:353)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:296)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
20/11/17 19:48:00,682 [Executor task launch worker for task 1721] ERROR Executor: Exception in task 0.0 in stage 1917.0 (TID 1721)
java.lang.OutOfMemoryError: Unable to acquire 32768 bytes of memory, got 0
	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:98)
	at org.apache.spark.shuffle.sort.ShuffleInMemorySorter.reset(ShuffleInMemorySorter.java:109)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.spill(ShuffleExternalSorter.java:256)
	at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:203)
	at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:281)
	at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:119)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:359)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:382)
	at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:246)
	at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:167)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
	at org.apache.spark.scheduler.Task.run(Task.scala:116)
	at org.apache.spark.executor.Executor$TaskRunner.org$apache$spark$executor$Executor$TaskRunner$$runInternal(Executor.scala:353)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:296)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
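
A hedged sketch of the defensive guard cxzl25 is describing. The unguarded body is inferred from ShuffleInMemorySorter.getMemoryUsage (line 133 in the trace above); the actual follow-up patch may differ:

    // Returning 0 instead of dereferencing `array` lets cleanupResources() finish
    // without an NPE when reset() failed mid-way and left `array` set to null.
    public long getMemoryUsage() {
      return array == null ? 0 : array.size() * 8;
    }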

    if (consumer != null) {
      consumer.freeArray(array);
      // As `array` has been released, we should set it to `null` to avoid accessing it before
      // `allocateArray` returns. `usableCapacity` is also set to `0` to avoid any code writing
      // data to `ShuffleInMemorySorter` when `array` is `null` (e.g., in
      // ShuffleExternalSorter.growPointerArrayIfNecessary, we may try to access
      // `ShuffleInMemorySorter` when `allocateArray` throws SparkOutOfMemoryError).
      array = null;
      usableCapacity = 0;
      array = consumer.allocateArray(initialSize);
      usableCapacity = getUsableCapacity();
    }
    pos = 0;
  }

  public void expandPointerArray(LongArray newArray) {
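
To make the failure mode concrete, here is the call sequence that produces the nested spill, reconstructed from the stack traces quoted above (a sketch of the flow, not verbatim code):

    // ShuffleExternalSorter.insertRecord()
    //   -> acquireNewPageIfNecessary()
    //     -> TaskMemoryManager.allocatePage()
    //       -> TaskMemoryManager.acquireExecutionMemory()  // memory is tight: asks consumers to spill
    //         -> ShuffleExternalSorter.spill()             // frees this sorter's pages
    //           -> ShuffleInMemorySorter.reset()
    //             -> MemoryConsumer.allocateArray()        // may itself run short of memory...
    //               -> TaskMemoryManager.acquireExecutionMemory()
    //                 -> ShuffleExternalSorter.spill()     // nested spill: before this fix it could
    //                                                      // spill records pointing into freed pages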
ShuffleExternalSorterSuite.scala (new file)

@@ -0,0 +1,111 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.shuffle.sort

import java.lang.{Long => JLong}

import org.mockito.Mockito.when
import org.scalatest.mockito.MockitoSugar

import org.apache.spark._
import org.apache.spark.executor.{ShuffleWriteMetrics, TaskMetrics}
import org.apache.spark.memory._
import org.apache.spark.unsafe.Platform

class ShuffleExternalSorterSuite extends SparkFunSuite with LocalSparkContext with MockitoSugar {

test("nested spill should be no-op") {
val conf = new SparkConf()
.setMaster("local[1]")
.setAppName("ShuffleExternalSorterSuite")
.set("spark.testing", "true")
.set("spark.testing.memory", "1600")
.set("spark.memory.fraction", "1")
sc = new SparkContext(conf)

val memoryManager = UnifiedMemoryManager(conf, 1)

var shouldAllocate = false

// Mock `TaskMemoryManager` to allocate free memory when `shouldAllocate` is true.
// This will trigger a nested spill and expose issues if we don't handle this case properly.
val taskMemoryManager = new TaskMemoryManager(memoryManager, 0) {
override def acquireExecutionMemory(required: Long, consumer: MemoryConsumer): Long = {
// ExecutionMemoryPool.acquireMemory will wait until there are 400 bytes for a task to use.
// So we leave 400 bytes for the task.
if (shouldAllocate &&
memoryManager.maxHeapMemory - memoryManager.executionMemoryUsed > 400) {
val acquireExecutionMemoryMethod =
memoryManager.getClass.getMethods.filter(_.getName == "acquireExecutionMemory").head
acquireExecutionMemoryMethod.invoke(
memoryManager,
JLong.valueOf(
memoryManager.maxHeapMemory - memoryManager.executionMemoryUsed - 400),
JLong.valueOf(1L), // taskAttemptId
MemoryMode.ON_HEAP
).asInstanceOf[java.lang.Long]
}
super.acquireExecutionMemory(required, consumer)
}
}
val taskContext = mock[TaskContext]
Contributor:
Do we need mockito here? We can also create a TaskContextImpl by hand right?

Member (author):
> We can also create a TaskContextImpl by hand right?

I can. Just to save several lines :)

Contributor:
lol

    val taskMetrics = new TaskMetrics
    when(taskContext.taskMetrics()).thenReturn(taskMetrics)
    val sorter = new ShuffleExternalSorter(
      taskMemoryManager,
      sc.env.blockManager,
      taskContext,
      100, // initialSize - requires ShuffleInMemorySorter to acquire at least 100 * 8 = 800 bytes
      1, // numPartitions
      conf,
      new ShuffleWriteMetrics)
    val inMemSorter = {
      val field = sorter.getClass.getDeclaredField("inMemSorter")
      field.setAccessible(true)
      field.get(sorter).asInstanceOf[ShuffleInMemorySorter]
    }
    // Allocate memory so that the next `insertRecord` call triggers a spill.
    val bytes = new Array[Byte](1)
    while (inMemSorter.hasSpaceForAnotherRecord) {
Contributor:
Access to hasSpaceForAnotherRecord is the only reason why we need reflection, right?

Member (author):
> Access to hasSpaceForAnotherRecord is the only reason why we need reflection, right?

Yes.

      sorter.insertRecord(bytes, Platform.BYTE_ARRAY_OFFSET, 1, 0)
    }

    // This flag will make the mocked TaskMemoryManager acquire the free memory released by spill,
    // triggering a nested spill.
    shouldAllocate = true

    // Should throw `SparkOutOfMemoryError` as there is not enough memory: `ShuffleInMemorySorter`
    // will try to acquire 800 bytes but there are only 400 bytes available.
    //
    // Before the fix, a nested spill may use a released page, which causes two tasks to access
    // the same memory page. When a task reads memory written by another task, many types of
    // failures may happen. Here are some examples we have seen:
    //
    // - JVM crash. (This is easy to reproduce in the unit test as we fill newly allocated and
    //   deallocated memory with 0xa5 and 0x5a bytes, which usually point to an invalid memory
    //   address.)
    // - java.lang.IllegalArgumentException: Comparison method violates its general contract!
    // - java.lang.NullPointerException
    //     at org.apache.spark.memory.TaskMemoryManager.getPage(TaskMemoryManager.java:384)
    // - java.lang.UnsupportedOperationException: Cannot grow BufferHolder by size -536870912
    //   because the size after growing exceeds size limitation 2147483632
    intercept[SparkOutOfMemoryError] {
      sorter.insertRecord(bytes, Platform.BYTE_ARRAY_OFFSET, 1, 0)
    }
  }
}