[SPARK-3731] [PySpark] fix memory leak in PythonRDD
The parent.getOrCompute() call in PythonRDD is executed in a separate thread, so that thread should release the memory it reserved for shuffle and unrolling in a finally block.

Author: Davies Liu <[email protected]>

Closes apache#2668 from davies/leak and squashes the following commits:

ae98be2 [Davies Liu] fix memory leak in PythonRDD
davies authored and JoshRosen committed Oct 7, 2014
1 parent 6550329 commit bc87cc4
Showing 1 changed file with 5 additions and 0 deletions.
@@ -247,6 +247,11 @@ private[spark] class PythonRDD(
           // will kill the whole executor (see org.apache.spark.executor.Executor).
           _exception = e
           worker.shutdownOutput()
+      } finally {
+        // Release memory used by this thread for shuffles
+        env.shuffleMemoryManager.releaseMemoryForThisThread()
+        // Release memory used by this thread for unrolling blocks
+        env.blockManager.memoryStore.releaseUnrollMemoryForThisThread()
       }
     }
   }
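The leak this commit fixes can be illustrated with a small standalone sketch. It assumes a simplified memory manager that tracks reservations per thread id; `ThreadMemoryTracker`, `tryToAcquire`, `totalReserved`, and `LeakDemo.runInWriterThread` are hypothetical names invented for this illustration and are not Spark's actual classes or signatures. Only the pattern (release in `finally` on the thread that made the reservation) mirrors the diff.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a per-thread memory manager; NOT Spark's actual class.
class ThreadMemoryTracker(maxBytes: Long) {
  // thread id -> bytes currently reserved by that thread
  private val reserved = mutable.HashMap.empty[Long, Long]

  /** Reserve up to numBytes for the calling thread; returns the bytes granted. */
  def tryToAcquire(numBytes: Long): Long = synchronized {
    val threadId = Thread.currentThread().getId
    val free = maxBytes - reserved.values.sum
    val granted = math.max(0L, math.min(numBytes, free))
    reserved(threadId) = reserved.getOrElse(threadId, 0L) + granted
    granted
  }

  /** Drop every reservation held by the calling thread. */
  def releaseMemoryForThisThread(): Unit = synchronized {
    reserved.remove(Thread.currentThread().getId)
  }

  /** Total bytes reserved across all threads, including threads that have exited. */
  def totalReserved: Long = synchronized { reserved.values.sum }
}

object LeakDemo {
  /** Run `body` on a separate thread; release its reservations in `finally` if asked. */
  def runInWriterThread(tracker: ThreadMemoryTracker, release: Boolean)(body: => Unit): Unit = {
    val t = new Thread(new Runnable {
      override def run(): Unit = {
        try {
          body
        } catch {
          case _: Exception => // swallowed here so the thread never rethrows
        } finally {
          // The release must happen on this thread, because reservations are keyed by thread id.
          if (release) tracker.releaseMemoryForThisThread()
        }
      }
    })
    t.start()
    t.join()
  }
}
```

In this sketch, a thread that exits without releasing leaves its bytes counted in `totalReserved` indefinitely, since no other thread can release them on its behalf. Placing the release in `finally` covers both the normal path and the path where `body` throws.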
