[SPARK-4452][Core] Shuffle data structures can starve others on the same thread for memory #7130
Conversation
Test build #36152 has finished for PR 7130 at commit
import org.apache.spark.SparkEnv
import org.apache.spark.{Logging, SparkEnv, Spillable}

import scala.reflect.ClassTag
nit: scala imports before spark imports
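For example, following the Spark import-ordering convention the reviewer is pointing at, the block above would become:

import scala.reflect.ClassTag

import org.apache.spark.{Logging, SparkEnv, Spillable}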
Test build #36155 has finished for PR 7130 at commit
val manager = new ShuffleMemoryManager(1000L)

val spill1 = new FakeSpillable()
val spill2 = new FakeSpillable()
nit: spill2 is unused
Test build #36157 has finished for PR 7130 at commit
Test build #36162 has finished for PR 7130 at commit
Test build #36171 has finished for PR 7130 at commit
Test build #36270 has finished for PR 7130 at commit
Test build #36281 has finished for PR 7130 at commit
/bump @andrewor14 @rxin, I think we should look at this before 1.5. I share @lianhuiwang's concerns that our new usages of ShuffleMemoryManager will make the longstanding starvation problem even worse. @andrewor14, I think you know a bit more of the history involved in this logic. Didn't we use to have the opposite problem, where multiple collections in the same thread (or task, after #7734) could lead to endless spilling behavior?
Currently, when there are two spillable objects in the same thread, ShuffleMemoryManager does not work well: sometimes the later spillable can only acquire about 5 MB.
For a join as in #5868, the join uses an ExternalAppendOnlyMap, while UnsafeShuffleExternalSorter's page size is 128 MB. As a result, UnsafeShuffleExternalSorter can never acquire PAGE_SIZE bytes of memory and throws IOException("Unable to acquire " + PAGE_SIZE + " bytes of memory"). A sketch of this failure mode follows.
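A hedged, self-contained Scala sketch of that failure mode. The names mirror the description above, but the tryToAcquire stand-in is simplified: it models a manager whose remaining quota for this task is only 5 MB, because another spillable in the same task already holds the rest of the task's share.

import java.io.IOException

object PageStarvationSketch {
  val PAGE_SIZE: Long = 128L * 1024 * 1024   // UnsafeShuffleExternalSorter's page size
  var remainingQuota: Long = 5L * 1024 * 1024 // what is left of this task's share

  // stand-in for ShuffleMemoryManager.tryToAcquire: grants at most what is
  // left of the task's quota, which may be far less than the request
  def tryToAcquire(numBytes: Long): Long = {
    val granted = math.min(numBytes, remainingQuota)
    remainingQuota -= granted
    granted
  }

  def release(numBytes: Long): Unit = { remainingQuota += numBytes }

  // mirrors the sorter's page-allocation check that triggers the IOException
  def acquireNewPage(): Unit = {
    val acquired = tryToAcquire(PAGE_SIZE)
    if (acquired < PAGE_SIZE) {
      release(acquired)
      throw new IOException(s"Unable to acquire $PAGE_SIZE bytes of memory")
    }
  }

  def main(args: Array[String]): Unit = acquireNewPage() // throws: only 5 MB available
}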
In the future, when #6444's binary processing sort is used on top of the unsafe shuffle sort, the problem will become even more serious.
In this PR, when an earlier spillable is finished, it is added to ShuffleMemoryManager's reserved list. When a later spillable asks for memory and ShuffleMemoryManager does not have enough, the manager spills the earlier spillable and reclaims its memory, so the later spillable can obtain more memory. A sketch of this mechanism follows.
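A minimal, hedged sketch of the reserved-list idea in Scala. The type and method names here (ReservingMemoryManager, markFinished, the Spillable trait) are illustrative only and do not match the PR's actual changes to ShuffleMemoryManager.

import scala.collection.mutable

// A spillable collection that can be forced to disk to free its memory.
trait Spillable {
  def memoryUsed: Long
  def spill(): Unit // writes in-memory data to disk so the memory can be freed
}

class ReservingMemoryManager(maxMemory: Long) {
  private var granted = 0L
  // spillables that have finished growing but still hold memory
  private val reserved = mutable.Queue.empty[Spillable]

  // A spillable registers here once it no longer needs to grow.
  def markFinished(s: Spillable): Unit = synchronized { reserved += s }

  // Grant up to numBytes, spilling reserved collections when memory is short.
  def tryToAcquire(numBytes: Long): Long = synchronized {
    while (granted + numBytes > maxMemory && reserved.nonEmpty) {
      val victim = reserved.dequeue()
      val freed = victim.memoryUsed
      victim.spill()   // force the earlier, finished collection to disk
      granted -= freed // its memory is now available to the new request
    }
    if (granted + numBytes <= maxMemory) { granted += numBytes; numBytes }
    else 0L
  }
}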
@JoshRosen @andrewor14 @sryza I think you might be interested in this. Could you take a look? Thanks.