Fix two potential OOM issues in GPU aggregate. #11908
base: branch-25.02
Conversation
The first fix takes nested literals into account when calculating the output size for pre-split. The second fix uses the correct size for the buffer size comparison when collecting the next bundle of batches in aggregate. Signed-off-by: Firestarman <[email protected]>
currentSize += bucket.map(_.sizeInBytes).sum
toAggregateBuckets += bucket
var keepGoing = true
while (batchesByBucket.nonEmpty && keepGoing) {
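For context, a minimal, self-contained sketch of the corrected pattern; `BatchLike`, `collectNextBundle`, and the byte budget are illustrative stand-ins, not the plugin's actual types:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a spillable batch; only its byte size matters here.
case class BatchLike(sizeInBytes: Long)

// Collect buckets until the accumulated *byte* size would exceed the budget.
def collectNextBundle(
    batchesByBucket: ArrayBuffer[ArrayBuffer[BatchLike]],
    targetMergeBatchSize: Long): ArrayBuffer[ArrayBuffer[BatchLike]] = {
  val toAggregateBuckets = ArrayBuffer.empty[ArrayBuffer[BatchLike]]
  var currentSize = 0L
  var keepGoing = true
  while (batchesByBucket.nonEmpty && keepGoing) {
    // The bug was comparing an element count (last.size()) against a byte
    // budget; the fix sums the actual buffer sizes in bytes instead.
    val bucketBytes = batchesByBucket.last.map(_.sizeInBytes).sum
    keepGoing = currentSize + bucketBytes <= targetMergeBatchSize
    if (keepGoing) {
      currentSize += bucketBytes
      toAggregateBuckets += batchesByBucket.remove(batchesByBucket.size - 1)
    }
  }
  toAggregateBuckets
}
```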
We may need a separate PR for this, as it is unrelated to the description in #11903.
Thx for review. @binmahone Could you help file an issue for this? Then I will follow your suggestion.
You can also consider modifying the description in #11903 :-)
Updated
case ArrayType(elemType, hasNullElem) =>
  val numElems = pickRowNum(litVal.asInstanceOf[ArrayData].numElements())
  // A GPU array literal requires only one column as the child
  estimateLitAdditionSize(hasNullElem, hasOffset(elemType), numElems)
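For reference, a hedged sketch of what such a metadata-size estimate computes, following the cuDF column layout (a validity bitmask is roughly one bit per row; offset buffers carry one extra 4-byte entry). The helper name and the absence of padding are illustrative simplifications, not the plugin's actual implementation:

```scala
// Illustrative estimate of the extra (non-data) buffers a column needs:
// a validity bitmask when nullable, and INT32 offsets for variable-width
// or list-like data. Real cuDF pads these buffers; this sketch does not.
def estimateMetaSize(nullable: Boolean, hasOffsets: Boolean, numRows: Int): Long = {
  val validityBytes = if (nullable) (numRows + 7) / 8 else 0   // ~1 bit per row
  val offsetBytes = if (hasOffsets) (numRows + 1) * 4L else 0L // INT32 offsets
  validityBytes + offsetBytes
}
```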
Is there a risk of overestimation here?
My concern is more about accuracy. For fixed-width types we were able to estimate the size almost exactly. For items with offsets we could not, and we estimated values more on the small end of things. We were being conservative. For literal values we should be able to get a value that is almost exactly the right size. We can test this for many different literal values and see how close we end up getting.

My concern is for highly nested types. We recurse for `calcLitValueSize`, but not for `estimateLitAdditionSize`. So if we have literal values that are highly nested (like an array of arrays), then the estimate is going to be wrong.
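To make the concern concrete, a small worked example with assumed values: for a literal like `Array(Array(1L, 2L), Array(3L))` repeated over many output rows, a non-recursive estimate counts the outer list's offsets but misses the inner list column's offsets entirely:

```scala
// Assumed: 1000 output rows, each repeating a 2-element array-of-arrays literal.
val numRows = 1000
val outerOffsetBytes = (numRows + 1) * 4L   // seen by a flat estimate
val innerRows = numRows * 2                 // two inner arrays per output row
val innerOffsetBytes = (innerRows + 1) * 4L // missed without recursion
println(s"flat estimate misses ~$innerOffsetBytes bytes of inner offsets")
```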
I have done the refactor to calculate the size more exactly. I followed the type definitions from https://github.com/rapidsai/cudf/blob/a0487be669326175982c8bfcdab4d61184c88e27/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md#list-columns
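From that guide's LIST layout, a list column's size can be computed bottom-up; a worked example, assuming `LIST<INT32>` with rows `[[1,2],[3],[]]`:

```scala
// LIST<INT32> with rows [[1,2],[3],[]]: 3 rows, 3 total elements.
//   offsets: (rows + 1) INT32s -> [0, 2, 3, 3] = 4 * 4 bytes = 16
//   child:   one INT32 column holding all elements = 3 * 4 bytes = 12
//   validity (only if nullable): ceil(rows / 8) bytes, padded in practice
val rows = 3
val elems = 3
val offsetBytes = (rows + 1) * 4L
val childBytes = elems * 4L
assert(offsetBytes + childBytes == 28L)
```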
build
This needs a lot of testing in different cases to be sure that it works properly. Especially for nested types.
val pickRowNum: Int => Int = rowNum => if (litVal == null) 0 else rowNum
litType match {
  case ArrayType(elemType, hasNullElem) =>
    val numElems = pickRowNum(litVal.asInstanceOf[ArrayData].numElements())
We need a null check here for a null array literal value.
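A minimal sketch of the null-safe guard being requested (the name is illustrative; the actual fix landed as part of the refactor mentioned below):

```scala
import org.apache.spark.sql.catalyst.util.ArrayData

// Only touch ArrayData when the literal value is non-null; a null array
// literal contributes zero elements.
def safeNumElements(litVal: Any): Int =
  if (litVal == null) 0 else litVal.asInstanceOf[ArrayData].numElements()
```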
Updated by the refactor.
    estimateLitAdditionSize(f.nullable, hasOffset(f.dataType), childrenNumRows)
  ).sum
case MapType(keyType, valType, hasNullValue) =>
  val mapRowsNum = pickRowNum(litVal.asInstanceOf[MapData].numElements())
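For context on the map case: cuDF represents a map as `LIST<STRUCT<key, value>>`, so a map literal's size combines list-level offsets with the key and value child columns. A hedged sketch of that accounting, with illustrative names and without cuDF's buffer padding:

```scala
// Per the LIST<STRUCT<key, value>> representation: one list row per output
// row, plus validity for the value child when values may be null. Key and
// value data sizes are computed separately by the recursive value-size pass.
def mapLiteralMetaSize(entriesPerRow: Int, numRows: Int, hasNullValue: Boolean): Long = {
  val listOffsetBytes = (numRows + 1) * 4L
  val totalEntries = entriesPerRow * numRows
  val valueValidityBytes = if (hasNullValue) (totalEntries + 7) / 8 else 0
  listOffsetBytes + valueValidityBytes
}
```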
This too needs a check for litVal being null before this runs.
Updated by the refactor.
case StringType => lit.asInstanceOf[UTF8String].numBytes()
case BinaryType => lit.asInstanceOf[Array[Byte]].length
case ArrayType(elemType, _) =>
  lit.asInstanceOf[ArrayData].array.map(calcLitValueSize(_, elemType)).sum
This appears to not work on UnsafeArrayData. If I test with the following, it fails. (I opened up the API to be public so I could test things manually.)

scala> PreProjectSplitIterator.calcMemorySizeForLiteral(ArrayData.toArrayData(Array(1L)), ArrayType(LongType), 100)
java.lang.UnsupportedOperationException: Not supported on UnsafeArrayData.
  at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.array(UnsafeArrayData.java:103)
  at com.nvidia.spark.rapids.PreProjectSplitIterator$.calcLitValueSize(basicPhysicalOperators.scala:381)
  at com.nvidia.spark.rapids.PreProjectSplitIterator$.calcMemorySizeForLiteral(basicPhysicalOperators.scala:328)
  ... 51 elided
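A hedged sketch of an implementation-agnostic traversal that avoids `ArrayData.array` (which `UnsafeArrayData` does not support) by using indexed access via `ArrayData.foreach`; `calcSize` stands in for the PR's recursive helper:

```scala
import org.apache.spark.sql.catalyst.util.ArrayData
import org.apache.spark.sql.types.DataType

// Works for GenericArrayData and UnsafeArrayData alike: foreach uses
// positional getters rather than materializing a backing array.
def sumElementSizes(arr: ArrayData, elemType: DataType,
    calcSize: (Any, DataType) => Long): Long = {
  var total = 0L
  arr.foreach(elemType, (_, elem) => total += calcSize(elem, elemType))
  total
}
```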
Nice catch, thx a lot, updated.
Thx for the review. I am doing a small refactor for the meta size calculation part... Will update it once done.
Signed-off-by: Firestarman <[email protected]>
Updated for early review; I will run some tests next.
Will work on this next.
Signed-off-by: Firestarman <[email protected]>
Signed-off-by: Firestarman <[email protected]>
Looks good to me. I still would like to see some tests (at least some simple unit tests) to validate that it is doing the right thing and that we are covering corner cases like nulls, and arrays of arrays.
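A sketch of the kind of corner-case checks being asked for, reusing the `calcMemorySizeForLiteral` entry point shown earlier in this thread (assuming it stays public as in the manual test above); exact byte counts are deliberately not asserted since padding rules may differ:

```scala
import org.apache.spark.sql.catalyst.util.ArrayData
import org.apache.spark.sql.types._

// Cover a null literal, a flat array, and an array of arrays.
val cases: Seq[(Any, DataType)] = Seq(
  (null, ArrayType(LongType)),
  (ArrayData.toArrayData(Array(1L, 2L, 3L)), ArrayType(LongType)),
  (ArrayData.toArrayData(Array(
      ArrayData.toArrayData(Array(1L)),
      ArrayData.toArrayData(Array(2L, 3L)))),
    ArrayType(ArrayType(LongType)))
)
cases.foreach { case (lit, t) =>
  val size = PreProjectSplitIterator.calcMemorySizeForLiteral(lit, t, 100)
  assert(size >= 0L, s"estimate for $t should be non-negative")
}
```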
Sure, working on the tests now...
Signed-off-by: Firestarman <[email protected]>
Added some tests, also found some bugs and fixed them.
build
Closes #11903
The first fix takes nested literals into account when calculating the output size for pre-split. See the linked issue above for more details.

The second fix uses the correct size for the buffer size comparison when collecting the next bundle of batches in aggregate. The value returned by `batchesByBucket.last.size()` is not the actual buffer size in bytes, but the number of elements in an array, so it cannot be used for the buffer size comparison.

I verified this PR locally with the toy query and it works well.