-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-2986: Fails the file writing when footer size exceeds int max value #2987
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix! Have you checked the performance penalty after adding the check?
} | ||
} | ||
} | ||
|
||
private int toIntWithCheck(long size) { | ||
private static int toIntWithCheck(long size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, it looks weird to me that the exception is about Encoding
.
if ((int) size != size) { | ||
throw new ParquetEncodingException( | ||
"Cannot write page larger than " + Integer.MAX_VALUE + " bytes: " + size); | ||
} | ||
return (int) size; | ||
} | ||
|
||
private static int footerSizeToInt(long size) { | ||
if ((int) size != size) { | ||
throw new ParquetFileWriteException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about ParquetSizeOverflowException
? Perhaps we can also consolidate toIntWithCheck
and footerSizeToInt
into a single method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to ParquetSizeOverflowException, and removed the footerSizeToInt
. The toIntWithCheck
is a common method, it would be better to move to the common module. There is already a common utils class Ints
, however, it is deprecated. Maybe we can enable it since we can provide more information than Math.toIntExact
.
@wgtmac thanks for your review.
I have not checked. Is there any performance util class in this project? I can do it. This should not have some performance regression since the checks are only for each page or file. |
/** | ||
* RuntimeException occurs when size overflow. | ||
*/ | ||
public class ParquetSizeOverflowException extends ParquetRuntimeException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put it into the common module since it is a common class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
Rationale for this change
Closes #2986
What changes are included in this PR?
Prevent force casting long to int, and throw an exception when overflowing.
Are these changes tested?
Are there any user-facing changes?
No
Closes #2986