Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HUDI-7303] Fix date field type unexpectedly convert to Long when usi… #10517

Merged
merged 1 commit into from
Jan 23, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -602,10 +602,10 @@ private static FilterPredicate toParquetPredicate(FunctionDefinition functionDef
case TINYINT:
case SMALLINT:
case INTEGER:
case DATE:
case TIME_WITHOUT_TIME_ZONE:
return predicateSupportsLtGt(functionDefinition, intColumn(columnName), (Integer) literal);
case BIGINT:
case DATE:
case TIMESTAMP_WITHOUT_TIME_ZONE:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any exception thrown here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @danny0405 ,

As far as I could see there should be no exceptions thrown here. I have gone through all TPCDS queries. They could work well with all patches applied that I sumbitted recently.

If there might be some potential risks for this change, please let me know. Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then why we need this fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @danny0405 ,

Please see: HUDI-7303.

In parquet, date type is stored as INT32 (epoch day). But if we add some conditions with date typed field in SQL where clause, its type will unexpectedly convert to Long.

return predicateSupportsLtGt(functionDefinition, longColumn(columnName), (Long) literal);
case FLOAT:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -160,7 +160,7 @@ public static Object getValueFromLiteral(ValueLiteralExpression expr) {
.orElse(null);
case DATE:
return expr.getValueAs(LocalDate.class)
.map(LocalDate::toEpochDay)
.map(date -> (int) date.toEpochDay())
.orElse(null);
// NOTE: All integral types of size less than Int are encoded as Ints in MT
case BOOLEAN:
Expand Down Expand Up @@ -212,7 +212,7 @@ public static Object getKeyFromLiteral(ValueLiteralExpression expr, boolean logi
case TIMESTAMP_WITHOUT_TIME_ZONE:
return logicalTimestamp ? new Timestamp((long) val) : val;
case DATE:
return LocalDate.ofEpochDay((long) val);
return LocalDate.ofEpochDay((int) val);
default:
return val;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ void getValueFromLiteralForNonNull() {
if (dataList.get(i) instanceof LocalTime) {
assertEquals(((LocalTime) dataList.get(i)).get(ChronoField.MILLI_OF_DAY), ExpressionUtils.getValueFromLiteral((ValueLiteralExpression) childExprs.get(1)));
} else if (dataList.get(i) instanceof LocalDate) {
assertEquals(((LocalDate) dataList.get(i)).toEpochDay(), ExpressionUtils.getValueFromLiteral((ValueLiteralExpression) childExprs.get(1)));
assertEquals((int) ((LocalDate) dataList.get(i)).toEpochDay(), ExpressionUtils.getValueFromLiteral((ValueLiteralExpression) childExprs.get(1)));
} else if (dataList.get(i) instanceof LocalDateTime) {
assertEquals(((LocalDateTime) dataList.get(i)).toInstant(ZoneOffset.UTC).toEpochMilli(), ExpressionUtils.getValueFromLiteral((ValueLiteralExpression) childExprs.get(1)));
} else {
Expand Down
Loading