Given a C*AS statement that contains a GROUP BY clause that is anything more than a simple column reference AND where a matching expression can be found in the projection THEN the key field of the resultant source stored in the metastore is incorrect.
The net result of this is we miss an opportunity to avoid a repartition step on downstream queries that require the source to be (re-)keyed on the field.
e.g.
Given the SQL statement:
CREATE TABLE OUTPUT AS SELECT foo + bar, COUNT(*) FROM INPUT GROUP BY foo + bar;
Expected result: The keyField of OUTPUT should be set to KSQL_COL_0, i.e. the name of the first column in the projection, as this matches the GROUP BY clause.
Actual result: the keyField of OUTPUT is set to null.
The same is true with aliased projection fields too:
Given the SQL statement:
CREATE TABLE OUTPUT AS SELECT CAST(ID AS STRING) AS PRODUCT_ID, COUNT(*) FROM INPUT GROUP BY CAST(ID AS STRING);
Expected result: The keyField of OUTPUT should be set to PRODUCT_ID, i.e. the name of the first column in the projection, as this matches the GROUP BY clause.
Actual result: the keyField of OUTPUT is set to null.
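To make the expected behaviour in the two examples above concrete, here is a minimal sketch in Python of the matching rule. It is not ksqlDB's implementation: `normalize` and `resolve_key_field` are hypothetical names, the comparison is purely textual, and generating `KSQL_COL_<index>` for every unaliased column is an assumption extrapolated from the `KSQL_COL_0` case described above.

```python
from typing import Optional, Sequence, Tuple


def normalize(expression: str) -> str:
    """Collapse whitespace and case so textually equivalent expressions compare equal."""
    return " ".join(expression.split()).upper()


def resolve_key_field(
    group_by: str,
    projection: Sequence[Tuple[str, Optional[str]]],
) -> Optional[str]:
    """Return the name of the first projection column matching the GROUP BY expression.

    Each projection entry is (expression, alias); alias is None when the column
    is unaliased. Returns None when no projection column matches.
    """
    target = normalize(group_by)
    for index, (expression, alias) in enumerate(projection):
        if normalize(expression) == target:
            # Assumption: unaliased columns are named KSQL_COL_<position>.
            return alias if alias is not None else f"KSQL_COL_{index}"
    return None


# The two statements from this issue, expressed as (expression, alias) pairs.
unaliased = [("foo + bar", None), ("COUNT(*)", None)]
aliased = [("CAST(ID AS STRING)", "PRODUCT_ID"), ("COUNT(*)", None)]

print(resolve_key_field("foo + bar", unaliased))
print(resolve_key_field("CAST(ID AS STRING)", aliased))
```

A real fix would presumably compare parsed expression trees rather than normalized strings, so that `foo+bar` and `foo + bar` are treated as the same expression.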
@big-andy-coates are we sure we want this behavior? I think with #3982 this is no longer desired - they are two different fields (one key field, one is a value field).