-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: drop legacy key field functionality #3764
feat: drop legacy key field functionality #3764
Conversation
BREAKING CHANGE This drops functionality that maintained a legacy version of the key field. This legacy version was often wrong and resulted in unnecessary repartition steps being performed. The functionality was previously required to ensure existing legacy queries continued to have the unnecessary repartitions to ensure no data loss. With a major version bump this can now be dropped.
@@ -281,12 +283,6 @@ Joiner getJoiner(final DataSourceType leftType, | |||
final ColumnRef joinFieldName, | |||
final Stacker contextStacker | |||
) { | |||
final LogicalSchema schema = stream.getSchema(); | |||
|
|||
schema.findValueColumn(joinFieldName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this test as its duplicates one inside selectKey
.
* @param leftKeyField the key field of the left source. | ||
* @return the key field that should be used by the resultant joined stream. | ||
*/ | ||
static KeyField getJoinedKeyField(final SourceName leftAlias, final KeyField leftKeyField) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no longer needed as the correct keyfield is built in the constructor.
* @param leftKeyField the key field of the left source. | ||
* @return the key field that should be used by the resultant joined stream. | ||
*/ | ||
static KeyField getOuterJoinedKeyField( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
getKeyField().legacy() | ||
); | ||
|
||
final SchemaKStream result = schemaKStream.withKeyField(resultKeyField); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
withKeyField
has been removed as its no longer needed. The only reason it was needed was to set the legacy (incorrect) keyField. With this PR this all goes away.
return doFindKeyColumn(selectExpressions, keyColumn).map(Column::ref); | ||
} | ||
|
||
private Optional<LegacyField> findLegacyKeyField( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has been removed and everything else inlined into findKeyField
@@ -119,7 +119,6 @@ | |||
private static final int MAP_INDEX1 = 11; | |||
private static final int MAP_INDEX2 = 12; | |||
private static final int STRUCT_INDEX = 15; | |||
private static final int DECIMAL_INDEX = 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed as unused
@@ -433,150 +433,6 @@ public void shouldHandleRightTableJoiningOnRowKey() { | |||
// Then: did not throw. | |||
} | |||
|
|||
@Test | |||
public void shouldRepartitionLeftStreamIfNotCorrectKey_Legacy() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all testing the legacy code path - so removed.
@@ -142,46 +127,29 @@ | |||
Collections.emptyList() | |||
); | |||
|
|||
private final QueryId queryId = new QueryId("source-test"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these were all unused, so removed them
@@ -412,25 +378,6 @@ public void shouldBuildTableWithCorrectContext() { | |||
equalTo(ImmutableList.of("0", "reduce"))); | |||
} | |||
|
|||
|
|||
@SuppressWarnings("unchecked") | |||
private DataSourceNode nodeWithMockTableSource() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused
@@ -132,4 +136,32 @@ private static ConnectSchema makeTopLevelStructNoneOptional(final Schema schema) | |||
schema.fields().forEach(field -> builder.field(field.name(), field.schema())); | |||
return (ConnectSchema) builder.build(); | |||
} | |||
|
|||
public static class Deserializer extends JsonDeserializer<SourceNode> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the job of this deserializer is basically to distinguish between a source with no key field criteria specified, meaning the key field can be anything vs one where the key field is explicitly set to null
, meaning the source should have no key field set.
I'm not aware of a better way to do this with Jackson...
* It is deliberately excluded from the test suite | ||
*/ | ||
@Ignore | ||
public final class ManualTopologyFileGeneratorTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rolled this into TopologyFileGenerator
...
@@ -51,20 +50,11 @@ Topologies: | |||
--> KSTREAM-MAPVALUES-0000000002 | |||
<-- KSTREAM-SOURCE-0000000000 | |||
Processor: KSTREAM-MAPVALUES-0000000002 (stores: []) | |||
--> KSTREAM-FILTER-0000000003 | |||
--> KSTREAM-MAPVALUES-0000000003 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this topology has changed because the test was explicitly setting ksql.query.fields.key.legacy
to true
. With this setting now gone the unnecessary repartition is avoided.
"valueSchema": "STRUCT<FOO INT, KSQL_COL_1 BIGINT>" | ||
} | ||
] | ||
} | ||
}, | ||
{ | ||
"name": "stream | initially null | group by (-) | key in value | no aliasing | legacy downstream query", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed these legacy downstream queries as the property they use no longer exists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left one item inline.
ksql-engine/src/main/java/io/confluent/ksql/ddl/commands/DdlCommandExec.java
Show resolved
Hide resolved
Follows on from confluentinc#3764 See comment: confluentinc#3764 (comment)
With legacy key fields now gone, (confluentinc#3764), we can now drop the final use of `Column.matches`.
With legacy key fields now gone, (#3764), we can now drop the final use of `Column.matches`.
Follows on from #3764 See comment: #3764 (comment)
* feat: drop legacy key field functionality BREAKING CHANGE This drops functionality that maintained a legacy version of the key field. This legacy version was often wrong and resulted in unnecessary repartition steps being performed. The functionality was previously required to ensure existing legacy queries continued to have the unnecessary repartitions to ensure no data loss. With a major version bump this can now be dropped.
Description
BREAKING CHANGE
Fixes ##3350
This drops functionality that maintained a legacy version of the key field.
This legacy version was often wrong and resulted in unnecessary repartition steps being performed.
The functionality was previously required to ensure existing legacy queries continued to have the unnecessary repartitions to ensure no data loss.
With a major version bump this can now be dropped.
Reviewing notes:
KeyField
as this is the crux of the change. The legacy field is no more! Most of the rest of the code changes is removing the legacy field.JoinNode
has received a little bit of love to tidy things up. The correctkeyField
is now constructed in the constructor and used for all joins later on.KsqlStructuredDataOutputNode
has been simplified with the removal ofSchemaKStream.withKeyField
.Testing done
Usual.
Reviewer checklist