Explicit keys #5533
implements: [KLIP-29](confluentinc#5530)
fixes: confluentinc#5303
fixes: confluentinc#4678

With this change, ksqlDB no longer adds an implicit `ROWKEY STRING` key column to created streams, or an implicit primary key column to created tables, when no key column is explicitly provided in the `CREATE` statement.

BREAKING CHANGE

`CREATE TABLE` statements will now fail if no `PRIMARY KEY` column is provided. For example, a statement such as:

```sql
CREATE TABLE FOO (name STRING) WITH (kafka_topic='foo', value_format='json');
```

will need to be updated to include the definition of the `PRIMARY KEY`, e.g.

```sql
CREATE TABLE FOO (ID STRING PRIMARY KEY, name STRING) WITH (kafka_topic='foo', value_format='json');
```

If using schema inference, i.e. loading the value columns of the topic from the Schema Registry, the primary key can be provided as a partial schema, e.g.

```sql
-- FOO will have value columns loaded from the Schema Registry
CREATE TABLE FOO (ID INT PRIMARY KEY) WITH (kafka_topic='foo', value_format='avro');
```

`CREATE STREAM` statements that do not define a `KEY` column will no longer have an implicit `ROWKEY` key column. For example:

```sql
CREATE STREAM BAR (NAME STRING) WITH (...);
```

The above statement would previously have resulted in a stream with two columns: `ROWKEY STRING KEY` and `NAME STRING`. With this change, the above statement will result in a stream with only the `NAME STRING` column.

Streams with no `KEY` column will be serialized to Kafka topics with a `null` key.
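For a stream that relied on the implicit key column, declaring that column explicitly should keep the previous two-column schema. A minimal sketch, assuming the same `BAR` stream as above; the topic name `bar` and the `json` value format are illustrative placeholders, not taken from this PR:

```sql
-- Sketch only: 'bar' and 'json' are assumed values for illustration.
-- Declaring the key column explicitly yields the same columns the
-- implicit behaviour used to produce: ROWKEY STRING KEY and NAME STRING.
CREATE STREAM BAR (ROWKEY STRING KEY, NAME STRING)
  WITH (kafka_topic='bar', value_format='json');
```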
Overall LGTM, I have some minor testing comments - though I'm a little nervous putting this in last minute to a release without extensive testing, I have confidence in our QTTs 😅
ksqldb-functional-tests/src/test/resources/query-validation-tests/elements.json
@@ -1966,6 +1966,26 @@
"outputs": [
{"topic": "OUTPUT", "key": "user_0", "value": {"IMPRESSION_ID": 24, "URL": "urlA"}, "timestamp": 12}
]
},
{
"name": "streams with no key columns",
Don't want to be too picky, but let's also add a test for a stream with no key joined to a table (stream no-key --> table join).
done
* feat: explicit keys

Co-authored-by: Andy Coates <[email protected]>
Testing done
usual
Reviewer checklist