Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Null records in windowed aggregation #2978

Closed
decoodtw opened this issue Jun 14, 2019 · 1 comment
Closed

Null records in windowed aggregation #2978

decoodtw opened this issue Jun 14, 2019 · 1 comment

Comments

@decoodtw
Copy link

decoodtw commented Jun 14, 2019

Hi everyone,
The following is stated in the KSQL docs:
In a windowed aggregation, the first seen message is written into the table for a particular key as a null. Downstream applications reading the data will see nulls, and if an application can't handle null values, it may need a separate stream that filters these null records with a WHERE clause.

I want to filter out the null values for the following FMS_TABLE table:
CREATE TABLE FMS_TABLE AS SELECT user, SUM(transaction) as balance, WINDOWSTART() as sessionStart FROM FMS_STREAM WINDOW SESSION (60 SECONDS) GROUP BY user;
Via the following table:
CREATE TABLE FMS_BALANCE AS SELECT * FROM FMS_TABLE where user IS NOT NULL;
But I'm still seeing the 'null' records generated in FMS_BALANCE table.
Thanks,
Wim

@big-andy-coates
Copy link
Contributor

Hi @decoodtw,

So the reason you're seeing null values in the first place is because of this issue: #3558, plus vote on that issue if you want to see it fixed.

The reason your second statement does not filter out the nulls is because its a 'table' and tables pass through tombstones, (null values).

The work around you need is to re-import the FMS_TABLE as a stream and then filter that. Streams do not have special handling for null values.

e.g.

CREATE STREAM FMS_TABLE_S ( schema here) WITH (kafka_topic='FMS_TABLE', ...);
CREATE STREAM FMS_FILTERED AS SELECT * FROM FMS_TABLE_S WHERE USER ID NOT NULL;

I appreciate this is a pain and you shouldn't need to do this - so stick a thumbs up vote on #3558

In the interest of keeping our issue backlog tidy I'm going to close this issue. Please feel free to reopen with more details if the answer does not fully answer your question of if you have a follow up question. Thanks for using KSQL!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants