-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(batch-exports): Add created_at
to Redshift persons batch export
#27403
feat(batch-exports): Add created_at
to Redshift persons batch export
#27403
Conversation
|
||
async with self.connection.transaction(): | ||
async with self.connection.cursor() as cursor: | ||
await cursor.execute(sql.SQL("SELECT * FROM {} WHERE 1=0").format(table_identifier)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very funny way to get schema information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was what my editor suggested 😄
I suppose it is efficient since it won't actually return any table data, just column metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit obfuscated, maybe not so clear at a glace what you are doing, but I believe it does work. And it saves you from needing to query a separate table (something in information_schema
) for which we may need extra permissions.
# Redshift doesn't support adding a condition on the merge, so we have | ||
# to first delete any rows in stage that match those in final, where | ||
# stage also has a higher version. Otherwise we risk merging adding old | ||
# versions back. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was slightly confused why we were doing this and then saw this comment in the PR where you introduced it, so thought it was worth adding
Problem
We're not currently sending the
created_at
field in Redshift persons batch exportsChanges
created_at
field👉 Stay up-to-date with PostHog coding conventions for a smoother review.
Does this work well for both Cloud and self-hosted?
Yes
How did you test this code?
Added new tests