Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(batch-exports): Add created_at to Redshift persons batch export #27403

Merged
merged 1 commit into from
Jan 10, 2025

Conversation

rossgray
Copy link
Contributor

@rossgray rossgray commented Jan 9, 2025

Problem

We're not currently sending the created_at field in Redshift persons batch exports

Changes

  • Add the created_at field
  • Update README to include info about Tailscale

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

Yes

How did you test this code?

Added new tests

@rossgray rossgray requested a review from tomasfarias January 9, 2025 15:27

async with self.connection.transaction():
async with self.connection.cursor() as cursor:
await cursor.execute(sql.SQL("SELECT * FROM {} WHERE 1=0").format(table_identifier))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very funny way to get schema information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was what my editor suggested 😄
I suppose it is efficient since it won't actually return any table data, just column metadata

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit obfuscated, maybe not so clear at a glace what you are doing, but I believe it does work. And it saves you from needing to query a separate table (something in information_schema) for which we may need extra permissions.

Comment on lines +123 to +126
# Redshift doesn't support adding a condition on the merge, so we have
# to first delete any rows in stage that match those in final, where
# stage also has a higher version. Otherwise we risk merging adding old
# versions back.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was slightly confused why we were doing this and then saw this comment in the PR where you introduced it, so thought it was worth adding

@rossgray rossgray merged commit 4c3cb69 into master Jan 10, 2025
93 checks passed
@rossgray rossgray deleted the add-created-at-to-redshift-persons-batch-export branch January 10, 2025 11:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants