-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(persons-on-events): add required person and group columns to events table #9251
Conversation
…into clickhouse [nuke protobuf pt. 1]
posthog/async_migrations/test/__snapshots__/test_0004_replicated_schema.ambr
Outdated
Show resolved
Hide resolved
Approach looks great - had some code-level issues! |
Holding off on approve due to test failures (probably due to conflicting column names) but this lgtm :) |
oof yeah the new column name conflicts on joins |
I haven't worked on the queries side so help me out here: The lazy thing to do is name this something else like I'm inclined to just be lazy here, but wondering if that'll lead to some level of confusion for people writing queries given the mismatch in column names + if I'd be breaking best practices for highlighting relations in a non-relational db |
There's so much formatting, dynamic generation, etc of SQL that it would be hard for me to track down all instances of the join as someone who hasn't touched these. This PR would also suddenly get a bit more dangerous as it would touch multiple areas. Definitely won't make decisions because "laziness" but just wondering if it would really harm anyone to have that column be called e.g. |
I think the other stakeholder here is @EDsCODE rather than me right now. Bring this up in the sync with him! |
I believe everything on the query end is ok! @tiina303 @yakkomajuri |
Thanks a lot - changes lgtm @EDsCODE. Care to stamp this if the rest looks ok? |
@tiina303 maybe give this a quick look as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
I'm not that worried about the querying side - if we mess up there some query won't work and we can hot fix it (on that note we should keep an eye on support requests, failures in sentry), if we mess up something on the table/migration side that could be trickier to resolve. |
Actually this migration also needs to update the kafka and MV tables |
Problem
#9178
Changes
I was thinking about this for a bit and think this might be cleaner than updating the sharding migration, while also unlocking this schema change earlier in the process.
The idea here is to alter
events
to add the columns and if the setup is sharded alter those tables too.Consider the scenarios:
User is on old schema, CLICKHOUSE_REPLICATION is off (for whatever reason)
We'll alter
events
, they'll get the new columns, and whenever they run the sharding migration the new tables will have the columns too.User is on old schema, CLICKHOUSE_REPLICATION is on (but 0004_replicated_schema hasn't run)
We'll alter
events
, see thatwritable_events
doesn't exist, and call it a day. When they run the migration the tables will have all the columns.User is on new schema
We alter the sharded tables to get the columns.
Fresh install
They have the new columns from 0001, so 0025 will complete successfully because of
IF NOT EXISTS
when adding the columns.How did you test this code?
Manually ran the migrations