Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(persons-on-events): add required person and group columns to events table #9251

Merged
merged 31 commits into from
Apr 13, 2022

Conversation

yakkomajuri
Copy link
Contributor

Problem

#9178

Changes

I was thinking about this for a bit and think this might be cleaner than updating the sharding migration, while also unlocking this schema change earlier in the process.

The idea here is to alter events to add the columns and if the setup is sharded alter those tables too.

Consider the scenarios:

User is on old schema, CLICKHOUSE_REPLICATION is off (for whatever reason)

We'll alter events, they'll get the new columns, and whenever they run the sharding migration the new tables will have the columns too.

User is on old schema, CLICKHOUSE_REPLICATION is on (but 0004_replicated_schema hasn't run)

We'll alter events, see that writable_events doesn't exist, and call it a day. When they run the migration the tables will have all the columns.

User is on new schema

We alter the sharded tables to get the columns.

Fresh install

They have the new columns from 0001, so 0025 will complete successfully because of IF NOT EXISTS when adding the columns.

How did you test this code?

Manually ran the migrations

@yakkomajuri yakkomajuri changed the title Events table new schema feat(persons-on-events): add required person and group columns to events table Mar 25, 2022
@yakkomajuri yakkomajuri requested a review from macobo March 25, 2022 10:02
@macobo
Copy link
Contributor

macobo commented Mar 25, 2022

Approach looks great - had some code-level issues!

@macobo
Copy link
Contributor

macobo commented Mar 25, 2022

Holding off on approve due to test failures (probably due to conflicting column names) but this lgtm :)

@yakkomajuri
Copy link
Contributor Author

oof yeah the new column name conflicts on joins

@yakkomajuri
Copy link
Contributor Author

I haven't worked on the queries side so help me out here:

The lazy thing to do is name this something else like personid and move on. Else I can go in and change all queries joining the pdi table to SELECT pdi.person_id as person_id for now.

I'm inclined to just be lazy here, but wondering if that'll lead to some level of confusion for people writing queries given the mismatch in column names + if I'd be breaking best practices for highlighting relations in a non-relational db

@macobo

@yakkomajuri
Copy link
Contributor Author

There's so much formatting, dynamic generation, etc of SQL that it would be hard for me to track down all instances of the join as someone who hasn't touched these. This PR would also suddenly get a bit more dangerous as it would touch multiple areas.

Definitely won't make decisions because "laziness" but just wondering if it would really harm anyone to have that column be called e.g. personid :D

@macobo
Copy link
Contributor

macobo commented Mar 25, 2022

The lazy thing to do is name this something else like personid and move on. Else I can go in and change all queries joining the pdi table to SELECT pdi.person_id as person_id for now.

I think the other stakeholder here is @EDsCODE rather than me right now. Bring this up in the sync with him!

Base automatically changed from nuke-protobuf-pt1 to master April 4, 2022 17:29
@EDsCODE
Copy link
Member

EDsCODE commented Apr 4, 2022

I believe everything on the query end is ok! @tiina303 @yakkomajuri

@yakkomajuri
Copy link
Contributor Author

Thanks a lot - changes lgtm @EDsCODE. Care to stamp this if the rest looks ok?

@yakkomajuri
Copy link
Contributor Author

@tiina303 maybe give this a quick look as well?

@yakkomajuri yakkomajuri requested a review from tiina303 April 5, 2022 10:23
Copy link
Contributor

@tiina303 tiina303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@tiina303
Copy link
Contributor

tiina303 commented Apr 5, 2022

I'm not that worried about the querying side - if we mess up there some query won't work and we can hot fix it (on that note we should keep an eye on support requests, failures in sentry), if we mess up something on the table/migration side that could be trickier to resolve.

@yakkomajuri
Copy link
Contributor Author

I'm holding off on merging this until #9207 is fully sorted. #9368 should be the missing piece

@yakkomajuri
Copy link
Contributor Author

Actually this migration also needs to update the kafka and MV tables

@yakkomajuri yakkomajuri marked this pull request as draft April 12, 2022 15:07
@yakkomajuri yakkomajuri marked this pull request as ready for review April 12, 2022 18:12
@yakkomajuri yakkomajuri merged commit 3d71ad0 into master Apr 13, 2022
@yakkomajuri yakkomajuri deleted the events-table-new-schema branch April 13, 2022 10:48
yakkomajuri added a commit that referenced this pull request Apr 13, 2022
yakkomajuri added a commit that referenced this pull request Apr 13, 2022
@tiina303 tiina303 restored the events-table-new-schema branch April 13, 2022 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants