Operations can refer to previous ones in a migration file #441
This already works
This looks like roughly a good direction. I wrote up some thoughts on this as part of a previous WIP attempt at the same thing.
Making Start, Complete, Rollback, and Validate work
Making Start operations aware of schema changes made by previous migrations is fairly easy; they just need to ensure they look up any table or column names in s.Tables rather than using the name directly.
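A minimal sketch of that lookup, assuming simplified types. `Schema`, `Table`, `Column` and both helper methods here are hypothetical stand-ins for illustration, not the project's actual definitions; only the idea of resolving names through s.Tables comes from the comment above.

```go
package main

// Column is a hypothetical, simplified column entry.
type Column struct {
	Name string // physical column name in the database
}

// Table is a hypothetical, simplified table entry.
type Table struct {
	Name    string            // physical table name in the database
	Columns map[string]Column // keyed by the name used in the migration file
}

// Schema is a hypothetical, simplified in-memory schema.
type Schema struct {
	Tables map[string]Table // keyed by the name used in the migration file
}

// PhysicalTableName resolves the name written in the migration file to the
// name the table currently has in the database.
func (s *Schema) PhysicalTableName(table string) (string, bool) {
	t, ok := s.Tables[table]
	if !ok {
		return "", false
	}
	return t.Name, true
}

// PhysicalColumnName resolves a column name the same way, going through the
// table entry first.
func (s *Schema) PhysicalColumnName(table, column string) (string, bool) {
	t, ok := s.Tables[table]
	if !ok {
		return "", false
	}
	c, ok := t.Columns[column]
	if !ok {
		return "", false
	}
	return c.Name, true
}
```

An operation written this way never builds SQL from the name in the migration file directly; it always goes through the schema, so a rename or temporary name introduced by an earlier operation is picked up automatically.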
Making Validate work is harder. Validate could take a schema and update it in the same way that the Start operations do. Then Validate could be made indirection-aware in the same way as the Start methods: looking up all names in s.Tables. Alternatively, Validate could be changed to look up either final or temporary versions of each name.
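The first option could look roughly like the sketch below: each operation is validated against an in-memory schema, and then its changes are applied to that schema before the next operation is validated. Everything here (`Schema`, `Operation`, `UpdateSchema`, `ValidateAll`, and the two constructors) is a hypothetical simplification for illustration, not the project's API.

```go
package main

import "fmt"

// Schema is a hypothetical in-memory schema: table name -> set of column names.
type Schema struct {
	Tables map[string]map[string]bool
}

// Operation is a hypothetical stand-in for a migration operation.
type Operation struct {
	Name string
	// Validate checks the operation against the schema as seen so far.
	Validate func(s *Schema) error
	// UpdateSchema applies the operation's changes to the in-memory schema only.
	UpdateSchema func(s *Schema)
}

// CreateTable builds a hypothetical "create table" operation.
func CreateTable(table string, columns ...string) Operation {
	return Operation{
		Name: "create_table",
		Validate: func(s *Schema) error {
			if _, exists := s.Tables[table]; exists {
				return fmt.Errorf("table %q already exists", table)
			}
			return nil
		},
		UpdateSchema: func(s *Schema) {
			cols := make(map[string]bool, len(columns))
			for _, c := range columns {
				cols[c] = true
			}
			s.Tables[table] = cols
		},
	}
}

// AddColumn builds a hypothetical "add column" operation.
func AddColumn(table, column string) Operation {
	return Operation{
		Name: "add_column",
		Validate: func(s *Schema) error {
			cols, ok := s.Tables[table]
			if !ok {
				return fmt.Errorf("table %q does not exist", table)
			}
			if cols[column] {
				return fmt.Errorf("column %q already exists in %q", column, table)
			}
			return nil
		},
		UpdateSchema: func(s *Schema) { s.Tables[table][column] = true },
	}
}

// ValidateAll validates operations in order, updating the schema after each
// one so that later operations can see objects created by earlier ones.
func ValidateAll(s *Schema, ops []Operation) error {
	for _, op := range ops {
		if err := op.Validate(s); err != nil {
			return fmt.Errorf("validating %q: %w", op.Name, err)
		}
		op.UpdateSchema(s)
	}
	return nil
}
```

With this shape, `ValidateAll` accepts a `CreateTable("users", "id")` followed by `AddColumn("users", "email")`, but rejects the `AddColumn` on its own because the table is not yet in the schema.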
Complete works fine as is; the operations are completed in order, so any tables/columns have the final name expected by a subsequent operation.
Not sure about Rollback. Operations would have to be run in reverse order, at least.
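As a rough illustration of the reverse-order part only, assuming a hypothetical `Operation` struct with a `Rollback` callback (not the project's interface):

```go
package main

import "fmt"

// Operation is a hypothetical stand-in: a name plus a rollback callback.
type Operation struct {
	Name     string
	Rollback func() error
}

// RollbackAll undoes operations in the reverse of the order they were
// started, so an operation is rolled back before the objects it depends on
// (created by earlier operations) are removed.
func RollbackAll(ops []Operation) error {
	for i := len(ops) - 1; i >= 0; i-- {
		if err := ops[i].Rollback(); err != nil {
			return fmt.Errorf("rolling back %q: %w", ops[i].Name, err)
		}
	}
	return nil
}
```

This stops at the first failure; whether a real rollback should continue past errors is a separate question that this sketch does not try to answer.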
Making backfills work
Operations that cause backfills:
- add column
- drop constraint
- alter column
Beware of double-backfills.
- Two ops that duplicate the same column - the second op will try to duplicate the duplicated column because of the name mapping in the virtual schema.
- Two ops that both create triggers for the same column will fail because they will both try to create a trigger with the same name.
- Maybe we choose to not support two ops that backfill the same column in the same migration.
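If the last option were chosen, the check could be done up front, before anything is started. The sketch below assumes each operation can report which columns it would backfill; the `BackfillTarget` and `Operation` types and the `CheckNoDoubleBackfill` function are hypothetical names for illustration.

```go
package main

import "fmt"

// BackfillTarget identifies a column that an operation would backfill.
type BackfillTarget struct {
	Table  string
	Column string
}

// Operation is a hypothetical stand-in that lists its backfill targets.
type Operation struct {
	Name      string
	Backfills []BackfillTarget
}

// CheckNoDoubleBackfill rejects a migration in which the same column is
// listed as a backfill target more than once, since duplicating the column
// or creating its trigger a second time would fail or misbehave.
func CheckNoDoubleBackfill(ops []Operation) error {
	seen := make(map[BackfillTarget]string) // target -> name of the first op using it
	for _, op := range ops {
		for _, target := range op.Backfills {
			if first, dup := seen[target]; dup {
				return fmt.Errorf("%q and %q both backfill %s.%s",
					first, op.Name, target.Table, target.Column)
			}
			seen[target] = op.Name
		}
	}
	return nil
}
```

Because the struct is used as a map key, the comparison is on the table and column names together, so two operations touching different columns of the same table are still allowed.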
Virtual schema
We take the information about the columns from the virtual schema that is built up by each successive Start
operation. This virtual schema is a mix of the schema that was retrieved from the database and the schema that is being built up by the operations that have been run so far. This schema contains limited information about new tables and columns - just their names and, for a table, the names of each of its columns.
Restrictions:
- Can't alter a column in a table that was created in the same migration, or a column in an existing table that was added by an 'add column' operation.
- New columns have very limited information about them in the virtual schema (no type etc.), so they can't be duplicated, and there won't be enough info about PKs, UNIQUE etc. to be able to perform a backfill.
- You can add a new column to a table created in an earlier op, but backfill won't be possible, i.e. you can't specify 'up'.
- Again, the new table doesn't have enough info about it in the virtual schema to be able to perform a backfill.
Is it feasible to refresh the schema after each operation from the database?
- This would overwrite any changes made by the operations so far, e.g. the mapping of old names to new names.
- Could build a 'schema overlay' that contains the indirections made by the operations so far and put that over the schema retrieved from the database.
It's either that or put enough info about new tables and columns in the virtual schema manually to be able to perform backfills and duplications.
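One way the 'schema overlay' idea might be shaped, as a sketch only: the indirections live in a separate structure, and full column details always come from whatever schema was most recently read from the database. All types and names here (`Overlay`, `ResolveColumn`, the `_tmp_email` column) are assumptions made for illustration.

```go
package main

// Column, Table and Schema are hypothetical stand-ins for the schema read
// from the database, keyed by physical names.
type Column struct {
	Name string
	Type string
}

type Table struct {
	Name    string
	Columns map[string]Column
}

type Schema struct {
	Tables map[string]Table
}

// Overlay holds only the name indirections made by operations so far.
// It is kept separate so that refreshing the database schema cannot
// overwrite it.
type Overlay struct {
	Tables  map[string]string            // migration table name -> physical table name
	Columns map[string]map[string]string // migration table name -> migration column name -> physical column name
}

// ResolveColumn maps the names used in the migration file through the
// overlay, then fetches the full column details from the database schema.
// Names without an overlay entry resolve to themselves.
func (o Overlay) ResolveColumn(db Schema, table, column string) (Column, bool) {
	physTable := table
	if p, ok := o.Tables[table]; ok {
		physTable = p
	}
	physColumn := column
	// Indexing a nil inner map is safe in Go and yields ok == false.
	if p, ok := o.Columns[table][column]; ok {
		physColumn = p
	}
	t, ok := db.Tables[physTable]
	if !ok {
		return Column{}, false
	}
	c, ok := t.Columns[physColumn]
	return c, ok
}
```

Since the resolved `Column` comes from the refreshed database schema, it carries the type and other details that the name-only virtual schema lacks, which is what backfills and duplications need.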
A more incremental approach is in #449: going one operation at a time, ensuring it can be started and validated when run on objects created in previous operations.
Closing in favour of the incremental approach kicked off by @andrew-farries
This draft PR introduces the idea I had for letting operations refer to previously created resources in the same migration file. It is not yet complete, but I am putting it here to discuss whether this approach is acceptable to everyone.
I introduced two changes to the existing migration process:
- Operation named DeriveSchema. It adds the newly introduced resources to the active schema.

The consequences of these changes are the following:
- newSchema is no longer the "live" schema in the database. During the Validate and Start phases it can contain resources that are not yet created. At this point, I do not see a problem with this.
- DeriveSchema.

Closes #203
Closes #239