Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement sql2pgroll package #504

Open
andrew-farries opened this issue Dec 3, 2024 · 0 comments
Open

Implement sql2pgroll package #504

andrew-farries opened this issue Dec 3, 2024 · 0 comments
Assignees
Labels
sql2pgroll Issues relating to the sql2pgroll package

Comments

@andrew-farries
Copy link
Collaborator

andrew-farries commented Dec 3, 2024

Support for converting SQL DDL statements to pgroll migrations is partially complete. This issue tracks progress towards completion.

Support matrix

Operation Status Notes PRs
create_table 🟠 [1] @andrew-farries #502, #546, #547, #548, #549, #550, #552, #553, #554, #555, #556, #557, #559, #560, #561, #563, #585
rename_table 🟢 #513
drop_table 🟢 #529
add_column 🟢 #544, #558, #562
drop_column 🟢 #521
alter_column (rename column) 🟢 #511
alter_column (change type) 🟢 #506, #508
alter_column (set not null) 🟢 #502
alter_column (drop not null) 🟢 #505
alter_column (set_unique) 🟢 #507
alter_column (set comment) n/a [3]
alter_column (set default) 🟢 #526, #535
alter_column (drop default) 🟢 #526
create_index 🟢 #551
drop_index 🟢 #524
rename_constraint 🟢 #514
drop_constraint n/a [2]
set_replica_identity n/a [4]
drop_multicolumn_constraint 🟢 [6] #536
create_unique_constraint 🟢 #507
create_check_constraint 🟠 [5] #538
create_foreign_key_constraint 🟢 #531

Notes

[1]: remaining:

  • multi-column primary keys
  • foreign keys (single and multi-column)
  • check constraints (single and multi-column)
  • default column values
There are also many `CREATE TABLE` options that are not representable by `OpCreateTable`; we need to ensure we fall back to raw SQL if any such options are present.

[2] :drop_constraint is deprecated, sql2pgroll should generate only drop_multicolumn_constraint operations.

[3]: we probably don't want to support this; the default behaviour of generating a raw SQL operation is sufficient for this

[4]: the operation is deprecated, falling back to raw sql for the `SET REPLICA IDENTITY` syntax is acceptable.

[5]: Conversion of `ALTER TABLE ... ADD CONSTRAINT ... CHECK` statements currently uses placeholders for the names of the columns covered by the constraint and for keys (column names) in the `up`/`down` data migration map. We should investigate whether it is possible to walk the AST for the `CHECK` expression (possibly using reflection) to discover the names of the covered columns instead.

[6]: Dropping constraints are tricky because they may affect more than a single column. When creating a pgroll operation to do this (`OpDropMultiColumnConstraint`) we need to specify all the affected columns so that we can create the required triggers. A simple `DROP CONSTRAINT` expression does not contain enough context to allow us to infer the affected columns.

@andrew-farries andrew-farries added the sql2pgroll Issues relating to the sql2pgroll package label Dec 3, 2024
andrew-farries added a commit that referenced this issue Dec 3, 2024
Add a `sql2pgroll` package to convert SQL to `pgroll` migrations.

Add a (hidden for now) `pgroll sql` command that uses the package to
convert SQL strings on the command line to `pgroll` migrations.

The `sql2pgroll` package is incomplete, with almost all SQL falling back
to conversion using raw SQL migrations. Only some `CREATE TABLE`
statements and the `ALTER TABLE ... ALTER COLUMN ... SET NOT NULL`
statement are currently handled.

```bash
$ pgroll sql "create table foo(a serial primary key, b text unique)" 
```

```json
[
  {
    "create_table": {
      "columns": [
        {
          "name": "a",
          "pk": true,
          "type": "serial"
        },
        {
          "name": "b",
          "nullable": true,
          "type": "text",
          "unique": true
        }
      ],
      "name": "foo"
    }
  }
]
```

Part of #504
andrew-farries added a commit that referenced this issue Dec 3, 2024
Convert SQL statements of the form:

```sql
ALTER TABLE foo ALTER COLUMN a [SET DATA] TYPE text
```

to the equivalent `pgroll` migration:

```json
[
  {
    "alter_column": {
      "column": "a",
      "down": "TODO: Implement SQL data migration",
      "table": "foo",
      "type": "text",
      "up": "TODO: Implement SQL data migration"
    }
  }
]
```

Part of #504
andrew-farries added a commit that referenced this issue Dec 4, 2024
…l` operation (#507)

Convert SQL DDL of the form:

```sql
"ALTER TABLE foo ADD CONSTRAINT bar UNIQUE (a)"
```

To the equivalent `pgroll` operation:

```json
[
  {
    "create_constraint": {
      "type": "unique",
      "table": "foo",
      "name": "bar",
      "columns": ["a"],
      "up": {
        "a": "...",
      },
      "down": {
        "a": "..."
      }
    }
  }
]
```

We need to be conservative when converting SQL statements to `pgroll`
operations to ensure that information present in the SQL is not lost
during the conversion.

There are several options possible as part of `ADD CONSTRAINT ...
UNIQUE` statements that aren't currently representable by the
`OpCreateConstraint` operation, for example:

```sql
ALTER TABLE foo ADD CONSTRAINT bar UNIQUE NULLS NOT DISTINCT (a)
ALTER TABLE foo ADD CONSTRAINT bar UNIQUE (a) INCLUDE (b)
```

In these cases we must resort to converting to an `OpRawSQL`. 

Tests are added to cover these unrepresentable cases.

Part of #504
andrew-farries added a commit that referenced this issue Dec 4, 2024
…presentable (#508)

Ensure that SQL statements of the form:

```sql
ALTER TABLE foo ALTER COLUMN a SET DATA TYPE text COLLATE "en_US"
ALTER TABLE foo ALTER COLUMN a SET DATA TYPE text USING 'foo'
```

are converted to raw SQL operations instead of an `OpAlterColumn`
operation. The change of collation is not currently representable by an
`OpAlterColumn` operation and neither is the `USING` clause.

Part of #504
andrew-farries added a commit that referenced this issue Dec 5, 2024
Convert SQL statements of the form:

```sql
ALTER TABLE foo RENAME COLUMN a TO b
ALTER TABLE foo RENAME a TO b
```

to the equivalent `OpAlterColumn` operation:

```json
[
  {
    "alter_column": {
      "table": "foo",
      "column": "a",
      "name": "b"
    }
  }
]
```

Part of #504
andrew-farries added a commit that referenced this issue Dec 5, 2024
* Rename the test variables to match their operation type.
* Rename files to match the operation type.

Part of #504
andrew-farries added a commit that referenced this issue Dec 5, 2024
Convert SQL statements of the form:

```sql
ALTER TABLE foo RENAME TO bar
```

to the corresponding `OpRenameTable` operation:

```json
[
  {
    "rename_table": {
      "from": "foo",
      "to": "bar"
    }
  }
]
```

Part of #504
andrew-farries added a commit that referenced this issue Dec 5, 2024
Convert SQL statements like:

```sql
ALTER TABLE foo RENAME CONSTRAINT bar TO baz
```

to an `OpRenameConstraint` operation:

```json
[
  {
    "rename_constraint": {
      "table": "foo",
      "from": "bar",
      "to": "baz"
    }
  }
]
```

Part of #504
ryanslade added a commit that referenced this issue Dec 11, 2024
ryanslade added a commit that referenced this issue Dec 11, 2024
Converts SQL in the following forms to the equivalent pgroll operation:

```sql
DROP INDEX foo
DROP INDEX schema.foo
DROP INDEX foo RESTRICT
DROP INDEX CONCURRENTLY foo
DROP INDEX IF EXISTS foo
```

The following forms are left as raw SQL operations since we do not
support them in pgroll yet:

```sql
DROP INDEX foo CASCADE
```

Part of #504
ryanslade added a commit that referenced this issue Dec 12, 2024
Converts `DROP TABLE` statements in these forms:

```sql
DROP TABLE foo
DROP TABLE foo RESTRICT
DROP TABLE foo.bar
DROP TABLE IF EXISTS foo
```

These forms fall back to raw SQL:

```sql
DROP TABLE foo CASCADE
```

Part of #504
ryanslade added a commit that referenced this issue Dec 13, 2024
…eign keys (#531)

Supports translating statements in the following forms:

```sql
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d)
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON DELETE NO ACTION
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON DELETE RESTRICT
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON DELETE SET DEFAULT
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON DELETE SET NULL
ALTER TABLE foo ADD CONSTRAINT fk_bar_c FOREIGN KEY (a) REFERENCES bar (c)
ALTER TABLE schema.foo ADD CONSTRAINT fk_bar_c FOREIGN KEY (a) REFERENCES schema.bar (c)
```

The following fall back to raw SQL:

```sql
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON UPDATE RESTRICT;
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON UPDATE CASCADE;
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON UPDATE SET NULL;
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) ON UPDATE SET DEFAULT;
ALTER TABLE foo ADD CONSTRAINT fk_bar_cd FOREIGN KEY (a, b) REFERENCES bar (c, d) MATCH FULL;
```

Part of #504
andrew-farries added a commit that referenced this issue Dec 16, 2024
Switch the `pg_query_go` dependency from
https://github.com/pganalyze/pg_query_go to the fork at
https://github.com/xataio/pg_query_go

The forked version adds a new `DeparseExpr` function for deparsing
expression nodes (see xataio/pg_query_go#1).
This will expand the range of SQL DDL statements that `sql2pgroll` is
able to convert to `pgroll` migrations.

Part of #504
andrew-farries added a commit that referenced this issue Dec 16, 2024
Deparse column `DEFAULT` expressions using `pg_query_go.DeparseExpr` to
allow conversion of arbitrary expressions in column `DEFAULT`s to
`pgroll` migrations.

Replace manual deparsing of literal nodes with use of `DeparseExpr` to
allow deparsing of any column `DEFAULT` expression.

Update test cases:
* `ALTER TABLE foo ALTER COLUMN bar SET DEFAULT now()` is now
representable as an `OpAlterColumn`; no need to fall back to raw SQL
anymore.
* Add a new testcase: `ALTER TABLE foo ALTER COLUMN bar SET DEFAULT
(first_name || ' ' || last_name)` to demonstrate conversion of a column
`DEFAULT` using a more complex expression.

Part of #504
ryanslade added a commit that referenced this issue Dec 16, 2024
…ns (#536)

Convert `DROP CONSTRAINT SQL` into an `OpDropMultiColumnConstraint`.
Because we are unable to infer the columns involved, placeholder
migrations are used.

SQL statements like the following are supported:

```sql
ALTER TABLE foo DROP CONSTRAINT constraint_foo
ALTER TABLE foo DROP CONSTRAINT IF EXISTS constraint_foo
ALTER TABLE foo DROP CONSTRAINT IF EXISTS constraint_foo RESTRICT
```

`CASCADE` is currently not supported and will fall back to raw SQL

Part of #504
andrew-farries added a commit that referenced this issue Dec 18, 2024
There are [many
options](https://www.postgresql.org/docs/current/sql-createtable.html)
for the `CREATE TABLE` statement in Postgres, most of which are not
currently representable by the `pgroll` `OpCreateTable` operation.

Add tests to ensure that `sql2proll.Convert`falls back to raw SQL
operations when these unconvertible options are present in a SQL
statement.

Part of #504
andrew-farries added a commit that referenced this issue Dec 18, 2024
Build on #546 and ensure that more `CREATE TABLE` statements with
options and clauses that are not representable as `pgroll`
`OpCreateTable` operations fall back to raw SQL.

Part of #504
andrew-farries added a commit that referenced this issue Dec 18, 2024
Column definitions in `CREATE TABLE` statements offer [several
options](https://www.postgresql.org/docs/current/sql-createtable.html),
most of which aren't currently representable by `pgroll` `Column`
definitions.

Add tests and code to ensure that `CREATE TABLE` statements containing
columns that use any of these unrepresentable options fall back to raw
SQL operations.

Part of #504
ryanslade added a commit that referenced this issue Dec 18, 2024
Converts `ALTER TABLE ADD COLUMN` statements into `pgroll` operations.
There are many combinations, at the moment we support the following:

```sql
ALTER TABLE foo ADD COLUMN bar int
ALTER TABLE foo ADD COLUMN bar int NOT NULL
ALTER TABLE schema.foo ADD COLUMN bar int
ALTER TABLE foo ADD COLUMN bar int DEFAULT 123
ALTER TABLE foo ADD COLUMN bar int DEFAULT 'baz'
ALTER TABLE foo ADD COLUMN bar int DEFAULT null
ALTER TABLE foo ADD COLUMN bar int NULL
ALTER TABLE foo ADD COLUMN bar int UNIQUE
ALTER TABLE foo ADD COLUMN bar int UNIQUE NOT DEFERRABLE
ALTER TABLE foo ADD COLUMN bar int UNIQUE INITIALLY IMMEDIATE
ALTER TABLE foo ADD COLUMN bar int PRIMARY KEY
ALTER TABLE foo ADD COLUMN bar int CHECK (bar > 0)
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT check_bar CHECK (bar > 0)
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT fk_baz REFERENCES baz (bar)
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT fk_baz REFERENCES baz (bar) ON UPDATE NO ACTION
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT fk_baz REFERENCES baz (bar) ON DELETE NO ACTION
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT fk_baz REFERENCES baz (bar) ON DELETE RESTRICT
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT fk_baz REFERENCES baz (bar) ON DELETE SET NULL 
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT fk_baz REFERENCES baz (bar) ON DELETE SET DEFAULT
ALTER TABLE foo ADD COLUMN bar int CONSTRAINT fk_baz REFERENCES baz (bar) ON DELETE CASCADE
```

And fall back to raw SQL for these cases:

```sql
ALTER TABLE foo ADD COLUMN bar int REFERENCES bar (c) ON UPDATE RESTRICT
ALTER TABLE foo ADD COLUMN bar int REFERENCES bar (c) ON UPDATE CASCADE
ALTER TABLE foo ADD COLUMN bar int REFERENCES bar (c) ON UPDATE SET NULL
ALTER TABLE foo ADD COLUMN bar int REFERENCES bar (c) ON UPDATE SET DEFAULT
ALTER TABLE foo ADD COLUMN IF NOT EXISTS bar int
ALTER TABLE foo ADD COLUMN bar int UNIQUE DEFERRABLE
ALTER TABLE foo ADD COLUMN bar int UNIQUE INITIALLY DEFERRED
ALTER TABLE foo ADD COLUMN bar int GENERATED BY DEFAULT AS IDENTITY
ALTER TABLE foo ADD COLUMN bar int GENERATED ALWAYS AS ( 123 ) STORED
```

Part of #504
@kvch kvch self-assigned this Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql2pgroll Issues relating to the sql2pgroll package
Projects
None yet
Development

No branches or pull requests

3 participants