Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: persistence layer clean-up #3835

Merged
merged 28 commits into from
Jan 10, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
e3303dd
refactor: persistence layer clean-up
nayib-jose-gloria Dec 23, 2022
e21c59f
lint
nayib-jose-gloria Dec 23, 2022
04668e6
update missing id changes
nayib-jose-gloria Dec 23, 2022
3b2f18d
lint fix
nayib-jose-gloria Dec 23, 2022
04c4dc1
more missing id's
nayib-jose-gloria Dec 23, 2022
ad3a3c5
Merge branch 'main' into nayib/persistence-layer-clean-up
nayib-jose-gloria Dec 23, 2022
6ba491e
remove unused import
nayib-jose-gloria Dec 23, 2022
4332f63
correct id name
nayib-jose-gloria Dec 23, 2022
bc0cceb
Merge branch 'nayib/persistence-layer-clean-up' of https://github.com…
nayib-jose-gloria Dec 23, 2022
b6184b9
lint fix
nayib-jose-gloria Dec 23, 2022
006d4ff
id fix
nayib-jose-gloria Dec 23, 2022
d8e4f17
id fix
nayib-jose-gloria Dec 23, 2022
719705b
more ids
nayib-jose-gloria Dec 23, 2022
b938f1f
id fix
nayib-jose-gloria Dec 23, 2022
60c2c07
version id fix
nayib-jose-gloria Dec 23, 2022
e7f2e94
lint fix
nayib-jose-gloria Dec 23, 2022
b8a1dbb
Merge branch 'main' of https://github.com/chanzuckerberg/single-cell-…
nayib-jose-gloria Jan 9, 2023
26cf547
autogenerate migration script for db field changes + update README links
nayib-jose-gloria Jan 9, 2023
3f198ab
update makefile test command documentation
nayib-jose-gloria Jan 9, 2023
f9a8f95
pr feedback on docs
nayib-jose-gloria Jan 9, 2023
f601d68
update migration script + db/local/load-schema + migration docs
nayib-jose-gloria Jan 9, 2023
5ca9ef6
lint fixes
nayib-jose-gloria Jan 10, 2023
eef29f0
remove unused imports
nayib-jose-gloria Jan 10, 2023
df789da
Merge branch 'main' into nayib/persistence-layer-clean-up
nayib-jose-gloria Jan 10, 2023
96d515c
Merge branch 'main' into nayib/persistence-layer-clean-up
nayib-jose-gloria Jan 10, 2023
4872b0a
tabs
nayib-jose-gloria Jan 10, 2023
5972945
Merge branch 'nayib/persistence-layer-clean-up' of https://github.com…
nayib-jose-gloria Jan 10, 2023
31218bc
tabs
nayib-jose-gloria Jan 10, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions backend/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -75,11 +75,11 @@ db/local/load-data:

db/local/load-schema:
# Imports the corpora_dev.sqlc schema (schema ONLY) into the corpora_test database
# Usage: make db/local/load-schema INFILE=<file>
# Usage: make db/local/load-schema INFILE=<file>
$(eval DB_PW = $(shell aws secretsmanager get-secret-value --secret-id corpora/backend/test/database --region us-west-2 | jq -r '.SecretString | match(":([^:]*)@").captures[0].string'))
PGPASSWORD=${DB_PW} pg_restore --schema-only --clean --no-owner --host 0.0.0.0 --dbname corpora $(INFILE)
# Also import alembic schema version
PGPASSWORD=${DB_PW} pg_restore --data-only --table=alembic_version --no-owner --host 0.0.0.0 --dbname corpora $(INFILE)
PGPASSWORD=${DB_PW} pg_restore --schema-only --schema=persistence-schema --clean --no-owner --host 0.0.0.0 --username corpora --dbname corpora $(INFILE)
Copy link
Contributor

@Bento007 Bento007 Jan 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now don't we want both schemas, until we depreciate corpora_orm.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to get it to work without adding this arg, I kept getting strange pg_restore errors. For example:

pg_restore: error: could not execute query: ERROR:  extension "aws_s3" does not exist
Command was: DROP EXTENSION aws_s3;
pg_restore: error: could not execute query: ERROR:  extension "aws_commons" does not exist
Command was: DROP EXTENSION aws_commons;

I figured limiting this to the persistence_schema is fine, given that this command is only used to load the schema into the test DB to detect diffs after a test migration. By the time this is in main, we won't have plans to support schema changes in corpora_orm anymore. And if we have to revert for whatever reason, this would be reverted as well.

# Also import alembic schema version
PGPASSWORD=${DB_PW} pg_restore --data-only --table=alembic_version --no-owner --host 0.0.0.0 --username corpora --dbname corpora $(INFILE)

db/dump_schema:
ifeq ($(DEPLOYMENT_STAGE),test)
Expand Down
13 changes: 6 additions & 7 deletions backend/database/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,8 +50,7 @@ AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} CORPORA_L

The following steps will test that a migration script works on a local database using data downloaded from a deployed database.

1. [Connect to Remote RDS](#connect-to-remote-rds)
2. Open a new terminal and using the same values for `AWS_PROFILE` and `DEPLOYMENT_STAGE`, download the remote dev database schema:
1. Open a new terminal and using the same values for `AWS_PROFILE` and `DEPLOYMENT_STAGE`, download the remote dev database schema:

```shell
cd $REPO_ROOT/backend
Expand All @@ -60,15 +59,15 @@ AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} make db/d

This will download the database to `$REPO_ROOT/backend/corpora_dev.sqlc`.

3. The tunnel to dev should close automatically (but worth verifying `ps ax | grep ssh`)
4. Start the local database environment:
2. The tunnel to dev should close automatically (but worth verifying `ps ax | grep ssh`)
3. Start the local database environment:

```shell
cd $REPO_ROOT
make local-start
```

5. Import the remote database schema into your local database:
4. Import the remote database schema into your local database:

```shell
cd $REPO_ROOT/backend
Expand All @@ -85,7 +84,7 @@ You may need to run this a few times, until there are no significant errors.

- Note: `pg_restore: error: could not execute query: ERROR: role "rdsadmin" does not exist` is not a significant error

6. Run the migration test:
5. Run the migration test:

```shell
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE=test make db/test_migration
Expand All @@ -99,7 +98,7 @@ This test will:
1. Dump the schema (after)
1. Compare the before vs after schemas. These should be identical if the database migration's `upgrade()` and `downgrade()` functions were implemented correctly.

If there are no differences then the test passed. If the test didn't pass, make adjustments to your migration script and restart from step 5. Repeat until there are no errors.
If there are no differences then the test passed. If the test didn't pass, make adjustments to your migration script and restart from step 4. Repeat until there are no errors.

## Connect to Remote RDS

Expand Down
49 changes: 18 additions & 31 deletions backend/database/versions/34_2be441104b48_standardize_db_fields.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,43 +18,30 @@

def upgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('Collection', sa.Column('tombstone', sa.BOOLEAN(), nullable=True), schema='persistence_schema')
op.drop_column('Collection', 'tombstoned', schema='persistence_schema')
op.add_column('CollectionVersion', sa.Column('id', postgresql.UUID(as_uuid=True), nullable=False), schema='persistence_schema')
op.drop_column('CollectionVersion', 'version_id', schema='persistence_schema')
op.add_column('CollectionVersion', sa.Column('collection_metadata', postgresql.JSON(astext_type=sa.Text()), nullable=True), schema='persistence_schema')
op.drop_column('CollectionVersion', 'metadata', schema='persistence_schema')
op.add_column('Dataset', sa.Column('id', postgresql.UUID(as_uuid=True), nullable=False), schema='persistence_schema')
op.drop_column('Dataset', 'dataset_id', schema='persistence_schema')
op.add_column('Dataset', sa.Column('version_id', postgresql.UUID(as_uuid=True), nullable=True), schema='persistence_schema')
op.drop_column('Dataset', 'dataset_version_id', schema='persistence_schema')
op.add_column('DatasetVersion', sa.Column('id', postgresql.UUID(as_uuid=True), nullable=False), schema='persistence_schema')
op.drop_column('DatasetVersion', 'version_id', schema='persistence_schema')
op.add_column('DatasetVersion', sa.Column('dataset_metadata', postgresql.JSON(astext_type=sa.Text()), nullable=True), schema='persistence_schema')
op.drop_column('DatasetVersion', 'metadata', schema='persistence_schema')
op.alter_column('Collection', "tombstoned", new_column_name="tombstone", schema='persistence_schema')
op.alter_column('CollectionVersion', 'version_id', new_column_name="id", schema='persistence_schema')
op.alter_column('CollectionVersion', 'metadata', new_column_name="collection_metadata", schema='persistence_schema')
op.alter_column('Dataset', 'dataset_id', new_column_name="id", schema='persistence_schema')
op.alter_column('Dataset', 'dataset_version_id', new_column_name="version_id", schema='persistence_schema')
op.alter_column('DatasetVersion', 'version_id', new_column_name="id", schema='persistence_schema')
op.alter_column('DatasetVersion', 'metadata', new_column_name='dataset_metadata', schema='persistence_schema')
op.drop_constraint('DatasetVersion_dataset_id_fkey', 'DatasetVersion', schema='persistence_schema',
type_='foreignkey')
op.create_foreign_key(None, 'DatasetVersion', 'Dataset', ['dataset_id'], ['id'], source_schema='persistence_schema',
op.create_foreign_key('DatasetVersion_id_fkey', 'DatasetVersion', 'Dataset', ['dataset_id'], ['id'], source_schema='persistence_schema',
referent_schema='persistence_schema')
# ### end Alembic commands ###


def downgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('DatasetVersion', sa.Column('version_id', postgresql.UUID(), autoincrement=False, nullable=False), schema='persistence_schema')
op.add_column('DatasetVersion', sa.Column('metadata', postgresql.JSON(astext_type=sa.Text()), autoincrement=False, nullable=True), schema='persistence_schema')
op.drop_constraint(None, 'DatasetVersion', schema='persistence_schema', type_='foreignkey')
op.create_foreign_key('DatasetVersion_dataset_id_fkey', 'DatasetVersion', 'Dataset', ['dataset_id'], ['dataset_id'], source_schema='persistence_schema', referent_schema='persistence_schema')
op.drop_column('DatasetVersion', 'dataset_metadata', schema='persistence_schema')
op.drop_column('DatasetVersion', 'id', schema='persistence_schema')
op.add_column('Dataset', sa.Column('dataset_id', postgresql.UUID(), autoincrement=False, nullable=False), schema='persistence_schema')
op.add_column('Dataset', sa.Column('dataset_version_id', postgresql.UUID(), autoincrement=False, nullable=True), schema='persistence_schema')
op.drop_column('Dataset', 'version_id', schema='persistence_schema')
op.drop_column('Dataset', 'id', schema='persistence_schema')
op.add_column('CollectionVersion', sa.Column('version_id', postgresql.UUID(), autoincrement=False, nullable=False), schema='persistence_schema')
op.add_column('CollectionVersion', sa.Column('metadata', postgresql.JSON(astext_type=sa.Text()), autoincrement=False, nullable=True), schema='persistence_schema')
op.drop_column('CollectionVersion', 'collection_metadata', schema='persistence_schema')
op.drop_column('CollectionVersion', 'id', schema='persistence_schema')
op.add_column('Collection', sa.Column('tombstoned', sa.BOOLEAN(), autoincrement=False, nullable=True), schema='persistence_schema')
op.drop_column('Collection', 'tombstone', schema='persistence_schema')
op.alter_column('Collection', "tombstone", new_column_name="tombstoned", schema='persistence_schema')
op.alter_column('CollectionVersion', 'id', new_column_name="version_id", schema='persistence_schema')
op.alter_column('CollectionVersion', 'collection_metadata', new_column_name="metadata", schema='persistence_schema')
op.alter_column('Dataset', 'id', new_column_name="dataset_id", schema='persistence_schema')
op.alter_column('Dataset', 'version_id', new_column_name="dataset_version_id", schema='persistence_schema')
op.alter_column('DatasetVersion', 'id', new_column_name="version_id", schema='persistence_schema')
op.alter_column('DatasetVersion', 'dataset_metadata', new_column_name='metadata', schema='persistence_schema')
op.drop_constraint('DatasetVersion_id_fkey', 'DatasetVersion', schema='persistence_schema', type_='foreignkey')
op.create_foreign_key('DatasetVersion_dataset_id_fkey', 'DatasetVersion', 'Dataset', ['dataset_id'], ['dataset_id'], source_schema='persistence_schema',
referent_schema='persistence_schema')
# ### end Alembic commands ###