Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Difficulties integrating into project #446

Open
ericwooley opened this issue Nov 1, 2024 · 9 comments
Open

Difficulties integrating into project #446

ericwooley opened this issue Nov 1, 2024 · 9 comments
Assignees
Milestone

Comments

@ericwooley
Copy link

Hello!

I love the project and the goal.

However, as it stands, it's pretty hard to integrate into a project with multiple branches and developers.

Presumably, each developer, working on different branches would have different migration files. Something like this

migrations
├── create_book.json             # dev 1 on branch book which was branched off main
├── create_checkouts.json.   # dev 2 on branch checkouts which was branched off of book
└── create_library.json          # dev 3 on branch library which was branched off main

When they all get merged it's hard to know whats been merged, and in what order to run them.

i've resorted to scripting the creation of migrations so that they have time stamps, so they run in order of the time the developer created them, which seems to work ok, but then the migration script on pull down needs to figure out which migrations to run, and which ones not to run for this environment, and there doesn't seem to be a way to check that using the CLI. Currently I'm having to resort to querying the migrations table myself to see what needs to be run.

Another option i considered was storing individual operations in a folder, based on release, but this also gets hard to manage as different releases might happen with different features etc.... Which would require moving files around as part of the release process.

Guidance on this would be helpful, or possibly allowing a status query on individual migrations files, so that I can check if each migration has already been run, and skip it if so.

I'm using this for a typical SASS deployment process for now, so most of the time I will just be running pgroll start <file> --complete. But I like the idea of using this for gnarlier future migrations as well, when necessary.

@exekias
Copy link
Member

exekias commented Nov 13, 2024

Thank you for opening this one @ericwooley!

We recently discussed this, and this is a possible approach we could take:

  • Migration files must be in a well-known (configurable) folder, ie pgroll
  • Migration files should be sorted by name, ie by leveraging timestamps (as you do), or numbering (001_, etc)
  • We introduce a pgroll migrate subcommand, that will apply all pending migrations to the database, and leave the last one started (without complete). This command will auto-complete any previous migration until it reaches the last one in the pgroll folder. You would run this command from your CI to ensure the DB has all the migrations in place, with the last one started, so previous version is still available for previous application instances.
  • We introduce a pgroll latest_version_schema (better name pending?) to ease the task of configuring clients. This command will output the schema to be used in the search_path to access the latest version (using local files in pgroll folder as the source of truth).

I believe these 2 new commands would ease local development and continuous deployment.

Please let us know what you think, we are really interested in your feedback!:

@andrew-farries andrew-farries added this to the v1 milestone Nov 13, 2024
@andrew-farries andrew-farries self-assigned this Nov 13, 2024
@ericwooley
Copy link
Author

@exekias

Those sound like great improvements. I believe they would relieve the issues I'm running into.

Thanks for getting back!

Do you want me to leave the issue open? Looks like it's assigned and part of a milestone

@andrew-farries
Copy link
Collaborator

Yes, let's leave this issue open to add the migrate and latest_version_schema (or whatever we end up calling them) commands.

andrew-farries added a commit that referenced this issue Nov 18, 2024
Add a `pgroll migrate` subcommand.

## Documentation
`pgroll migrate` applies all outstanding migrations from a source
directory to the target database.

Assuming that migrations up to and including migration
`40_create_enum_type` from the [example migrations
directory](https://github.com/xataio/pgroll/tree/main/examples) have
been applied, running:

```
$ pgroll migrate examples/
```

will apply migrations from `41_add_enum_column` onwards to the target
database.

If the `--complete` flag is passed to `pgroll migrate` the final
migration to be applied will be completed. Otherwise the final migration
will be left active (started but not completed).

## Notes:
* If no migrations have yet been applied to the target database,
`migrate` applies all of the migrations in the source directory.
* This PR removes the `pgroll bootstrap` command
(#414) as it is equivalent to
running `pgroll migrate <directory> --complete` against a fresh
database.

Part of #446
andrew-farries added a commit that referenced this issue Nov 25, 2024
Add a `pgroll latest` command to show the latest version in either the
target database or a local directory of migration files.

## Documentation

`pgroll latest` prints the latest schema version in either the target
database or a local directory of migration files.

By default, `pgroll latest` prints the latest version in the target
database. Use the `--local` flag to print the latest version in a local
directory of migration files instead.

In both cases, the `--with-schema` flag can be used to prefix the latest
version with the schema name.

#### Database

Assuming that the [example
migrations](https://github.com/xataio/pgroll/tree/main/examples) have
been applied to the `public` schema in the target database, running:

```
$ pgroll latest 
```

will print the latest version in the target database:

```
45_add_table_check_constraint
```

The exact output will vary as the `examples/` directory is updated.

#### Local

Assuming that the [example
migrations](https://github.com/xataio/pgroll/tree/main/examples) are on
disk in a directory called `examples`, running:

```
$ pgroll latest --local examples/
```

will print the latest migration in the directory:

```
45_add_table_check_constraint
```

The exact output will vary as the `examples/` directory is updated.

---
Part of #446
@wtaylor
Copy link

wtaylor commented Dec 7, 2024

Hey guys, quick question on this for @andrew-farries and @exekias. I want to integrate pgroll into my project, migrate and latest look almost perfect for my use case but I just want to confirm some behaviour.

Given the migration sequence of m0 -> m1 -> m2 -> m3 and my running application is currently at m1: when running pgroll migrate (without --complete), will it be the initial schema (m1) and the final schema (m3) remaining or will it be as specified in your initial response under bullet 3: "This command will auto-complete any previous migration until it reaches the last one in the pgroll folder" meaning m2 and m3 available?

If the latter, could I suggest a change in behaviour to the former if possible to keeping m1 and m3 available. I believe this would follow the aims of the project better in allowing apps to continue running while rolling out updates to migrations.

@andrew-farries
Copy link
Collaborator

Hi @wtaylor 👋

Given the migration sequence of m0 -> m1 -> m2 -> m3 and my running application is currently at m1: when running pgroll migrate (without --complete), will it be the initial schema (m1) and the final schema (m3) remaining or will it be as specified in your initial response under bullet 3: "This command will auto-complete any previous migration until it reaches the last one in the pgroll folder" meaning m2 and m3 available?

In this case it will be m2 and m3 that are available.

If the latter, could I suggest a change in behaviour to the former if possible to keeping m1 and m3 available. I believe this would follow the aims of the project better in allowing apps to continue running while rolling out updates to migrations.

pgroll currently requires that the previous migration is completed before the next migration can be started. In our experience this reflected how engineering teams most commonly apply migrations, but we've heard enough feedback since our initial release to suggest that allowing multiple migrations to be active simultaneously is desirable behaviour for pgroll.

Multiple active migrations would allow us to accommodate the scenario you describe with m1 and m3 (and possibly m2 aswell) being available after a migrate. #249 is the issue that tracks this.

@andrew-farries
Copy link
Collaborator

@ericwooley we recently released v0.8 of pgroll which includes migrate and latest commands.

It would good to hear how far these additions go towards solving your problems integrating pgroll into your projects.

@wtaylor
Copy link

wtaylor commented Dec 12, 2024

@andrew-farries Tried out 0.8 last night, great set of QOL features!

I'll gist it when I'm done, but I'm putting together a small python wrapper script to get the behaviour I'm looking for. The quick version is:

  1. Query current schema version
  2. Find unapplied migrations by naively listing the migrations folder alphanumerically
  3. Combine unapplied migrations into a single "compound" migration file by concatenating the operations together
  4. pgroll start compound_migration.json

I couldn't find anything in the docs on operations specifically, am I right in assuming they're applied in sequential order?

@andrew-farries
Copy link
Collaborator

Yes, the operations in a migration are applied in the order in which they appear in the migration.

Squashing multiple migrations into a single migration like this would be useful behaviour to build into pgroll itself.

In order for this to work reliably though we first need to address #239, which limits how useful multi-operation migrations are in practice.

@wtaylor
Copy link

wtaylor commented Dec 18, 2024

Yup, ran into #239 pretty quickly! 🥲

Anyhow, I created a wrapper script, I'm awful at Python but it seems to work if this is useful to anyone else: Gist

The purpose of the script is to be deployed before the app code then:

  1. Run pgroll init only if the db is uninitialised
  2. Apply migrations:
    a. If no previous migrations have been applied (post init), use pgroll migrate
    b. If a single migration is outstanding, apply that with pgroll start
    c. If multiple migrations are outstanding, create a single "compound" migration combining all the operations of all outstanding migrations.

The script is just lacking a --complete flag for post rollout of the app, I'll add that in the near future. My use case is to run it as a K8s job before and after the rollout of a new version of the app code.

The script will get a lot more useful when #239 get's a bit more finished. On the other hand, if the joining like behaviour of "keep the previous and latest schema versions available" get's added into pgroll migrate then this script will be pretty much obsolete

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants