chore: Remove deprecated dbinit docs #112

Merged 8 commits on Dec 19, 2024
8 changes: 0 additions & 8 deletions .github/workflows/build.yml
@@ -10,17 +10,9 @@ on:
branches:
- "main"
- "build-core-image"
paths-ignore:
- '**/*.md' # Ignore all markdown files
- 'user-documentation/'
- 'documentation/'
pull_request:
branches:
- "main"
paths-ignore:
- '**/*.md' # Ignore all markdown files
- 'user-documentation/'
- 'documentation/'

permissions:
contents: read
103 changes: 55 additions & 48 deletions documentation/core-api.md
@@ -53,50 +53,78 @@ Credentials to access the database should be stored in a Kubernetes secret, and

As a quick start, a PostgreSQL container can be started as part of Yaku deployment. When starting to use Yaku in production, you need to connect it to a production-ready PostgreSQL database.

### DB Schema Management
### Database Migrations

The API service is implemented using TypeORM as connection technology to manage the database. Currently, the automatic schema management from TypeORM is used to create and migrate the database schema over time. As a consequence, in a situation that a new database needs to be used, the service has to be run once to create the tables and constraints needed in the database. There is no alternative way to create the database, because TypeORM is very picky about the database and does not compare on a structural basis but insists on creating and maintaining the database on its own. Even a backup and restore into another database is not accepted by TypeORM. Therefore, creating an empty database must be done using the service.
Database migrations are used to manage changes to, and versions of, the database schema over time. Each migration represents a new version of the database schema and is a set of instructions that describe how to update the schema from the previous version to the new one. A set of migrations makes it easy to update the database schema as your application evolves, providing an incremental and versioned path towards the current state of the database.

## Create an Admin User
### What are migrations?

**Note**: This will be soon deprecated. We recommend that you use Keycloak for user management.
Effectively, migrations are simply files containing queries that move a database from one state to another. They follow TypeORM's `MigrationInterface`, which has an `up` and a `down` function. The `up` function contains queries that move the database from an old state to a new one, while the `down` function does the opposite.
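
As a rough illustration of the `up`/`down` pairing described above, the following sketch shows what a schema migration could look like. The class name, the `namespace` table and the `description` column are made up for this example, and `QueryRunnerLike` is a local stand-in for the part of TypeORM's `QueryRunner` used here; in a real project the class would implement `MigrationInterface` imported from `typeorm`.

```typescript
// Local stand-in for the single QueryRunner method this sketch needs.
// In a real project: import { MigrationInterface, QueryRunner } from "typeorm".
interface QueryRunnerLike {
  query(sql: string): Promise<unknown>;
}

// Hypothetical migration; table and column names are assumptions.
class AddDescriptionToNamespace1700000000000 {
  name = "AddDescriptionToNamespace1700000000000";

  // up: moves the schema from the old state to the new state.
  async up(queryRunner: QueryRunnerLike): Promise<void> {
    await queryRunner.query(
      `ALTER TABLE "namespace" ADD COLUMN "description" text`
    );
  }

  // down: undoes exactly what up did, restoring the old state.
  async down(queryRunner: QueryRunnerLike): Promise<void> {
    await queryRunner.query(
      `ALTER TABLE "namespace" DROP COLUMN "description"`
    );
  }
}
```

Keeping `down` as the exact inverse of `up` is what makes a later revert predictable.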

The current implementation of the service requires an admin user and a token of this user to bootstrap the service. This is due to the fact that users, tokens, and namespaces can only be created by an admin user and they have to exist prior to any meaningful usage of the service.
**Migrations help with transitions and tracking:**

In order to bootstrap this, we created a tool called `qg-dbinit`, that creates the required database commands to prepare the database with an admin user and token.
- Transitions move the database from one state to a new desired state.
- Tracking is achieved by storing each transition in a migration file that can be committed to a repository, keeping a linear history of the changes.

`qg-dbinit` is a small golang tool, that can be run on any machine with a golang development system in place.
**Migration types:**

### How to use `qg-dbinit`
- Schema migrations - create, modify, delete columns, indexes, constraints, etc.
- Data migrations - populate or modify data in the database
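
To make the distinction concrete, here is a small sketch of a change that needs both kinds of steps: a data step that decides how existing float values are converted, followed by a schema step that changes the column type. The function name, the table and column names, and the choice of PostgreSQL's `ROUND` and `TRUNC` functions are assumptions for illustration, not part of the Yaku code base.

```typescript
type ConversionStrategy = "round" | "truncate";

// Returns the statements a hand-written migration could run, in order.
// Table and column names are hypothetical and are not escaped here.
function floatToIntStatements(
  table: string,
  column: string,
  strategy: ConversionStrategy
): string[] {
  const fn = strategy === "round" ? "ROUND" : "TRUNC";
  return [
    // Data migration: rewrite existing values according to the chosen strategy.
    `UPDATE "${table}" SET "${column}" = ${fn}("${column}")`,
    // Schema migration: change the column type once all values are whole numbers.
    `ALTER TABLE "${table}" ALTER COLUMN "${column}" TYPE integer USING "${column}"::integer`,
  ];
}
```

A schema diff alone cannot know which strategy is wanted, which is why this kind of change is typically written by hand.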

Export the environment variable `JWTKEY`. The value of this variable must match the value of `JWT_SECRET_KEY` variable used in the service. Run the tool locally or run the docker image:
### How to perform a Yaku database migration

```bash
docker run -e JWTKEY=<JWT_SECRET_KEY> growpatcr.azurecr.io/qg-dbinit:1.0.0
```
As a Yaku developer, you have access to several migration-related actions that help you migrate the DB:

- create

Creates an empty migration template to which any query can be added manually for the query runner to execute.
Generally used for data migrations. For example, when migrating from a float value to an integer value, the developer can choose how the conversion is done (approximation, truncation, etc.).
Run this with: `npm run migrations:create`

- generate

Generates a new migration file by comparing the contents of the database with the entities tracked by TypeORM. If any discrepancy between the entities and the database is found, a migration containing the queries needed to resolve it is generated.
Updates the history table of the database.
Run this with: `npm run migrations:generate`

- run

Looks inside the migrations directory and runs the `up` function of every previously generated migration that is not yet recorded in the history table of the database.
Updates the database and the history table. (The history table keeps track of all migrations that have already been run, which helps TypeORM find the differences between the actual database and the entities.)
Can be simulated with the `--fake` flag.
Run this with: `npm run migrations:run`

- revert

Reverts the most recently run migration. (Specifically, it runs the `down` function of the last migration.)
Can be simulated with the `--fake` flag.
Run this with: `npm run migrations:revert`
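
The interplay between the run and revert actions and the history table can be sketched with a small in-memory model. This is not TypeORM's implementation: `MigrationExecutor` and its methods are hypothetical names, the `history` array stands in for the history table, and the `fake` parameter assumes that `--fake` only updates the history without executing any queries.

```typescript
interface Migration {
  name: string;
  up(): void;
  down(): void;
}

class MigrationExecutor {
  // Stand-in for the history table: names of applied migrations, oldest first.
  readonly history: string[] = [];
  private readonly migrations: Migration[];

  constructor(migrations: Migration[]) {
    this.migrations = migrations;
  }

  // run: apply every migration that is not yet in the history, in order.
  run(fake: boolean = false): string[] {
    const pending = this.migrations.filter((m) => !this.history.includes(m.name));
    for (const m of pending) {
      if (!fake) m.up(); // with fake, only the history is updated
      this.history.push(m.name);
    }
    return pending.map((m) => m.name);
  }

  // revert: undo only the most recently applied migration.
  revert(fake: boolean = false): string | undefined {
    const last = this.history.pop();
    if (last === undefined) return undefined;
    const migration = this.migrations.find((m) => m.name === last);
    if (migration !== undefined && !fake) migration.down();
    return last;
  }
}
```

In this model a second `run` is a no-op because nothing is pending, and each `revert` steps back by exactly one migration.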

The output should contain the following:
So, to sum it up, the developer who adds changes has to:

1. An insert statement that creates the admin user in the corresponding database table. It looks like:
```bash
insert into "user" (username, roles) values ('admin', 'admin')
```
- Pull the latest stable branch and run the migrations from there. This brings the local database up to date with the stable state of the schema.
- Add their changes to the code.
- Generate and run a new migration (or, alternatively, create a manual data migration) and push the changes, including the newly generated migration file.

1. An insert statement that creates an access token entry for the created admin user in the corresponding database table. It looks like:
### How do migrations work

```bash
insert into api_token_metadata ("tokenId", "userId") values ('$2a$05$QE.n8aZbcmDxdqdeDiUZ6uvVCzOHogCW2m42.3/v86IdNQP/7eB.q', <1>)
```
TypeORM can automatically run migrations that have been previously generated and pushed, provided the `migrationsRun` configuration option is set. In the Yaku API deployment, this is enabled by setting the `POSTGRES_MIGRATIONSRUN` environment variable to True.
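
A minimal sketch of how such an environment variable could be mapped onto the `migrationsRun` option is shown below. How the Yaku service actually reads `POSTGRES_MIGRATIONSRUN` is not shown in this document, so the helper names, the case-insensitive parsing and the migrations path are assumptions.

```typescript
// Interprets an environment variable value as a boolean flag.
// Accepting "True", "true" and "TRUE" is an assumption of this sketch.
function parseBooleanFlag(value: string | undefined): boolean {
  return value !== undefined && value.trim().toLowerCase() === "true";
}

// Subset of the TypeORM data source options that matters here.
interface MigrationOptions {
  migrationsRun: boolean;
  migrations: string[];
}

function buildMigrationOptions(
  env: Record<string, string | undefined>
): MigrationOptions {
  return {
    // When true, TypeORM runs pending migrations on application start.
    migrationsRun: parseBooleanFlag(env["POSTGRES_MIGRATIONSRUN"]),
    // Hypothetical location of the compiled migration files.
    migrations: ["dist/migrations/*.js"],
  };
}
```

Treating a missing variable as `false` keeps automatic migrations opt-in, so a misconfigured deployment does not alter the schema unexpectedly.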

Be aware that the second parameter '<1>' is a placeholder for the unique id of the user which is returned by the first insert statement. Replace it prior to executing the statement.
### Rules to follow for smooth operations

1. The token for the admin user. It must not be stored in the database but managed independent of the service in the key vault of your choice. It looks like:
- Baseline migrations:

```bash
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3MTE0NjU2MTcsImlhdCI6MTY3OTkyOTYxNywidG9rZW5JZCI6IjAxOTZkY2Q4LTYxZjQtNDE5Ny05NjI2LWI5ZDliNTBmYzgxYSIsInVzZXJuYW1lIjoiYWRtaW4ifQ.WEZieyX15j8_FlsY3JvxzZRO-p-92CBSIS8pZiCK7uY
```
Generally, projects are started with migrations in mind.
However, migrations may also be introduced for an already established database. In that case, developers who want to set up a database locally only have the migrations that were added after migration usage began, so the generated database is missing the large part of the schema that existed before.
A baseline migration should be generated to capture the previous database schema. This can be done by running generate on the project before any other migrations are added. However, to ensure that this works in both cases (when the database already exists and when it doesn't), additional checks should be added to decide whether the baseline migration should be applied or not.

You have to execute the two insert statements using a DB client, e.g., psql. After the two rows are entered into the database, it is possible to start using the service via the REST API.
- Don't rewrite history:

Migrations should be considered immutable once they are pushed to the repository. The only way to change an already pushed migration should be to push a new migration or to revert one. This is similar to how you wouldn't cut out a commit lying in the middle of the commit history.

- Follow the main branch:

Migrations should be generated based on the newest main branch. Starting migrations from any other branch would create discrepancies between migrations, leading to history rewriting, inter-dimensional time-loops and black holes.
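
The "additional checks" mentioned for baseline migrations could look roughly like the sketch below, which skips the baseline when the schema is already present. `QueryRunnerLike` is a local stand-in mirroring the `hasTable` and `query` methods of TypeORM's `QueryRunner`; the class name and the column definitions of the `user` table are assumptions, and a real baseline would cover the whole pre-existing schema.

```typescript
// Local stand-in for the QueryRunner methods this sketch needs.
interface QueryRunnerLike {
  hasTable(name: string): Promise<boolean>;
  query(sql: string): Promise<unknown>;
}

// Hypothetical baseline migration; column types are assumptions.
class Baseline1600000000000 {
  name = "Baseline1600000000000";

  async up(queryRunner: QueryRunnerLike): Promise<void> {
    // An established database already contains the baseline schema: do nothing.
    if (await queryRunner.hasTable("user")) {
      return;
    }
    // A fresh database gets the schema that existed before migrations were used.
    await queryRunner.query(
      `CREATE TABLE "user" ("id" SERIAL PRIMARY KEY, "username" varchar NOT NULL, "roles" varchar)`
    );
  }

  async down(_queryRunner: QueryRunnerLike): Promise<void> {
    // Intentionally empty: reverting the baseline would drop pre-existing data.
  }
}
```

Checking a single well-known table is a simplification; a stricter check could verify every table the baseline is expected to create.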

## Admin Endpoints

@@ -106,24 +134,3 @@ You have to execute the two insert statements using a DB client, e.g., psql. Aft
| Create Namespace | POST | `/api/v1/namespaces` | `{"name": "my-namespace", "users": [{"username": "xxx"},{"username": "yyy"}]}` |
| Get Users | GET | `/api/v1/users` | |
| Create User | POST | `/api/v1/users` | `{"username": "xxx"}` |
| Get Users Tokens | GET | `/api/v1/users/users/api-tokens` | |
| Create User Token| POST | `/api/v1/users/users/api-tokens` | `{"username": "xxx"}` |

## Database Migrations

As described before in Schema Management by the Service, the database is managed by TypeORM, i.e., changes in the schema are automatically executed in the database when the service starts with a new version. This imposes some risks:

- When a property is removed, the corresponding column is removed with all data in it. If the change is done by accident, data will be lost.
- When a new property is added which might not be NULL, the database gets into an inconsistent state because all old rows will not have a value which violates the NON-NULL constraint.
- When a property is changed, data might get lost if the renaming is not done properly.

In order to ensure safe data management, please follow these patterns:

- Do not remove properties in the service entities without a 4-eyes principle (a review is not enough, discuss the changes with someone else).
- When adding a new property, do it in three steps (which require each step to be deployed as its own version):
1. Introduce the property as a nullable property.
2. Run a migration script that fills the column for all rows with a default value.
3. Change the property to prevent nullable values.
- Change property names only by adding the new name and follow similar three steps, by adding the property, migrating by copying the data, and removing the old property in a third step.

The pattern for data migrations involves using an `onApplicationBootstrap` event action, as detailed in the NestJS documentation on lifecycle events: [Lifecycle Events in NestJS](https://docs.nestjs.com/fundamentals/lifecycle-events). This action is initiated once the service launches but subsequent to the database connection. It facilitates the execution of necessary data migrations for altered table schemas. Execution of this action is synchronous, meaning the service will pause until the action completes, ensuring no access to the service API is permitted in the interim. This action must be incorporated before the service deployment to implement step 2 of the outlined migration pattern and should be removed for the subsequent step 3 deployment. Practical experience indicates that both steps 1 and 2 of the migration pattern can be consolidated into a single deployment phase. Consequently, TypeORM will enact schema modifications immediately upon service initialization, followed by the migration action, which applies to the newly adjusted schema.
19 changes: 1 addition & 18 deletions qg-api-service/qg-api-service/README.md
@@ -15,21 +15,7 @@ SPDX-License-Identifier: MIT
- Run `npm start -w qg-api-service` to start the service
- Access the api description at <http://localhost:3000/docs>

### Prepare Database

#### SQLite

First install a sqlite cli e.g. `brew install sqlite3` and then use this [repo](https://github.com/B-S-F/qg-dbinit) to generate two SQL insert statements. With those statements you can create an admin user in the database (you need to provide the JWT secret key as an environment variable to the tool, this can be found in the [config.ts](./src/config.ts) file) The tool will then print the SQL statements
that create an admin user token. You can then use the sqlite cli to insert the user and the token into the database, similar to this:

```bash
sqlite3 <path-to-your-sqlite-file> 'insert into user (username, roles) values ("admin", "admin")'
sqlite3 <path-to-your-sqlite-file> 'insert into api_token_metadata ("tokenId", "userId") values ("$2a$05$zzoHodGFmGguogUC1Us1peDh6BMz2QXxyEYIBoEiCIjbiLPam8fPu", 1)'
```

The admin token will also be printed to the console by the [dbinit tool](https://github.com/B-S-F/qg-dbinit) and can be used to add users, tokens and namespaces.

### Postgres
### Prepare Postgres Database

To use postgres locally, the following prerequisites need to be fulfilled:

@@ -52,9 +38,6 @@ To use postgres locally, the following prerequisites need to be fulfilled:
- DB_PASSWORD to the password you defined for the user in Postgres
- DB_NAME if you differ from 'yaku'

After these configurations have been done, start the service. Run the tooling from https://github.com/B-S-F/qg-dbinit as mentioned in [Postgres](#postgres) and insert the generated SQL statements into your database.
The token created by the mentioned tooling can be used to add users, tokens and namespaces to get started.

### Create users, tokens and namespaces

Have a look at the [scripts](../scripts/create-users-ns.sh) to understand how to create users, tokens and namespaces. You can use the admin token that was created in the [Prepare Database](#prepare-database) step to make the requests.