release: add kube-prometheus-stack migration script
anders-elastisys committed Jan 28, 2025
1 parent 6633c5d commit e868dec
Showing 6 changed files with 388 additions and 0 deletions.
170 changes: 170 additions & 0 deletions migration/v0.44/README.md
@@ -0,0 +1,170 @@
# Upgrade to v0.44.x

> [!WARNING]
> Upgrade only supported from v0.43.x.
<!--
Notice to developers on writing migration steps:
- Migration steps:
  - are written per minor version and placed in a subdirectory of the migration directory with the name `vX.Y/`,
  - are written to be idempotent and usable no matter which patch version you are upgrading from and to,
  - are documented in this document so that they can be run manually,
  - are divided into prepare and apply steps:
    - Prepare steps:
      - are placed in the `prepare/` directory,
      - may **only** modify the configuration of the environment,
      - may **not** modify the state of the environment,
      - are run in order of their names, using two-digit prefixes.
    - Apply steps:
      - are placed in the `apply/` directory,
      - may **only** modify the state of the environment,
      - may **not** modify the configuration of the environment,
      - are run in order of their names, using two-digit prefixes,
      - are run with the argument `execute` on upgrade and should return 1 on failure and 2 on successful internal rollback,
      - are rerun with the argument `rollback` on execute failure and should return 1 on failure.
For prepare, the init step is given.
For apply, the bootstrap and the apply steps are given; releases upgraded in custom steps are expected to be excluded from the apply step.
Upgrades of components that depend on each other should be done within the same snippet, so that the upgrade can be brought to a working state and rolled back to a working state.
Steps should use `scripts/migration/lib.sh`, which provides helper functions; see the file for the available helpers.
This script expects the `ROOT` environment variable to be set, pointing to the root of the repository.
As with all scripts in this repository, `CK8S_CONFIG_PATH` is expected to be set.
-->

## Prerequisites

- [ ] Read through the changelog to check if there are any changes you need to be aware of. Read through the release notes, Platform Administrator notices, Application Developer notices, and Security notice.
- [ ] Notify the users (if any) before the upgrade starts;
- [ ] Check if there are any pending changes to the environment;
- [ ] Check the state of the environment, pods, nodes and backup jobs:

```bash
./bin/ck8s test sc|wc
./bin/ck8s ops kubectl sc|wc get pods -A -o custom-columns=NAMESPACE:metadata.namespace,POD:metadata.name,READY-false:status.containerStatuses[*].ready,REASON:status.containerStatuses[*].state.terminated.reason | grep false | grep -v Completed
./bin/ck8s ops kubectl sc|wc get nodes
./bin/ck8s ops kubectl sc|wc get jobs -A
./bin/ck8s ops helm sc|wc list -A --all
velero get backup
```

- [ ] Silence the notifications for the alerts, e.g. with [Alertmanager silences](https://prometheus.io/docs/alerting/latest/alertmanager/#silences);
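
The pod check in the list above pipes the `kubectl` custom-columns output through two `grep` filters. A minimal sketch of that filter on its own, assuming the column layout produced by the command above; the function name `filter_unready` is hypothetical and not part of `ck8s`:

```bash
# Keep rows that report a not-ready container ("false"), then drop rows
# whose termination reason is "Completed" (for example finished Job pods).
filter_unready() {
  grep false | grep -v Completed
}
```

It can be appended to the pod listing, e.g. `./bin/ck8s ops kubectl sc get pods -A -o custom-columns=... | filter_unready`. An empty result means that no pod matched the filter.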

## Automatic method

1. Pull the latest changes and switch to the correct branch:

```bash
git pull
git switch -d v0.44.x
```

1. Prepare upgrade - _non-disruptive_

> _Done before maintenance window._

```bash
./bin/ck8s upgrade both v0.44 prepare
# check if the netpol IPs need to be updated
./bin/ck8s update-ips both dry-run
# if you agree with the changes apply
./bin/ck8s update-ips both apply
```

> **Note:**
> It is possible to upgrade `wc` and `sc` clusters separately by replacing `both` when running the `upgrade` command, e.g. the following will only upgrade the workload cluster:

```bash
./bin/ck8s upgrade wc v0.44 prepare
./bin/ck8s upgrade wc v0.44 apply
```

1. Apply upgrade - _disruptive_

> _Done during maintenance window._

```bash
./bin/ck8s upgrade both v0.44 apply
```
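
The non-disruptive commands of the automatic method can be grouped per cluster selection. The following is only a sketch under assumptions: the `prepare_cluster` function and the `CK8S_BIN` override are hypothetical and not part of `ck8s`; only the `upgrade <cluster> v0.44 prepare` and `update-ips <cluster> dry-run` invocations come from the steps above.

```bash
# Hypothetical helper: run the non-disruptive steps for one cluster selection.
# CK8S_BIN is an illustrative override, not a ck8s setting; it defaults to ./bin/ck8s.
prepare_cluster() {
  local cluster="${1:-}"
  local ck8s="${CK8S_BIN:-./bin/ck8s}"

  if [[ -z "${cluster}" ]]; then
    echo "usage: prepare_cluster <sc|wc|both>" >&2
    return 2
  fi

  # Stop here if prepare fails, so the IP check never runs on a failed prepare.
  "${ck8s}" upgrade "${cluster}" v0.44 prepare || return 1
  # Only report pending network policy IP changes; applying them stays a manual decision.
  "${ck8s}" update-ips "${cluster}" dry-run
}
```

Calling `prepare_cluster wc` would run both commands against the workload cluster only. The disruptive `apply` phase is deliberately left out so that it is still started by hand during the maintenance window.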

## Manual method

### Prepare upgrade - _non-disruptive_

> _Done before maintenance window._

1. Pull the latest changes and switch to the correct branch:

```bash
git pull
git switch -d v0.44.x
```

1. Set whether the upgrade should be prepared for `both` clusters or for only one of `sc` or `wc`:

```bash
export CK8S_CLUSTER=<wc|sc|both>
```

1. Update apps configuration:

This will take a backup into `backups/` before modifying any files.

```bash
./bin/ck8s init ${CK8S_CLUSTER}
# or
./migration/v0.44/prepare/50-init.sh
# check if the netpol IPs need to be updated
./bin/ck8s update-ips ${CK8S_CLUSTER} dry-run
# if you agree with the changes apply
./bin/ck8s update-ips ${CK8S_CLUSTER} apply
```
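
The manual steps rely on `CK8S_CLUSTER` and, like all scripts in this repository, on `CK8S_CONFIG_PATH`. A small pre-flight check can catch a missing variable or an uncopied `<wc|sc|both>` placeholder before any script runs. This is a sketch; the function name `check_upgrade_env` is hypothetical and not part of the repository:

```bash
# Hypothetical pre-flight check for the manual steps.
check_upgrade_env() {
  if [[ -z "${CK8S_CONFIG_PATH:-}" ]]; then
    echo "CK8S_CONFIG_PATH is not set" >&2
    return 1
  fi

  # Accept the same values that prepare/50-init.sh accepts.
  case "${CK8S_CLUSTER:-}" in
  both | sc | wc) ;;
  *)
    echo "CK8S_CLUSTER must be one of: sc, wc, both" >&2
    return 1
    ;;
  esac
}
```

Running `check_upgrade_env && ./migration/v0.44/prepare/50-init.sh` would then only start the init step when both variables look sane.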

### Apply upgrade - _disruptive_

> _Done during maintenance window._

1. Set whether the upgrade should be applied for `both` clusters or for only one of `sc` or `wc`:

```bash
export CK8S_CLUSTER=<wc|sc|both>
```

1. Upgrade kube-prometheus-stack:

```bash
./migration/v0.44/apply/10-kube-prometheus-stack.sh execute
```

1. Upgrade applications:

```bash
./bin/ck8s apply {sc|wc}
# or
./migration/v0.44/apply/80-apply.sh execute
```
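
Because kube-prometheus-stack is upgraded in its own step, `80-apply.sh` excludes it from the general apply with the filter `app!=prometheus`. The sketch below isolates how that script turns its filter arrays into a single selector string; the function name `build_selector` is hypothetical, while the join logic and the `app!=null` fallback mirror the script in this commit:

```bash
# Join exclusion filters into one selector string.
# With no filters, fall back to "app!=null", as 80-apply.sh does.
build_selector() {
  local -a filters=("$@")
  local fallback='app!=null'
  # "${filters[*]}" joins the elements with spaces (assuming the default IFS).
  local selector="${filters[*]:-${fallback}}"
  # Filters are passed on as one comma-separated selector, so swap spaces for commas.
  echo "${selector// /,}"
}
```

For example, `build_selector 'app!=prometheus'` prints `app!=prometheus`, and `build_selector` without arguments prints `app!=null`. Joining with commas relies on no filter containing a space itself.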

## Postrequisite

- [ ] Check the state of the environment, pods and nodes:

```bash
./bin/ck8s test sc|wc
./bin/ck8s ops kubectl sc|wc get pods -A -o custom-columns=NAMESPACE:metadata.namespace,POD:metadata.name,READY-false:status.containerStatuses[*].ready,REASON:status.containerStatuses[*].state.terminated.reason | grep false | grep -v Completed
./bin/ck8s ops kubectl sc|wc get nodes
./bin/ck8s ops helm sc|wc list -A --all
```

- [ ] Enable the notifications for the alerts;
- [ ] Notify the users (if any) when the upgrade is complete;

> [!NOTE]
> Additionally it is good to check:
>
> - if any alerts generated by the upgrade didn't close;
> - if you can log in to Grafana, OpenSearch or Harbor;
> - if you can see fresh metrics and logs.
75 changes: 75 additions & 0 deletions migration/v0.44/apply/00-template.sh
@@ -0,0 +1,75 @@
#!/usr/bin/env bash

ROOT="$(readlink -f "$(dirname "${0}")/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

# functions currently available in the library:
#   - logging:
#     - log_info(_no_newline) <message>
#     - log_warn(_no_newline) <message>
#     - log_error(_no_newline) <message>
#     - log_fatal <message> # this will call "exit 1"
#
#   - kubectl
#     # Use kubectl with kubeconfig set
#     - kubectl_do <sc|wc> <kubectl args...>
#     # Perform kubectl delete, will not cause errors if the resource is missing
#     - kubectl_delete <sc|wc> <resource> <namespace> <name>
#
#   - helm
#     # Use helm with kubeconfig set
#     - helm_do <sc|wc> <helm args...>
#     # Checks if a release is installed
#     - helm_installed <sc|wc> <namespace> <release>
#     # Uninstalls a release if it is installed
#     - helm_uninstall <sc|wc> <namespace> <release>
#
#   - helmfile
#     # Use helmfile with kubeconfig set
#     - helmfile_do <sc|wc> <helmfile args...>
#     # All selector args will be prefixed with "-l"
#     # List releases matching the selector
#     - helmfile_list <sc|wc> <selectors...>
#     # Apply releases matching the selector
#     - helmfile_apply <sc|wc> <selectors...>
#     # Check for changes on releases matching the selector
#     - helmfile_change <sc|wc> <selectors...>
#     # Destroy releases matching the selector
#     - helmfile_destroy <sc|wc> <selectors...>
#     # Replaces the releases matching the selector, performing destroy and apply on each release individually
#     - helmfile_replace <sc|wc> <selectors...>
#     # Upgrades the releases matching the selector, performing automatic rollback on failure; set "CK8S_ROLLBACK=false" to disable
#     - helmfile_upgrade <sc|wc> <selectors...>

run() {
  case "${1:-}" in
  execute)
    # Note: 00-template.sh will be skipped by the upgrade command
    log_info "no operation: this is a template"

    if [[ "${CK8S_CLUSTER}" =~ ^(sc|both)$ ]]; then
      log_info "operation on service cluster"
    fi
    if [[ "${CK8S_CLUSTER}" =~ ^(wc|both)$ ]]; then
      log_info "operation on workload cluster"
    fi
    ;;
  rollback)
    log_warn "rollback not implemented"

    # if [[ "${CK8S_CLUSTER}" =~ ^(sc|both)$ ]]; then
    #   log_info "rollback operation on service cluster"
    # fi
    # if [[ "${CK8S_CLUSTER}" =~ ^(wc|both)$ ]]; then
    #   log_info "rollback operation on workload cluster"
    # fi
    ;;
  *)
    log_fatal "usage: \"${0}\" <execute|rollback>"
    ;;
  esac
}

run "${@}"
39 changes: 39 additions & 0 deletions migration/v0.44/apply/10-kube-prometheus-stack.sh
@@ -0,0 +1,39 @@
#!/usr/bin/env bash

ROOT="$(readlink -f "$(dirname "${0}")/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

run() {
  case "${1:-}" in
  execute)
    # Chart version shipped in this repository
    chart_version=$(yq4 '.version' "${ROOT}/helmfile.d/upstream/prometheus-community/kube-prometheus-stack/Chart.yaml")
    clusters=("${CK8S_CLUSTER}")
    if [[ "${CK8S_CLUSTER}" == "both" ]]; then
      clusters=("wc" "sc")
    fi

    for cluster in "${clusters[@]}"; do
      # Chart version of the currently installed release
      current_version=$(helm_do "${cluster}" get metadata -n monitoring kube-prometheus-stack -ojson | jq -r '.version')

      log_info "  - Checking if kube-prometheus-stack CRDs need to be upgraded"
      if [[ "${current_version}" != "${chart_version}" ]]; then
        log_info "  - Replace kube-prometheus-stack CRDs on ${cluster}"
        kubectl_do "${cluster}" apply --server-side --force-conflicts -f "${ROOT}/helmfile.d/upstream/prometheus-community/kube-prometheus-stack/charts/crds/crds"
      fi

      log_info "  - Upgrade kube-prometheus-stack on ${cluster}"
      helmfile_upgrade "${cluster}" app=prometheus
    done

    ;;
  rollback)
    log_warn "rollback not implemented"
    ;;
  *)
    log_fatal "usage: \"${0}\" <execute|rollback>"
    ;;
  esac
}

run "${@}"
57 changes: 57 additions & 0 deletions migration/v0.44/apply/80-apply.sh
@@ -0,0 +1,57 @@
#!/usr/bin/env bash

ROOT="$(readlink -f "$(dirname "${0}")/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

# Add selector filters if covered by other snippets.
# Example: "app!=something"
declare -a skipped
skipped=(
"app!=prometheus"
)
declare -a skipped_sc
skipped_sc=(
)
declare -a skipped_wc
skipped_wc=(
)

run() {
  case "${1:-}" in
  execute)
    local -a filters
    local selector

    if [[ "${CK8S_CLUSTER}" =~ ^(sc|both)$ ]]; then
      filters=("${skipped[@]}" "${skipped_sc[@]}")
      selector="${filters[*]:-"app!=null"}"
      helmfile_upgrade sc "${selector// /,}"
    fi

    if [[ "${CK8S_CLUSTER}" =~ ^(wc|both)$ ]]; then
      filters=("${skipped[@]}" "${skipped_wc[@]}")
      selector="${filters[*]:-"app!=null"}"
      helmfile_upgrade wc "${selector// /,}"
    fi
    ;;

  rollback)
    log_warn "rollback not implemented"

    # if [[ "${CK8S_CLUSTER}" =~ ^(sc|both)$ ]]; then
    #   log_info "rollback operation on service cluster"
    # fi
    # if [[ "${CK8S_CLUSTER}" =~ ^(wc|both)$ ]]; then
    #   log_info "rollback operation on workload cluster"
    # fi
    ;;

  *)
    log_fatal "usage: \"${0}\" <execute|rollback>"
    ;;
  esac
}

run "${@}"
31 changes: 31 additions & 0 deletions migration/v0.44/prepare/00-template.sh
@@ -0,0 +1,31 @@
#!/usr/bin/env bash

HERE="$(dirname "$(readlink -f "${0}")")"
ROOT="$(readlink -f "${HERE}/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

# functions currently available in the library:
#   - logging:
#     - log_info(_no_newline) <message>
#     - log_warn(_no_newline) <message>
#     - log_error(_no_newline) <message>
#     - log_fatal <message> # this will call "exit 1"
#
#   - yq:
#     - yq_null <common|sc|wc> <target>
#     - yq_copy <common|sc|wc> <source> <destination>
#     - yq_move <common|sc|wc> <source> <destination>
#     - yq_remove <common|sc|wc> <target>
#     - yq_add <common|sc|wc> <destination> <value>

# Note: 00-template.sh will be skipped by the upgrade command
log_info "no operation: this is a template"

if [[ "${CK8S_CLUSTER}" =~ ^(sc|both)$ ]]; then
  log_info "operation on service cluster"
fi
if [[ "${CK8S_CLUSTER}" =~ ^(wc|both)$ ]]; then
  log_info "operation on workload cluster"
fi
16 changes: 16 additions & 0 deletions migration/v0.44/prepare/50-init.sh
@@ -0,0 +1,16 @@
#!/usr/bin/env bash

HERE="$(dirname "$(readlink -f "${0}")")"
ROOT="$(readlink -f "${HERE}/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

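# The cluster selection is read from the CK8S_CLUSTER environment variable, not from an argument
case "${CK8S_CLUSTER}" in
both | sc | wc)
  "${ROOT}/bin/ck8s" init "${CK8S_CLUSTER}"
  ;;
*)
  log_fatal "usage: CK8S_CLUSTER=<wc|sc|both> 50-init.sh"
  ;;
esac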
