Wait for cluster conditions on PUT (#28)
New Feature!  Added support for waiting for cluster conditions after a
`put` step.  You can now (optionally) configure a `put` to wait for the
cluster to satisfy specific conditions before the step will succeed.

To enable this, simply specify an `await.timeout` param to your `put`
step:
```yaml
  params:
    await:
      timeout: 30
```

The `timeout` is measured in seconds, and must be greater than `0` to
enable waiting.

The conditions to wait for are specified in `await.conditions` as a
list:
```yaml
  params:
    await:
      timeout: 30
      conditions:
        - select(.kind == "Pod") | .status.containerStatuses[] | .ready
```
You can include any number of conditions.  For the wait to succeed,
each expression must produce at least _one_ `true` result and no
`false` results.  Any result other than `true` or `false` is ignored.

If no `conditions` are specified, sensible defaults are used for the
resource types being retrieved.

If the `timeout` is exceeded before all conditions are met, the `put`
will fail.

See the updated `README` for further details.
Closes #19
jgriff authored Sep 28, 2021
1 parent 4a731c0 commit e2f67cf
Showing 12 changed files with 682 additions and 71 deletions.
152 changes: 152 additions & 0 deletions README.adoc
@@ -204,6 +204,59 @@ the command based on the `source` configuration.

* `namespace`: _Optional._ Overrides the source configuration's value for this particular `put` step.

* `await`: _Optional._ Configures the `put` step to poll the cluster (after running the `kubectl` command) for resources and await certain conditions before succeeding. Has the following configuration:
** `timeout`: _Required_. Must be a positive integer to enable waiting (anything else disables waiting). Measured in seconds.
** `interval`: _Optional_. Polling interval, measured in seconds (defaults to `3`).
** `resource_types`: _Optional_. Overrides the source config `resource_types` for what to retrieve from the cluster and run through the `conditions`.
** `conditions`: _Optional_. List of zero or more `jq` expressions to evaluate. If none are given, default expressions are inferred based on the `resource_types` being retrieved.
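
As a minimal sketch, the fragment below sets every `await` option described above in one `put` step. The specific values (a 60 second timeout, a 5 second interval) are illustrative assumptions, not recommendations.

```yaml
- put: k8s
  params:
    kubectl: apply -f my-k8s-repo/deploy.yaml
    await:
      timeout: 60                  # seconds; must be a positive integer to enable waiting
      interval: 5                  # seconds between polls (defaults to 3 when omitted)
      resource_types: deployment   # overrides the source config for this wait only
      conditions:
        - select(.spec.replicas > 0) | .status.readyReplicas > 0
```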

==== Wait Conditions

Wait conditions are expressed as `jq` expressions listed under `await.conditions` in the `put` step `params` (similar to the `filter.jq` list in `source` configuration of the resource).
Each condition is evaluated against the root JSON object of every retrieved resource.

[source,yaml]
----
- put: k8s
  params:
    kubectl: create deployment my-nginx --image=nginx
    await:
      timeout: 30 # seconds
      resource_types: deployment
      conditions:
        - select(.spec.replicas > 0) | .status.readyReplicas > 0
----

* Can list zero or more conditions (see defaults below for when none are given).
* Each expression _must_ evaluate to a boolean result (`true` or `false`); all other results are ignored.
* All conditions must produce at least one `true` result, and no `false` results.
* If the `timeout` is reached before the conditions are satisfied, `put` will fail.

IMPORTANT: Be sure to craft your expressions to safely filter out or ignore any resources you don't care about, taking note of the `resource_types` you are querying for. _Any `false` result in any condition will prevent the wait from succeeding._
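
The pass rule for a single condition can be sketched in a few lines of bash. This is a simplified illustration, not the resource's own code: `condition_passes` is a hypothetical helper that receives the results one condition produced and applies the "at least one `true`, no `false`" rule.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the resource): decides whether one
# condition passes, given the results it produced across all resources.
condition_passes() {
  local true_count=0 false_count=0 result
  for result in "$@"; do
    case "$result" in
      true)  true_count=$((true_count + 1)) ;;
      false) false_count=$((false_count + 1)) ;;
      *)     ;;  # non-boolean results are ignored
    esac
  done
  # pass only with at least one 'true' and zero 'false'
  [ "$true_count" -gt 0 ] && [ "$false_count" -eq 0 ]
}

condition_passes true true  && echo "pass" || echo "wait"   # every matched resource is ready
condition_passes true false && echo "pass" || echo "wait"   # one resource is not ready yet
condition_passes            && echo "pass" || echo "wait"   # nothing matched at all
condition_passes true null  && echo "pass" || echo "wait"   # 'null' is ignored
```

The third call shows why an expression that filters out every resource never satisfies the wait: with no `true` result, the condition keeps failing until the timeout.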

===== Default Wait Conditions

If no `conditions` are given, wait will attempt to infer sensible default conditions based on the `resource_types`.
The table below lists the conditions that are used by default.

|===
|`resource_types` |Default Condition

| `pod`, `pods`, `po`
| `select(.kind == "Pod") \| .status.containerStatuses[] \| .ready`

| `deployment`, `deployments`, `deploy`
| `select(.kind == "Deployment") \| select(.spec.replicas > 0) \| .spec.replicas == .status.readyReplicas`

| `replicaset`, `replicasets`, `rs`
| `select(.kind == "ReplicaSet") \| select(.spec.replicas > 0) \| .spec.replicas == .status.readyReplicas`

| `statefulset`, `statefulsets`, `sts`
| `select(.kind == "StatefulSet") \| select(.spec.replicas > 0) \| .spec.replicas == .status.readyReplicas`

|===

NOTE: The default source config `resource_types` is `pod`.
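
To see how a default condition behaves, the sketch below runs the `deployment` default from the table against a handful of made-up resources. It requires `jq` on the `PATH`. The sample JSON and the `eval_condition` helper are assumptions for illustration; the helper wraps the condition in the same way the `await` script does, iterating the array and keeping only boolean results.

```bash
#!/usr/bin/env bash
# Requires jq. The sample resources below are invented for illustration.

# The default condition for deployments, copied from the table above.
deployment_condition='select(.kind == "Deployment") | select(.spec.replicas > 0) | .spec.replicas == .status.readyReplicas'

sample_resources='[
  {"kind":"Deployment","metadata":{"name":"web"}, "spec":{"replicas":3},"status":{"readyReplicas":3}},
  {"kind":"Deployment","metadata":{"name":"api"}, "spec":{"replicas":2},"status":{"readyReplicas":1}},
  {"kind":"Deployment","metadata":{"name":"idle"},"spec":{"replicas":0},"status":{}},
  {"kind":"Pod","metadata":{"name":"web-abc"},"status":{"containerStatuses":[{"ready":true}]}}
]'

# Hypothetical helper: evaluates one condition against every resource in
# the array and prints only the boolean results, one per line.
eval_condition() {
  local condition=$1 resources=$2
  jq -r ".[] | $condition | select(. == true or . == false)" <<< "$resources"
}
```

Calling `eval_condition "$deployment_condition" "$sample_resources"` prints `true` for `web` and `false` for `api`. The `idle` deployment (zero replicas) and the Pod are dropped by the `select` filters and contribute nothing. Because one result is `false`, the wait would keep polling.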

== Examples

@@ -292,7 +345,106 @@ jobs:
namespace: prod
----

=== `put` with `await`

Here's the same example as above, with the added `await` behavior where `put` will wait up to 2 minutes for the deployment to come up.
If the deployment isn't ready after 2 minutes, `put` will fail.

[source,yaml]
----
resource_types:
- name: k8s-resource
  type: docker-image
  source:
    repository: jgriff/k8s-resource
resources:
- name: k8s
  type: k8s-resource
  icon: kubernetes
  source:
    url: ((k8s-server))
    token: ((k8s-token))
    certificate_authority: ((k8s-ca))
jobs:
- name: deploy-prod
  plan:
  - get: my-k8s-repo
    trigger: true
  - put: k8s
    params:
      kubectl: apply -f my-k8s-repo/deploy.yaml
      namespace: prod
      await:
        timeout: 120
        resource_types: deployment
----
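
The timeout behavior in this example amounts to polling until a deadline. The sketch below shows that pattern in isolation. It is a simplified illustration rather than the resource's `awaitLoop`: `poll_until` and `check` are hypothetical names, and the real script exits with code `111` on timeout instead of returning `1`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a poll-until-deadline loop; 'poll_until' and
# 'check' are illustrative names, not functions from the resource.
poll_until() {
  local timeout=$1 interval=$2
  shift 2                                   # remaining args: the command to poll
  local deadline=$(( $(date +%s) + timeout ))
  local attempts=1

  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after $attempts attempt(s)" >&2
      return 1
    fi
    sleep "$interval"
    attempts=$((attempts + 1))
  done
  echo "succeeded after $attempts attempt(s)"
}

# Stand-in check that succeeds on its third call.
calls=0
check() { calls=$((calls + 1)); [ "$calls" -ge 3 ]; }

poll_until 10 1 check
```

With `timeout: 120` and the default `interval` of `3`, the equivalent call would be `poll_until 120 3 <check command>`, giving roughly 40 attempts before the step fails.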

=== `put` with `await` leveraging check filters

Since `await` uses `check` to retrieve the resources, all the `source.filter` options are available to you when querying for resources to check against your conditions.

For example:

[source,yaml]
----
resources:
- name: k8s
  type: k8s-resource
  icon: kubernetes
  source:
    url: ((k8s-server))
    token: ((k8s-token))
    certificate_authority: ((k8s-ca))
    namespace: prod
    resource_types: deployment
    filter:
      selector: app=my-app,app.kubernetes.io/component in (frontend, backend)
      name: "my-*"
jobs:
- name: deploy-prod
  plan:
  - get: my-k8s-repo
    trigger: true
  - put: k8s
    params:
      kubectl: apply -f my-k8s-repo/deploy.yaml
      await:
        timeout: 120
----

This will:

. Apply our deployment from `deploy.yaml`.
. Then wait at most 2 minutes for all matching deployments to reach a ready state (the default condition for `deployment` resource types). A deployment matches if it:
* has a name that starts with `"my-"`.
* has the metadata label `"app"` set to `"my-app"`.
* has a metadata label `"app.kubernetes.io/component"` of either `"frontend"` or `"backend"`.


=== `put` with `await` custom conditions

You can supply your own custom conditions for `await` to wait on.

[source,yaml]
----
jobs:
- name: deploy-prod
  plan:
  - get: my-k8s-repo
    trigger: true
  - put: k8s
    params:
      kubectl: apply -f my-k8s-repo/deploy.yaml
      namespace: prod
      await:
        timeout: 120
        resource_types: deployment,statefulset
        conditions:
          - select(.metadata.name == "my-deployment") | .status.readyReplicas > 0
          - select(.metadata.name == "my-statefulset") | .status.readyReplicas > 0
----

=== `get` and `put` Resources

160 changes: 160 additions & 0 deletions assets/await
@@ -0,0 +1,160 @@
#!/bin/bash

# -------------------------------------------------------------------------------------
# await functions - expects 'common' to be already sourced

await() {
  # await is only enabled if the timeout is configured
  local await_enabled=$(jq -r '.params.await.timeout > 0' < $payload)

  if isTrue await_enabled; then
    local await_timeout=$(jq -r '.params.await.timeout' < $payload)
    local await_interval=$(jq -r '.params.await.interval | select(.!=null)' < $payload)

    # potentially override 'resource_types'
    local override_resource_types=$(jq -r '.params.await.resource_types | select(.!=null)' < $payload)
    if isSet override_resource_types; then
      source_resource_types=$override_resource_types
    fi

    log -p "\n--> Waiting up to ${yellow}${await_timeout}${reset} seconds for the following condition(s) to be true for resource type(s): ${yellow}$source_resource_types${reset}$(jq --arg nl "\n" --arg color $cyan --arg reset $reset -r '.[] | $nl + "- " + $color + . + $reset' <<< "$(getAwaitConditions)")${reset}"

    awaitLoop $await_timeout $await_interval
  fi
}

awaitLoop() {
  local await_timeout=$1
  local await_interval=${2:-3}

  local await_started_at=$(date +%s)
  local await_timeout_at=$((await_started_at + await_timeout))

  local tmp_await_status=$(mktemp)
  echo "False" > $tmp_await_status

  await_attempts=1
  until checkAwaitConditions $await_attempts && echo "True" > $tmp_await_status || [ $(date +%s) -gt $await_timeout_at ]
  do
    local await_time_left=$(($await_timeout_at - $(date +%s)))
    local await_time_left_COLOR
    if [[ $await_time_left -gt 120 ]]; then
      await_time_left_COLOR=${green}
    elif [[ $await_time_left -gt 30 ]]; then
      await_time_left_COLOR=${yellow}
    else
      await_time_left_COLOR=${red}
    fi

    log -p "[$await_attempts]⏳Awaiting cluster conditions, ${await_time_left_COLOR}$(( $await_time_left / 60))m$(( $await_time_left % 60))s${reset} left before giving up..."
    sleep $await_interval
    (( await_attempts++ ))
  done

  local await_elapsed=$(( $(date +%s) - await_started_at))
  local await_duration_summary="${yellow}$await_attempts${reset} attempt(s) taking ${yellow}$(( $await_elapsed / 60))m$(( $await_elapsed % 60))s${reset}"
  if [[ $(cat $tmp_await_status) == "True" ]]
  then
    log -p "\n${green}Success!${reset} All conditions satisfied after $await_duration_summary!"
  else
    log -p "\n${red}Timeout exceeded.${reset} Failed to meet condition(s) after $await_duration_summary."
    exit 111
  fi
}

checkAwaitConditions() {
  # query for current cluster status
  local attempt=${1:-1}
  if [[ $attempt = 1 ]]; then
    # execute the first check without redirecting output to /dev/null so user can see the query filters being used
    queryForVersions
  else
    queryForVersions &> /dev/null
  fi

  # check each condition individually
  local tmp_conditions_results=$(mktemp)
  jq -r '.[]' <<< "$(getAwaitConditions | tr '\r\n' ' ')" | while read -r condition; do
    # evaluate this condition, collecting an array of the results
    local results=($(jq -r ".[] | $condition | select(. == true or . == false)" <<< "$new_versions" | jq -s '.' | jq -r 'map(tostring) | join(" ")'))

    # count up number of true/false matches
    local trueCount=0
    local falseCount=0
    for result in "${results[@]}"; do
      if isTrue result; then
        ((trueCount++))
      else
        ((falseCount++))
      fi
    done

    # color code the printout of the counts
    local trueCountColor=${red}
    local falseCountColor=${green}
    if [ $trueCount -gt 0 ]; then
      trueCountColor=$green
    fi
    if [ $falseCount -gt 0 ]; then
      falseCountColor=$red
    fi

    # assert the results from this condition have:
    # - at least one 'true' value
    # - no 'false' values
    # ✘ [true: 0, false: 5]: <condition>
    # ✘ [true: 2, false: 3]: <condition>
    # ✔ [true: 5, false: 0]: <condition>
    resultsStatement="[true: ${trueCountColor}$trueCount${reset}, false: ${falseCountColor}$falseCount${reset}]: ${cyan}$condition${reset}"
    if [ $trueCount -gt 0 ] && [ $falseCount -eq 0 ]; then
      log -p "${green}✔${reset} ${resultsStatement}"
    else
      log -p "${red}✘${reset} ${resultsStatement}"
      echo "FAILED" > $tmp_conditions_results
    fi
  done

  # now, assert no condition failed
  if cat $tmp_conditions_results | grep -q 'FAILED'; then
    return 1
  fi
}

getAwaitConditions() {
  local conditions=$(jq -r '.params.await.conditions | select(.!=null)' < $payload)

  # if no user conditions given, use defaults
  if notSet conditions; then
    IFS=',' read -ra active_res_types <<< "$source_resource_types"
    local default_conditions=()

    if containsElement "po" "${active_res_types[@]}" || \
       containsElement "pod" "${active_res_types[@]}" || \
       containsElement "pods" "${active_res_types[@]}"; then
      default_conditions+=('select(.kind == "Pod") | .status.containerStatuses[] | .ready')
    fi

    if containsElement "deployment" "${active_res_types[@]}" || \
       containsElement "deployments" "${active_res_types[@]}" || \
       containsElement "deploy" "${active_res_types[@]}"; then
      default_conditions+=('select(.kind == "Deployment") | select(.spec.replicas > 0) | .spec.replicas == .status.readyReplicas')
    fi

    if containsElement "replicaset" "${active_res_types[@]}" || \
       containsElement "replicasets" "${active_res_types[@]}" || \
       containsElement "rs" "${active_res_types[@]}"; then
      default_conditions+=('select(.kind == "ReplicaSet") | select(.spec.replicas > 0) | .spec.replicas == .status.readyReplicas')
    fi

    if containsElement "statefulset" "${active_res_types[@]}" || \
       containsElement "statefulsets" "${active_res_types[@]}" || \
       containsElement "sts" "${active_res_types[@]}"; then
      default_conditions+=('select(.kind == "StatefulSet") | select(.spec.replicas > 0) | .spec.replicas == .status.readyReplicas')
    fi

    # collect all of the enabled default conditions
    conditions=$(IFS=$'\n'; echo "${default_conditions[*]}" | jq -R . | jq -s '.')
  fi

  echo "$conditions"
}
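
`getAwaitConditions` relies on a `containsElement` helper from `common`, which is not part of this diff. The sketch below is a hypothetical stand-in, assuming an exact string comparison, to show how a comma-separated `resource_types` value selects which defaults apply.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the 'containsElement' helper from 'common'
# (the real implementation is not shown in this commit).
containsElement() {
  local needle=$1 element
  shift
  for element in "$@"; do
    [ "$element" = "$needle" ] && return 0
  done
  return 1
}

# Split the comma-separated list the same way getAwaitConditions does.
resource_types="deployment,statefulset"
IFS=',' read -ra active_res_types <<< "$resource_types"

containsElement "deployment" "${active_res_types[@]}" && echo "deployment default applies"
containsElement "pod" "${active_res_types[@]}" || echo "pod default skipped"
```

Under this assumption, a value such as `deployment, statefulset` (with a space after the comma) would produce the element ` statefulset`, which would not match any alias, so writing the list without spaces is the safer choice.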