across step

This proposal introduces a mechanism by which a build plan can be executed multiple times with different ((vars)), in service of build matrixes, pipeline matrixes, and - building on ((var)) sources - dynamic variations of both.

Motivation

  • Support dynamic multi-branch workflows: concourse/concourse#1172
  • Support build matrixes and pipeline matrixes.
  • Replace a common, and very flawed, usage pattern of version: every, which has proven to be very complicated to support.

Proposal

The across step modifier is given a list containing vars and associated values to set when executing the step. The step will be executed across all combinations of var values.

Static var values

The set of values can be static, as below:

task: unit
vars: {go_version: ((.:go_version))}
across:
- var: go_version
  values: [1.12, 1.13]

This will run the unit task twice: once with go_version set to 1.12, and again with 1.13.

Dynamic var values from a var source

Rather than static values, var values may be pulled from a var source dynamically:

var_sources:
- name: booklit-prs
  type: github-prs
  config:
    repository: vito/booklit
    access_token: # ...

plan:
- set_pipeline: pr
  instance_vars: {pr_number: ((.:pr.number))}
  across:
  - var: pr
    source: booklit-prs

The above example will run the set_pipeline step across the set of all GitHub PRs, returned through a list operation on the booklit-prs var source.

Running across a matrix of var values

Multiple vars may be listed to form a matrix:

set_pipeline: pr-go
instance_vars:
  pr_number: ((.:pr.number))
  go_version: ((.:go_version))
across:
- var: pr
  source: booklit-prs
- var: go_version
  values: [1.12, 1.13]

This will run 2 * (# of PRs) set_pipeline steps, setting two pipelines per PR: one for Go 1.12, and one for Go 1.13.
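
The expansion into combinations can be read as a Cartesian product over the listed vars. The Python below is only an illustrative sketch of that semantics, not the Concourse implementation; the `expand_matrix` helper and the sample PR numbers are hypothetical.

```python
from itertools import product

def expand_matrix(across):
    """Return one dict of var bindings per combination of values.

    `across` is a list of (var_name, values) pairs in the order they are
    listed on the step. Hypothetical helper, for illustration only.
    """
    names = [name for name, _ in across]
    value_lists = [values for _, values in across]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

# Hypothetical data: pretend the var source currently lists three PRs.
prs = [{"number": 1}, {"number": 2}, {"number": 3}]
combos = expand_matrix([("pr", prs), ("go_version", ["1.12", "1.13"])])
print(len(combos))  # 2 * (# of PRs) = 6
```

Each resulting dict corresponds to one execution of the inner step, here one `set_pipeline` per PR and Go version.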

Controlling parallelism with max_in_flight

By default, the steps are executed serially to prevent surprising load increases from a dynamic var source suddenly returning a ton of values.

To run steps in parallel, specify max_in_flight as either all or a number; the default is 1. Note: this value is specified on each var, rather than on the entire step.

With max_in_flight: all, no limit on parallelism will be enforced. This is typical when a small, static set of values is specified and it would be annoying to keep a number in sync with the size of the set:

task: unit
vars: {go_version: ((.:go_version))}
across:
- var: go_version
  values: [1.12, 1.13]
  max_in_flight: all

With max_in_flight: 3, a maximum of 3 var values will be executed in parallel. This would typically be set for values coming from a var source, which may change at any time, or for especially large static sets of values.

set_pipeline: pr
instance_vars: {pr_number: ((.:pr.number))}
across:
- var: pr
  source: booklit-prs
  max_in_flight: 3

When multiple max_in_flight values are configured, they are multiplicative, building on the concurrency of previously listed vars:

set_pipeline: pr
instance_vars:
  pr_number: ((.:pr.number))
  go_version: ((.:go_version))
across:
- var: pr
  source: booklit-prs
  max_in_flight: 3
- var: go_version
  values: [1.12, 1.13]
  max_in_flight: all

This will run 6 set_pipeline steps at a time, focusing on 3 PRs and setting Go 1.12 and Go 1.13 pipelines for each in parallel.

Note that setting a max_in_flight on a single var while leaving the rest as their default (1) effectively sets an overall max-in-flight.
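
As a rough sketch of the multiplicative rule, the overall number of steps in flight can be computed by multiplying the per-var limits. The `effective_in_flight` helper below is hypothetical, and capping a numeric limit at the number of available values is an assumption of this sketch rather than something the proposal states.

```python
def effective_in_flight(across):
    """Multiply per-var limits to get the overall number of parallel steps.

    `across` is a list of dicts with 'values' and an optional
    'max_in_flight' (default 1). 'all' means every value of that var
    may run at once. Hypothetical helper, for illustration only.
    """
    total = 1
    for var in across:
        limit = var.get("max_in_flight", 1)
        count = len(var["values"])
        # Assumption: a numeric limit cannot exceed the number of values.
        total *= count if limit == "all" else min(limit, count)
    return total

# Hypothetical data: ten open PRs, two Go versions.
prs = list(range(10))
both = effective_in_flight([
    {"values": prs, "max_in_flight": 3},
    {"values": ["1.12", "1.13"], "max_in_flight": "all"},
])
only_prs = effective_in_flight([
    {"values": prs, "max_in_flight": 3},
    {"values": ["1.12", "1.13"]},  # default of 1
])
print(both, only_prs)  # 6 3
```

The second call shows the case from the note above: with every other var left at its default of 1, the single configured value acts as the overall limit.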

Triggering on changes

With trigger: true configured on a var, the build plan will run on any change to the set of vars - i.e. when a var value is added, removed, or changes.

var_sources:
- name: booklit-prs
  type: github-prs
  config:
    repository: vito/booklit
    access_token: # ...

plan:
- set_pipeline: pr
  instance_vars: {pr_number: ((.:pr.number))}
  across:
  - var: pr
    source: booklit-prs
    trigger: true

Note that this can be applied to either static values: or dynamic vars from source: - both cases just boil down to a comparison against the previous build's set of values.
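
The comparison against the previous build's set of values might look like the following sketch. The `should_trigger` helper is hypothetical, and treating the values as an unordered set (so that reordering alone does not trigger) is an assumption of this example.

```python
import json

def should_trigger(previous_values, current_values):
    """Return True if a value was added, removed, or changed.

    Values are canonicalized to JSON so that structured values such as
    PR objects can be compared as set members. Hypothetical helper.
    """
    def canon(values):
        return {json.dumps(value, sort_keys=True) for value in values}
    return canon(previous_values) != canon(current_values)

# Hypothetical PR data from two consecutive checks of the var source.
before = [{"number": 1, "sha": "abc"}, {"number": 2, "sha": "def"}]
reordered = [{"number": 2, "sha": "def"}, {"number": 1, "sha": "abc"}]
added = before + [{"number": 3, "sha": "123"}]
changed = [{"number": 1, "sha": "abc"}, {"number": 2, "sha": "fff"}]

print(should_trigger(before, reordered))  # False
print(should_trigger(before, added))      # True
print(should_trigger(before, changed))    # True
```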

Modifier syntax precedence

The across step is a modifier, meaning it is attached to another step. Other examples of modifiers are timeout, attempts, ensure, and the on_* family of hooks.

In terms of precedence, across would bind more tightly than ensure and on_* hooks, but less tightly than attempts and timeout.

ensure and on_* bind to the across step so that they may be run after the full matrix completes.

attempts binds to the inner step because it doesn't seem to make a whole lot of sense to retry the entire matrix because one step failed. The individual steps should be retried instead.

timeout binds to the inner step because otherwise a max_in_flight could cause the timeout to be exceeded before some steps even get a chance to run.

task: unit
timeout: 1h # interrupt the task after 1 hour
attempts: 3 # attempt the task 3 times
across:
- var: go_version
  values: [1.12, 1.13]
on_failure: # do something after all steps complete and at least one failed
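
To make the binding order concrete, here is a small sketch in which attempts wraps each inner step while on_failure wraps the whole matrix. All names (`run_with_attempts`, `run_across_with_hook`, `unit`) are hypothetical, and timeout is not modeled.

```python
def run_with_attempts(step, value, attempts):
    """Retry a single inner step up to `attempts` times; stop on success."""
    return any(step(value) for _ in range(attempts))

def run_across_with_hook(values, step, attempts, on_failure):
    """Run every value (each with its own retries), then fire the hook once."""
    results = [run_with_attempts(step, value, attempts) for value in values]
    succeeded = all(results)
    if not succeeded:
        on_failure()  # runs after the full matrix completes
    return succeeded

calls = []
def unit(go_version):
    # Hypothetical task: always fails on Go 1.13.
    calls.append(go_version)
    return go_version != "1.13"

hook_runs = []
ok = run_across_with_hook(["1.12", "1.13"], unit, attempts=3,
                          on_failure=lambda: hook_runs.append("on_failure"))
print(ok, calls, hook_runs)
```

In this run the 1.12 step is attempted once, the 1.13 step three times, and the hook fires a single time at the end rather than once per failed attempt.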

To apply ensure and on_* hooks to the nested step, rather than the across step modifier, the do: step may be utilized:

do:
- task: unit
  on_failure: # runs after each individual step completes and fails
across:
- var: go_version
  values: [1.12, 1.13]
on_failure: # runs after all steps complete and at least one failed

This can be rewritten in a slightly more readable syntax by placing the do: below the across::

across:
- var: go_version
  values: [1.12, 1.13]
do:
- task: unit
  on_failure: # runs after each individual step completes and fails
on_failure: # runs after all steps complete and at least one failed

Failing fast

When a step within the matrix fails, the across step will continue to run the remaining steps in the matrix. However, the across step itself will still fail after all steps have completed.

With fail_fast: true applied to the across step, execution will halt on the first failure, and any currently running steps (via max_in_flight) will be interrupted:

task: unit
across:
- var: go_version
  values: [1.12, 1.13]
fail_fast: true

Note: this is the first time a step modifier has had additional sibling fields. In the event of a field conflict, the do: step may be utilized as a work-around.
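
The difference between the default behavior and fail_fast can be sketched for the serial case (max_in_flight of 1). The `run_across_serial` helper is hypothetical, and interrupting steps that are already running in parallel is not modeled here.

```python
def run_across_serial(values, step, fail_fast=False):
    """Run `step` once per value, in order.

    Returns (succeeded, values_run). Without fail_fast every value runs
    and the result is still a failure if any step failed; with fail_fast
    execution halts on the first failure. Hypothetical helper.
    """
    succeeded = True
    values_run = []
    for value in values:
        values_run.append(value)
        if not step(value):
            succeeded = False
            if fail_fast:
                break
    return succeeded, values_run

def unit(go_version):
    # Hypothetical task: fails on the first Go version only.
    return go_version != "1.12"

versions = ["1.12", "1.13", "1.14"]
print(run_across_serial(versions, unit))                  # (False, ['1.12', '1.13', '1.14'])
print(run_across_serial(versions, unit, fail_fast=True))  # (False, ['1.12'])
```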

Var scoping and shadowing

The inner step will be run with a local var scope that inherits from the outer scope and is initialized with the across step's var values.

Given that it runs with its own scope, it follows that the vars in the local scope, along with any vars newly set within the scope, are not accessible outside of the across step.

When a var set by the across step shadows outer vars, a warning will be printed.

Example:

plan:
- get_var: foo
  trigger: true
- run: print
  type: debug
  params: {value: ((.:foo))}
- across:
  - var: foo
    values: [one, two]
  do:
  - run: print
    type: debug
    params: {value: ((.:foo))}
- run: print
  type: debug
  params: {value: ((.:foo))}

Assuming the get_var step produces a value of zero, this build plan should result in the following output:

zero
WARNING: across step shadows local var 'foo'
one
two
zero
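
The scoping rules resemble a chain of nested mappings. The sketch below models them with Python's `collections.ChainMap` and reproduces the output above; the `run_plan` function is hypothetical and stands in for the build plan, with the `get_var` result hard-coded to zero.

```python
from collections import ChainMap

def run_plan():
    """Simulate the example plan and return the lines it would print."""
    output = []
    build_scope = ChainMap({"foo": "zero"})  # as if set by get_var
    output.append(build_scope["foo"])

    # The across step shadows an existing var, so a warning is printed.
    if "foo" in build_scope:
        output.append("WARNING: across step shadows local var 'foo'")
    for value in ["one", "two"]:
        # Each inner step gets a child scope; writes stay in the child.
        local_scope = build_scope.new_child({"foo": value})
        output.append(local_scope["foo"])

    # After the across step, the outer value is visible again.
    output.append(build_scope["foo"])
    return output

print("\n".join(run_plan()))
```

Because each child scope is discarded after its step, nothing set inside the across step leaks into the outer scope.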

Open Questions

  • n/a

New Implications

  • Using across with var sources implies the addition of a list action for listing the vars from a var source. We could build on this to show the list of available vars in the UI, which would really help with troubleshooting credential manager access and knowing what vars are available.

    Obviously we wouldn't want to show credential values, so list should only include safe things like credential paths.

  • Combining the set_pipeline step, ((var)) sources, instanced pipelines, pipeline archiving, and the across step can be used for end-to-end automation of pipelines for branches and pull requests:

    set_pipeline: pr
    instance_vars: {pr_number: ((.:pr.number))}
    across:
    - var: pr
      source: prs
      trigger: true