This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Allow for updating multiple workloads to arbitrary images in a single release #1140

Closed
jpellizzari opened this issue Jun 12, 2018 · 21 comments

@jpellizzari
Contributor

jpellizzari commented Jun 12, 2018

Currently, there isn't a way to set the image for multiple workloads within a single release (commit). The /update-images endpoint only allows multiple workloads to be updated simultaneously when they reference the same container image.

We need a way to update the images for different workloads at the same time. Example state-change (pseudo JSON):

{
  "default:deployment/ui-server": {
    "Containers": [
      {
        "Name": "ui-server",
        "ID": "quay.io/myorg/ui-server:xyz789" => "quay.io/myorg/ui-server:tuv456"
      }
    ]
  },
  "default:deployment/auth": {
    "Containers": [
      {
        "Name": "auth",
        "ID": "quay.io/myorg/auth:abc123" => "quay.io/myorg/auth:def456"
      }
    ]
  }
}

^^ This "transaction" should be completed within the same release.

@squaremo
Member

You may also want the release to be all or nothing -- that is, if any of the individual updates fail, it should fail the whole lot. (Or at least for this to be one of the options).

@rade
Contributor

rade commented Jun 12, 2018

You may also want the release to be all or nothing

That seems quite an orthogonal concern, and equally applies to all the existing release operations that apply to more than one workload.

@squaremo
Member

squaremo commented Jun 13, 2018

You may also want the release to be all or nothing

That seems quite an orthogonal concern, and equally applies to all the existing release operations that apply to more than one workload.

Kind of. Because the existing modes of release are not exact, they are implicitly best-effort, and it is excusable to do "everything that can be done". If you provide an exact specification, the expectation is that it does exactly what you said. So while it's something you can treat as orthogonal, there's a pretty strong link between the mode and the transactional properties.

If exact, transactional releases are supported, you can use them to make the other kinds transactional too -- do a dry-run, turn that into an exact spec, then run that.

@rade
Contributor

rade commented Jun 13, 2018

If you provide an exact specification, the expectation is that it does exactly what you said
[...]
do a dry-run, turn that into an exact spec, then run that.

In Weave Cloud we already do a dry-run first for everything, which conveys a very strong sense of exactness.

If exact, transactional releases are supported

Being able to release exact versions is useful w/o transactionality. Indeed, in many cases it is desirable; i.e., often I would not want everything rolled back when the release of one of the components fails.

@squaremo
Member

Indeed, in many cases it is desirable; i.e., often I would not want everything rolled back when the release of one of the components fails.

I don't think anyone is suggesting that.

In Weave Cloud we already do a dry-run first for everything, which conveys a very strong sense of exactness.

.. without actually delivering it (when you execute the release, it recalculates from scratch, and something may have changed in the meantime).

@rade
Contributor

rade commented Jun 13, 2018

without actually delivering it

Indeed. And being able to specify exact versions would cure that, which I think is what this issue is about. However,

be all or nothing

is not what this issue is about.

@squaremo
Member

squaremo commented Jun 13, 2018

is not what this issue is about.

I don't understand what you're objecting to. If you add a new kind of release, you have to decide when and how it fails.

I'm suggesting that you would want the system to fail if any of the individual updates can't be applied (and to be clear, I mean a specified change can't be applied to a file in the git repo), to avoid the situation in which a user says "please make exactly these changes" and the system responds "great news! I did approximately what you said".

This scenario might come about if, for example, your information is stale and something's been updated in the meantime. It's important that the system tries to avoid this, because you might make a different decision in light of the new information -- say, if the release you were going to do is actually a downgrade now.
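
To make the precondition idea concrete, here is a minimal Go sketch; the ContainerUpdate type and applyWithPrecondition function are illustrative assumptions, not flux's actual code. Before rewriting a manifest, compare the image currently recorded there against the Current value the caller was shown, and refuse the change on a mismatch.

package main

import "fmt"

// ContainerUpdate is an illustrative shape: the image the caller believes is
// deployed (Current) and the image they want to release (Target).
type ContainerUpdate struct {
    Container string
    Current   string
    Target    string
}

// applyWithPrecondition refuses to rewrite the manifest if the image found
// there no longer matches what the caller was shown, so stale information
// produces a failure instead of an "approximate" release.
func applyWithPrecondition(manifestImage string, u ContainerUpdate) (string, error) {
    if manifestImage != u.Current {
        return "", fmt.Errorf("container %s: expected %s but found %s; refusing to update",
            u.Container, u.Current, manifestImage)
    }
    return u.Target, nil
}

func main() {
    u := ContainerUpdate{
        Container: "auth",
        Current:   "quay.io/myorg/auth:abc123",
        Target:    "quay.io/myorg/auth:def456",
    }
    // Simulate the HEAD having moved since the dry-run: the manifest now holds a different image.
    if _, err := applyWithPrecondition("quay.io/myorg/auth:zzz999", u); err != nil {
        fmt.Println(err) // the release fails rather than silently applying a stale decision
    }
}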

@rade
Contributor

rade commented Jun 13, 2018

I don't understand what you're objecting to

I am objecting to the "all or nothing". However, ...

I'm suggesting that you would want the system to fail if any of the individual updates can't be applied (and to be clear, I mean a specified change can't be applied to a file in the git repo)

Ah, I didn't realise we were only talking about the git commit phase here. Given that we'd want all the changes to be applied in a single commit, presumably the entire commit (or, rather, the subsequent push) would fail if something has changed underneath. That is fine by me.

@squaremo
Member

presumably the entire commit (or, rather, the subsequent push) would fail if something has changed underneath.

I wouldn't rely on the push failing; you can have stale information but still end up operating on the HEAD.

@rade
Contributor

rade commented Jun 13, 2018

you can have stale information but still end up operating on the HEAD.

In that, when the operation is performed, flux fetches the current HEAD, which may have moved since the version info was presented to the user?

I see what you are getting at now, I think: the API needs to specify both the current and desired version, so that flux can detect whether the current version has moved, and not make the change in that case. Correct? Kinda awkward for the CLI though, e.g. requiring the user to specify the current version for everything.

The question then is whether, when flux detects such a change for some workloads, it should still proceed to update the other workloads. I think it should.

@squaremo
Member

In that when the operation is performed flux fetches the current HEAD, which may have moved since the version info was presented to the user?

Not quite (it doesn't fetch before attempting an update), but the conclusion is the same -- the HEAD may have moved since it did whatever calculation. You can paper over it by trying to close the gap between what's shown and what's in git, but it's inherently asynchronous and the only foolproof defence is to provide preconditions and fail if they are not met.

The question then is whether when flux detects such a change for some workloads, it should still proceed to update the other workloads. I think it should.

I'll go back to my original rationale, now that you've caught up :-) If you provide an exact specification, the expectation is that it does exactly what you said. This is especially clear in the way Weave Cloud's UI works: by showing the result of a dry run, and asking you to confirm, it sets up the expectation that what the button will do is exactly what's shown -- not some of the controllers, and not approximately the updates you wanted.

This is a bit more arguable than failing due to preconditions. You might find that, for example, releases can hardly ever succeed, because something always changes before you hit the button. In that case it would be useful to be able to relax the all-or-nothingness (or, if you're OK with things not all happening at once, just update fewer things at a time!)

@rade
Contributor

rade commented Jun 13, 2018

I disagree with your assessment that the Weave Cloud UI sets an all-or-nothing expectation. I see it as a list of things to attempt, some of which may fail.

btw, I am curious how you expect the CLI to work for this - requiring the user to specify all current versions whenever they want to release something would be rather onerous.

@squaremo
Member

I am curious how you expect the CLI to work for this

There are two ways I'd use this in the CLI:

  • in an interactive confirmation step for the existing release command; i.e., you would ask fluxctl release --all --update-image foo and see what it says, then confirm that's what you want. Alternatively, you could save the dry-run plan, edit it, and feed it back.
  • to perform individual, precision updates -- but I'd probably have to tweak it, or work around the preconditions by fetching the current images first, or something along those lines (see the sketch below).
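
To illustrate the second workflow, here is a rough Go sketch under stated assumptions: the ImageChange shape and the listCurrentImage helper are hypothetical stand-ins (e.g. for whatever a list-images query would return), not real flux API. The idea is simply to read the current images first and build an exact spec from them, so the preconditions hold at submission time.

package main

import "fmt"

// ImageChange pairs the image observed right now with the desired one,
// mirroring the Current/Target idea discussed later in this thread.
type ImageChange struct {
    Current string
    Target  string
}

// listCurrentImage is a hypothetical stand-in for asking flux which image a
// workload's container is currently running.
func listCurrentImage(workload, container string) string {
    return "quay.io/myorg/ui-server:xyz789" // canned value for the sketch
}

func main() {
    // Build an exact, per-container spec by reading the current images first,
    // then filling in the desired targets.
    spec := map[string]map[string]ImageChange{
        "default:deployment/ui-server": {
            "ui-server": {
                Current: listCurrentImage("default:deployment/ui-server", "ui-server"),
                Target:  "quay.io/myorg/ui-server:tuv456",
            },
        },
    }
    fmt.Printf("%+v\n", spec) // this is what would be submitted as the release spec
}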

@rndstr
Contributor

rndstr commented Jun 15, 2018

I'm thinking something like this:

POST /v9/update-manifests
{
    "type": "containers",
    "cause": {"Message": "commit message", "User": "Jane Doe <[email protected]>"},
    "spec": {
        "ContainerSpecs": {
            "default:deployment/ui-server": {
                "ui-server": "quay.io/myorg/ui-server:tuv456"
            },
            "default:deployment/auth": {
                "auth": "quay.io/myorg/auth:def456",
                "logging": "quay.io/myorg/logging:master-rst456"
            }
        },
        "Kind": "plan"
    }
}

@squaremo
Member

@rndstr I reckon include preconditions (i.e., what the current image should be), so that it can bail if things aren't as they were when shown to the user.

@rndstr
Contributor

rndstr commented Jun 15, 2018

More like this then

POST /v9/update-manifests
{
    "type": "containers",
    "cause": {"Message": "commit message", "User": "Jane Doe <[email protected]>"},
    "spec": {
        "ContainerSpecs": 
            "default:deployment/ui-server": {
                "ui-server": {
                    "Current": "quay.io/myorg/ui-server:1.0",
                    "Target": "quay.io/myorg/ui-server:1.1"
                }
            },
            "default:deployment/auth": {
                "auth": {
                    "Current": "quay.io/myorg/auth:abc123",
                    "Target": "quay.io/myorg/auth:def456",
                },
                "logging": {
                    "Current": "quay.io/myorg/logging:rst123"
                    "Target": "quay.io/myorg/logging:uvw456"
                }
            ]
        }
        "Kind": "plan",
    }
}
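
In Go terms, the body above roughly corresponds to a nested map keyed by workload and then by container name. Here is a sketch under the assumption that the wire shape stays as drafted (ImageChange and ContainerSpecs are made-up names for illustration, not the types any eventual PR uses):

package main

import (
    "encoding/json"
    "fmt"
)

// ImageChange carries the precondition (Current) and the desired image (Target).
type ImageChange struct {
    Current string
    Target  string
}

// ContainerSpecs maps workload ID -> container name -> image change,
// matching the "ContainerSpecs" object drafted above.
type ContainerSpecs map[string]map[string]ImageChange

func main() {
    specs := ContainerSpecs{
        "default:deployment/ui-server": {
            "ui-server": {Current: "quay.io/myorg/ui-server:1.0", Target: "quay.io/myorg/ui-server:1.1"},
        },
        "default:deployment/auth": {
            "auth":    {Current: "quay.io/myorg/auth:abc123", Target: "quay.io/myorg/auth:def456"},
            "logging": {Current: "quay.io/myorg/logging:rst123", Target: "quay.io/myorg/logging:uvw456"},
        },
    }
    b, err := json.MarshalIndent(specs, "", "    ")
    if err != nil {
        panic(err)
    }
    fmt.Println(string(b)) // reproduces the ContainerSpecs portion of the request body
}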

@squaremo
Member

squaremo commented Jun 21, 2018

@rndstr Yeah that looks good. I wonder about map[container]struct{Current, Target string} vs []struct{Name, Current, Target string}. Maybe ordering will matter at some point? A minor thing.
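
For illustration, here are the two shapes being weighed, written out as Go declarations (the names are made up): the map is convenient for lookup by container name but has no defined iteration order, while the slice preserves the order in which updates were specified, should ordering ever matter.

// Map form: lookup by container name is easy, but Go maps have no defined
// iteration order, so the order updates are applied in is unspecified.
type ContainerUpdatesMap map[string]struct {
    Current, Target string
}

// Slice form: slightly more verbose, but entries keep the order they were
// specified in.
type ContainerUpdatesSlice []struct {
    Name, Current, Target string
}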

@jml

jml commented Jul 9, 2018

What's currently happening with this issue?

@rndstr
Contributor

rndstr commented Jul 9, 2018

@jml I just opened a PR (#1218)

@MilanDasek

@squaremo

Hi,

I have tried

curl -XPOST http://flux.flux.svc.cluster.local:3030/api/flux/v9/update-manifests

{
  "type": "containers",
  "spec": {
    "ContainerSpecs": {
      "%namespace%:deployment/%NAME%": [
        {
          "Container": "%NAME%",
	  "Current": "xxx.dkr.ecr.eu-west-1.amazonaws.com/%NAME%:1.0.0.134",
          "Target": "xxx.dkr.ecr.eu-west-1.amazonaws.com/%NAME%:latest"
        }
      ],
      "%namespace%:deployment/%NAME%": [
        {
          "Container": "%NAME%",
          "Current": "xxx.dkr.ecr.eu-west-1.amazonaws.com/%NAME%:1.0.0.332",
          "Target": "xxx.dkr.ecr.eu-west-1.amazonaws.com%NAME%:1.0.0.331"
        }
      ]
    },
    "Kind": "plan",
    "SkipMismatches": false
  }
}

but I get only
422 - no changes found

any idea?

thanks

M

@rndstr
Contributor

rndstr commented Mar 18, 2019

@MilanDasek that means that your containers are already at the target image. If that doesn't solve your issue, could you open a GH issue for your problem? (and also post the relevant sections of GET http://flux.flux.svc.cluster.local:3030/api/flux/v6/services)
