Skip to content
This repository has been archived by the owner on Jul 22, 2022. It is now read-only.

only allow one Metacontroller instance to be active at a time #66

Merged
merged 4 commits into from
Jul 23, 2018

Conversation

crimsonfaith91
Copy link
Contributor

@crimsonfaith91 crimsonfaith91 commented Jul 17, 2018

Addressing issue #5 partially. In the long run, we still have to resolve leader election issue when the user spawns more than one Metacontroller instance.

Metacontroller uses apps/v1beta1 Deployment under the hood. This PR changes the Metacontroller to use apps/v1beta2 StatefulSet to maintain the invariant of only one Metacontroller instance is active at a time.

@crimsonfaith91 crimsonfaith91 requested a review from enisoc July 17, 2018 01:32
@GoogleCloudPlatform GoogleCloudPlatform deleted a comment from googlebot Jul 17, 2018
@GoogleCloudPlatform GoogleCloudPlatform deleted a comment from googlebot Jul 17, 2018
@crimsonfaith91 crimsonfaith91 changed the title only allow one Metacontroller instance be active at a time only allow one Metacontroller instance to be active at a time Jul 17, 2018
@crimsonfaith91
Copy link
Contributor Author

@enisoc Do we want to use StatefulSet for hooks too? For example, do we need to change this?

@@ -43,8 +43,8 @@ spec:
singular: controllerrevision
kind: ControllerRevision
---
apiVersion: apps/v1beta1
kind: Deployment
apiVersion: apps/v1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use apps/v1beta2 for now so we maintain compatibility with k8s 1.8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noted!

Copy link
Contributor Author

@crimsonfaith91 crimsonfaith91 Jul 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@enisoc When should we change the version from apps/v1beta2 to apps/v1? When the v1.8 is not supported anymore?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep support for 1.8 until we have a compelling use case for relying on a feature that only exists in 1.9+.

@enisoc
Copy link

enisoc commented Jul 17, 2018

Do we want to use StatefulSet for hooks too?

For hooks, we don't need to guarantee that there isn't more than one, but we do ideally want to ensure that all replicas are the same version at a given moment. One way to do that would be to set hook Deployments to use the Recreate update strategy, but that causes unnecessary downtime. Ironically, the ideal way to update a hook deployment would probably be to use a blue-green strategy with a Service that atomically switches to a fully-deployed set of replicas.

You can file an issue to track this if you like, but I don't think it's a high-priority problem for hooks. There may be some "flip-flopping" during rolling updates of hook code, but it should converge on its own once the rollout completes. Most hooks will probably work fine with just one replica, so they should update quickly.

@crimsonfaith91
Copy link
Contributor Author

crimsonfaith91 commented Jul 17, 2018

@enisoc +1 on using blue-green strategy with the Service to ensure all hook replicas have the same version at one moment, which may become important when there are quite a number of hooks.

As the hook codes converge eventually (i.e., not a high priority issue) and most hooks work fine with 1 replica, I will not file an issue tracking this.

@enisoc
Copy link

enisoc commented Jul 18, 2018

@crimsonfaith91 Since we don't have automatic e2e tests yet, this is a good chance to make sure you can run the tests manually. Once you have a cluster set up, I recommend using the Skaffold flow to run your dev build. Since you just changed the Deployment to a StatefulSet, it's especially important to make sure you don't have any previous version of Metacontroller running; you can delete the metacontroller namespace first to ensure that.

I tried skaffold run locally and found one problem so far:

error: error validating "STDIN": error validating data: ValidationError(StatefulSet.spec): missing required field "serviceName" in io.k8s.api.apps.v1beta2.StatefulSetSpec; if you choose to ignore these errors, turn validation off with --validate=false

In this case, I think you can set serviceName: "" since we don't need per-pod DNS entries.

Once your dev build of Metacontroller is running, go into the examples dir and run ./test.sh to manually kick off the smoke tests.

@crimsonfaith91
Copy link
Contributor Author

@enisoc I see. I thought it suffices to get the Travis CI build passed (the build does run unit tests). Thanks for the pointer!

metadata:
name: metacontroller
namespace: metacontroller
spec:
serviceName: ""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since args.yaml and image.yaml contain partial objects, used for kustomize overlays, we don't need to mention fields here if the value is the same as the base manifest.

Copy link

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@enisoc enisoc merged commit 2a992e2 into GoogleCloudPlatform:master Jul 23, 2018
@crimsonfaith91 crimsonfaith91 deleted the one-mc branch August 22, 2018 21:07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants