
Moving leader election inside vault-k8s #271

Merged
tvoran merged 24 commits into master from VAULT-2403/internal-leader on Aug 31, 2021

Conversation

tvoran
Member

@tvoran tvoran commented Jul 9, 2021

This uses operator-sdk's Become() function to determine leadership for certificate generation, instead of relying on a leader-elector container as in #198.

The associated helm chart changes are in hashicorp/vault-helm#568

Depends on #273
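
For reference, operator-lib's model is "leader for life": Become() blocks until this replica owns a lock ConfigMap that carries its Pod as an owner reference. A minimal sketch of the call, assuming operator-lib v0.4.x, in-cluster config, and the POD_NAME downward-API env var (illustrative only, not the exact vault-k8s code):

package main

import (
	"context"
	"log"

	"github.com/operator-framework/operator-lib/leader"
)

func main() {
	ctx := context.Background()

	// Blocks until this replica owns the "vault-k8s-leader" lock ConfigMap.
	// Requires in-cluster config and the POD_NAME env var (downward API).
	if err := leader.Become(ctx, "vault-k8s-leader"); err != nil {
		log.Fatalf("failed to become leader: %v", err)
	}

	// Only the leader reaches this point, e.g. to generate the cert Secret.
	log.Println("Acquired leadership; generating certificates")
}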

tvoran added 3 commits July 6, 2021 14:42
The v1 and v1beta1 AdmissionReview, AdmissionResponse, and
AdmissionRequest are the same, so this bumps those to v1. Checks at
the start of certWatcher() if AdmissionregistrationV1 exists, and
falls back to v1beta1 if it doesn't.
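
One way to implement that check is via the discovery client; a rough sketch of the idea, using a hypothetical helper name (the real check lives at the start of certWatcher()):

package injector

import (
	"k8s.io/client-go/kubernetes"
)

// admissionAPIVersion reports whether the cluster serves
// admissionregistration.k8s.io/v1; callers fall back to the v1beta1
// AdmissionReview types when it doesn't. Illustrative helper only.
func admissionAPIVersion(clientset kubernetes.Interface) string {
	_, err := clientset.Discovery().ServerResourcesForGroupVersion("admissionregistration.k8s.io/v1")
	if err != nil {
		return "v1beta1"
	}
	return "v1"
}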
@tvoran tvoran marked this pull request as ready for review July 10, 2021 00:24
tvoran added 2 commits July 20, 2021 00:08
And bail out if Become() fails, to prevent split-brain.
These are in a loop that retries once a second, and the TLS error is
normal before the leader generates the cert Secret.
@tvoran tvoran requested review from tomhjp, benashz and calvn July 20, 2021 07:14
tvoran added 3 commits July 22, 2021 17:09
it needs to be more robust than a few retries
The v1 and v1beta1 AdmissionReview, AdmissionResponse, and
AdmissionRequest are the same, so this bumps those to v1. Checks at
the start of certWatcher() if AdmissionregistrationV1 exists, and
falls back to v1beta1 if it doesn't.
and declare payload as a []byte
@tvoran tvoran changed the base branch from master to VAULT-2403/update-client-go July 27, 2021 05:09
Contributor

@tomhjp tomhjp left a comment

Generally LGTM, though one question on resilience to errors.

Comment on lines +61 to +62
- "get"
- "patch"
Contributor

I haven't read the whole change set yet, but what uses these permissions?

Member Author

It's part of the Become() call; it looks like there's some checking and cleanup around situations where an old leader pod was evicted or the underlying node is NotReady: https://github.com/operator-framework/operator-lib/blob/v0.4.1/leader/leader.go#L166-L189

err := operator_leader.Become(ctx, "vault-k8s-leader")
if err != nil {
    logger.Error("trouble becoming leader:", "error", err)
    return
Contributor

Given that we don't restart the Become attempt, could we end up in a situation where none of the replicas are trying to become the leader, if they all error here? Seems like maybe we should have this error feed back into some sort of retry logic to guard against that.

Contributor

I agree with @tomhjp, it would be a good idea to handle the error from Become() somehow. Perhaps the injector should exit right away in this case, and let the scheduler handle the restart?

Member Author

Good point, I added an infinite retry when there's an error in 75258f7.

Member Author

@benashz Hmm, I added retries on error in 75258f7, I wonder if you're looking at an old diff?

Exiting right away is an interesting idea too.

Member Author

The more I think about this, the more I think it would be better for this to retry a few times, then if Become() is still failing, throw a signal or something that would do a system exit non-zero. Otherwise folks would need to be watching the logs to know there was a problem with the replica.

Member Author

@benashz @tomhjp added retries then exit non-zero in b73527f.

Contributor

I like that line of thinking 👍

Base automatically changed from VAULT-2403/update-client-go to master July 27, 2021 21:14
leader/leader.go Outdated
// become the leader when the current leader replica stops running, and
// the Kubernetes garbage collector deletes the vault-k8s-leader
// ConfigMap.
err := operator_leader.Become(ctx, "vault-k8s-leader")
Contributor

If a node calls this with a leader already elected, does this return an error, return nil, or block until context cancellation?

Member Author

It looks like it blocks forever until it can achieve leadership. The context passed in is used for k8s API calls, but I don't see it being checked for cancellation in Become(): https://github.com/operator-framework/operator-lib/blob/v0.4.1/leader/leader.go#L88

@@ -95,7 +67,7 @@ spec:
port: 8080
scheme: HTTPS
failureThreshold: 2
initialDelaySeconds: 1
Contributor

Does this change to the initial delay for readiness and liveness mean that leader election takes a bit longer to establish with this method, or is it to make it less flaky in general?

Member Author

Yeah, I think it was just taking a little longer than a second while I was testing it locally to achieve leadership and generate the certs so that the liveness probe would pass.

Contributor

@benashz benashz left a comment

LGTM!

tvoran and others added 4 commits August 24, 2021 16:31
pod delete rbac and no more endpoints
Retries Become() 10 times (with exp backoff) and then signals the
caller to exit if it failed. command.Run() now watches an exitOnError
channel for that case.
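
Roughly, the flow those commits describe looks like this (helper name and backoff values are illustrative, not the exact implementation):

package leaderelection

import (
	"context"
	"time"

	"github.com/hashicorp/go-hclog"
	"github.com/operator-framework/operator-lib/leader"
)

// becomeLeaderWithRetry sketches the retry-then-signal flow: retry Become()
// with exponential backoff, then tell the caller to exit if it keeps failing.
func becomeLeaderWithRetry(ctx context.Context, logger hclog.Logger, exitOnError chan<- error) {
	backoff := time.Second
	var err error
	for attempt := 0; attempt < 10; attempt++ {
		if err = leader.Become(ctx, "vault-k8s-leader"); err == nil {
			// We are the leader; the caller can generate the cert Secret.
			return
		}
		logger.Warn("Could not become leader, retrying", "error", err, "attempt", attempt+1)
		time.Sleep(backoff)
		backoff *= 2 // exponential backoff between attempts
	}
	// Still failing after 10 attempts: signal Run() to shut the injector down
	// so the scheduler restarts the pod, rather than lingering silently.
	exitOnError <- err
}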
@tvoran tvoran requested a review from tomhjp August 31, 2021 00:56
Contributor

@tomhjp tomhjp left a comment

LGTM - I think the failure mode flow looks good, just one suggestion on the logging.

@@ -218,11 +221,12 @@ func (c *Command) Run(args []string) int {
	}()
	go func() {
		select {
		case <-exitOnError:
			logger.Error("shutting down due to errors")
Contributor

As per the comment above, I think this log message should include the error that triggered the shutdown.

Contributor

@benashz benashz Aug 31, 2021

It looks like exitOnError does not result in a non-zero status being returned to the caller. I presumed that it would make its way to os.Exit().

Member Author

It looks like exitOnError does not result in a non-zero status being returned to the caller. I presumed that it would make its way to os.Exit().

The flow here is:

  • c.shutdownHandler() calls server.Shutdown()
  • server.Shutdown() causes server.ListenAndServeTLS() to return an error, triggering a return 1 for this function:
    if err := server.ListenAndServeTLS(c.flagCertFile, c.flagKeyFile); err != nil {
        c.UI.Error(fmt.Sprintf("Error listening: %s", err))
        return 1
    }

And that makes its way back to os.Exit in main.go:

vault-k8s/main.go

Lines 17 to 21 in b73527f

exitStatus, err := c.Run()
if err != nil {
	log.Println(err)
}
os.Exit(exitStatus)

Member Author

As per the comment above, I think this log message should include the error that triggered the shutdown.

Added the error to the log message in a45b35e

Contributor

Thanks for the explanation. I guess we are relying on the fact that server.Shutdown() is called before server.ListenAndServeTLS()?

if err != nil {
log.Error(fmt.Sprintf("error checking leader: %s", err))
log.Warn(fmt.Sprintf("Could not check leader: %s. Trying again...", err))
Contributor

I noticed that some log lines are capitalized, and in other cases not. I guess we should probably be consistent either way.

Member Author

Sure, I think generally log messages should start with a capitalized word, and errors themselves should start with lowercase.
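
In Go terms that convention looks roughly like this (illustrative snippet, not code from this PR):

package example

import (
	"fmt"

	"github.com/hashicorp/go-hclog"
)

// checkLeaderLog illustrates the convention: error values start lowercase so
// they compose when wrapped, while log messages start with a capital letter.
func checkLeaderLog(logger hclog.Logger, cause error) {
	err := fmt.Errorf("could not check leader: %w", cause)
	logger.Warn("Could not check leader, trying again", "error", err)
}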

Contributor

Totally agree 👍

@tvoran tvoran merged commit 0c69acb into master Aug 31, 2021
@tvoran tvoran deleted the VAULT-2403/internal-leader branch August 31, 2021 21:52
	cancelFunc()
case err := <-exitOnError:
	logger.Error("Shutting down due to error", "error", err)
	c.shutdownHandler(ctx, server, cancelFunc)
Contributor

@benashz benashz Aug 31, 2021

Suggested change
-	c.shutdownHandler(ctx, server, cancelFunc)
+	// we rely on a premature shutdown to fail the server, and cause the injector to abort.
+	c.shutdownHandler(ctx, server, cancelFunc)

Contributor

@benashz benashz left a comment

Looking good. Only one request: add a comment for exitOnError.

RemcoBuddelmeijer pushed a commit to RemcoBuddelmeijer/vault-k8s that referenced this pull request Feb 22, 2022
Using operator-sdk's Become() for leader election. Retries Become() 10
times (with exp backoff) and then signals the caller to exit if it
failed. command.Run() now watches an exitOnError channel for that
case.

Co-authored-by: Ben Ash <[email protected]>