Moving leader election inside vault-k8s #271
Conversation
The v1 and v1beta1 AdmissionReview, AdmissionResponse, and AdmissionRequest are the same, so this bumps those to v1. Checks at the start of certWatcher() if AdmissionregistrationV1 exists, and falls back to v1beta1 if it doesn't.
And bail out if Become() fails, to prevent split-brain.
These are in a loop that retries once a second, and the TLS error is normal before the leader generates the cert Secret.
It needs to be more robust than a few retries.
The v1 and v1beta1 AdmissionReview, AdmissionResponse, and AdmissionRequest are the same, so this bumps those to v1. Checks at the start of certWatcher() if AdmissionregistrationV1 exists, falls back to v1beta1 if it doesn't, and declares payload as a []byte.
Generally LGTM, though one question on resilience to errors.
```yaml
- "get"
- "patch"
```
I haven't read the whole change set yet, but what uses these permissions?
It's part of the `Become()` call; it looks like there's some checking and cleanup around situations where an old leader pod was evicted or the underlying node is NotReady: https://github.com/operator-framework/operator-lib/blob/v0.4.1/leader/leader.go#L166-L189
```go
err := operator_leader.Become(ctx, "vault-k8s-leader")
if err != nil {
	logger.Error("trouble becoming leader:", "error", err)
	return
}
```
Given that we don't restart the `Become` attempt, could we end up in a situation where none of the replicas are trying to become the leader, if they all error here? Seems like maybe we should have this error feed back into some sort of retry logic to guard against that.
I agree with @tomhjp, it would be a good idea to handle the error from `Become()` somehow. Perhaps the injector should exit right away in this case, and let the scheduler handle the restart?
Good point, I added an infinite retry when there's an error in 75258f7.
The more I think about this, the more I think it would be better for this to retry a few times, then if Become() is still failing, throw a signal or something that would do a system exit non-zero. Otherwise folks would need to be watching the logs to know there was a problem with the replica.
I like that line of thinking 👍
leader/leader.go (Outdated)
```go
// become the leader when the current leader replica stops running, and
// the Kubernetes garbage collector deletes the vault-k8s-leader
// ConfigMap.
err := operator_leader.Become(ctx, "vault-k8s-leader")
```
If a node calls this with a leader already elected, does this return an error, return nil, or block until context cancellation?
It looks like it blocks forever until it can achieve leadership. The context passed in is used for k8s API calls, but I don't see it being checked for cancellation in Become(): https://github.com/operator-framework/operator-lib/blob/v0.4.1/leader/leader.go#L88
```diff
@@ -95,7 +67,7 @@ spec:
       port: 8080
       scheme: HTTPS
     failureThreshold: 2
-    initialDelaySeconds: 1
```
Does this change in initial delay for readiness and liveness mean that leader election takes a bit longer to establish through this method, or is it to make it less flaky in general?
Yeah, I think it was just taking a little longer than a second while I was testing it locally to achieve leadership and generate the certs so that the liveness probe would pass.
LGTM!
pod delete rbac and no more endpoints
Co-authored-by: Ben Ash <[email protected]>
Retries Become() 10 times (with exp backoff) and then signals the caller to exit if it failed. command.Run() now watches an exitOnError channel for that case.
LGTM - I think the failure mode flow looks good, just one suggestion on the logging.
subcommand/injector/command.go (Outdated)
```diff
@@ -218,11 +221,12 @@ func (c *Command) Run(args []string) int {
 	}()
 	go func() {
 		select {
 		case <-exitOnError:
 			logger.Error("shutting down due to errors")
```
As per the comment above, I think this log message should include the error that triggered the shutdown.
It looks like an error on `exitOnError` does not result in a non-zero status being returned to the caller. I presumed that it would make its way to `os.Exit()`.
> It looks like on exitOnError does not result in a non-zero status being returned to the caller. I presumed that it would make its way to os.Exit().

The flow here is: `c.shutdownHandler()` calls `server.Shutdown()`, and `server.Shutdown()` causes `server.ListenAndServeTLS()` to return an error, triggering a `return 1` for this function (vault-k8s/subcommand/injector/command.go, lines 237 to 240 in b73527f):

```go
if err := server.ListenAndServeTLS(c.flagCertFile, c.flagKeyFile); err != nil {
	c.UI.Error(fmt.Sprintf("Error listening: %s", err))
	return 1
}
```
And that makes its way back to os.Exit in main.go (lines 17 to 21 in b73527f):

```go
exitStatus, err := c.Run()
if err != nil {
	log.Println(err)
}
os.Exit(exitStatus)
```
> As per the comment above, I think this log message should include the error that triggered the shutdown.

Added the error to the log message in a45b35e
Thanks for the explanation. I guess we are relying on the fact that `server.Shutdown()` is called before `server.ListenAndServeTLS()`?
```diff
 if err != nil {
-	log.Error(fmt.Sprintf("error checking leader: %s", err))
+	log.Warn(fmt.Sprintf("Could not check leader: %s. Trying again...", err))
```
I noticed that some log lines are capitalized, and in other cases not. I guess we should probably be consistent either way.
Sure, I think generally log messages should start with a capitalized word, and errors themselves should start with lowercase.
Totally agree 👍
```go
	cancelFunc()
case err := <-exitOnError:
	logger.Error("Shutting down due to error", "error", err)
	c.shutdownHandler(ctx, server, cancelFunc)
```
Suggested change:

```diff
+// we rely on a premature shutdown to fail the server, and cause the injector to abort.
 c.shutdownHandler(ctx, server, cancelFunc)
```
Looking good. Only one request: add a comment for `exitOnError`.
Using operator-sdk's Become() for leader election. Retries Become() 10 times (with exp backoff) and then signals the caller to exit if it failed. command.Run() now watches an exitOnError channel for that case. Co-authored-by: Ben Ash <[email protected]>
This uses operator-sdk's Become() function to determine leadership for certificate generation, instead of relying on a leader-elector container as in #198.
The associated helm chart changes are in hashicorp/vault-helm#568
Depends on #273