Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Helm upgrade and SSL certificates #309

Closed
cyriltovena opened this issue Jul 28, 2018 · 6 comments · Fixed by #338
Closed

Helm upgrade and SSL certificates #309

cyriltovena opened this issue Jul 28, 2018 · 6 comments · Fixed by #338
Assignees
Labels
area/user-experience Pertaining to developers trying to use Agones, e.g. SDK, installation, etc kind/bug These are bugs.
Milestone

Comments

@cyriltovena
Copy link
Collaborator

cyriltovena commented Jul 28, 2018

When updating using helm upgrade the certificates for the admission controller gets regenerated but the controller pod doesn’t get restarted and thus keep the old certificates which causes this error

GameServer instances failed (): Internal error occurred: failed calling admission webhook 
"mutations.stable.agones.dev": Post https://agones-controller-service.agones-
system.svc:443/mutate?timeout=30s: x509:certificate signed by unknown authority 
(possibly because of "crypto/rsa: verification error" while trying to verify candidate 
authority certificate "admission-controller-ca")

using helm upgrade --recreate-pods fixes the issue but we needs to document it.

@markmandel are we ok in recreating the controller pod at each upgrade ? Or do you have a better idea ?

@cyriltovena cyriltovena changed the title Helm upgrade and certificates SSL Helm upgrade and SSL certificates Jul 28, 2018
@markmandel
Copy link
Collaborator

Copy pasting chat transcripts, for posterity:

Mark Mandel [8:13 AM]
Also need to respond to your comment re: the internal ssl cert

I've run into that before as well - I usually hand delete the pod - but it's something we should discuss overall

Cyril Tovena [8:46 AM]
for now in my MR I forced the recreation of the pods
and it fixes the issue
I don't think the fix is easy

Mark Mandel [8:47 AM]
it's really only a development issue
in prod, you'll get a new tagged image, so the original pod will get deleted and replaced

Cyril Tovena [8:48 AM]
true unless you just want to change something else

Mark Mandel [8:48 AM]
oh true!

Cyril Tovena [8:48 AM]
but it's minor you're right I would not spend a lot of time for now

Mark Mandel [8:48 AM]
Is there a way in helm to only generate the cert if it's not there already?

Cyril Tovena [8:48 AM]
that's why I went for a recreate
I did not search more info, they have a solution with sha

Mark Mandel [8:49 AM]
yeah, we could also add a faq/troubleshooting guide to the development guide for this

Cyril Tovena [8:49 AM]
exactly
I think we can live with this, let's just mention it somewhere

Mark Mandel [8:49 AM]
Actually - having a troubleshooting section would be good anyway

Cyril Tovena [8:49 AM]
definitively

@markmandel
Copy link
Collaborator

@Kuqd did your recent #315 solve this problem as well, since you added --recreate-pods to the install target?

@cyriltovena
Copy link
Collaborator Author

cyriltovena commented Aug 18, 2018

Yes indeed, but it’s fixed only if you use the make target to install!

@cyriltovena
Copy link
Collaborator Author

@cyriltovena
Copy link
Collaborator Author

cyriltovena commented Aug 18, 2018

so I could use an annotation to force the restart of the pod, but the drawback is I need to merge the controller.yaml and admissionregistration.yaml to access the generated cert.(I tried to use include function but it gets re-evaluated.)

At this point I realized that a timestamp as annotation would have the same result, since certificates are regenerated all the time. (There is no way to generate only at install time).

So I think we should add a timestamp in the controller annotations(only if cert are generated), as this would be less error prone than a troubleshooting section and recommend for production user in the helm documentation to generate their own certificates, in that case the problem is non-existent and restart are avoided.

WDYT ?

@markmandel
Copy link
Collaborator

This idea sounds good to me!

@markmandel markmandel added kind/bug These are bugs. area/user-experience Pertaining to developers trying to use Agones, e.g. SDK, installation, etc labels Aug 24, 2018
@cyriltovena cyriltovena self-assigned this Aug 26, 2018
@markmandel markmandel modified the milestones: 0.4.0, 0.5.0 Aug 28, 2018
markmandel added a commit to markmandel/agones that referenced this issue Sep 6, 2018
The developer guide didn't cover what to do "next" after compiling
Agones -- i.e. how to make changes and test them.

This PR is an effort to fix this. Since googleforgames#309 is completed, you
don't have to manually delete the agones-controller pod, so
this also becomes much simpler.

Closes googleforgames#308
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/user-experience Pertaining to developers trying to use Agones, e.g. SDK, installation, etc kind/bug These are bugs.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants