-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Authorization Controller for NodeClass #7571
Conversation
✅ Deploy Preview for karpenter-docs-prod ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Pull Request Test Coverage Report for Build 12773797859Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
@@ -93,7 +94,7 @@ func (c *CloudProvider) Create(ctx context.Context, nodeClaim *karpv1.NodeClaim) | |||
if nodeClassReady.IsFalse() { | |||
return nil, cloudprovider.NewNodeClassNotReadyError(stderrors.New(nodeClassReady.Message)) | |||
} | |||
if nodeClassReady.IsUnknown() { | |||
if nodeClassReady.IsUnknown() && ctx.Value("DryRun") == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we checking dry-run here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because during the initial reconciliation nodeclass readiness is unknown so it will fail without a check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we care about the dry-run value in that case? If the nodeclass has not been checked yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, this check feels odd to me too -- it's leaking details into the implementation and I'm wondering if there are ways to inject details into the providers themselves or into the API layer so that we don't have to leak this detail here
@@ -0,0 +1,70 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add an integration test for permission related failures?
Snapshot successfully published to
|
var errs []error | ||
//create checks createfleet, and describelaunchtemplate | ||
if _, err := n.cloudProvider.Create(ctx, nodeClaim); err != nil { | ||
errs = append(errs, fmt.Errorf("create: %w", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using "go.uber.org/multierr"
here
if err := n.instanceProvider.CreateTags(ctx, "mock-id", map[string]string{"mock-tag": "mock-tag-value"}); err != nil { | ||
errs = append(errs, fmt.Errorf("create tags: %w", err)) | ||
} | ||
if corecloudprovider.IsNodeClassNotReadyError(errors.Join(errs...)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we only set the condition to false
if the node class is not ready?
@@ -93,7 +94,7 @@ func (c *CloudProvider) Create(ctx context.Context, nodeClaim *karpv1.NodeClaim) | |||
if nodeClassReady.IsFalse() { | |||
return nil, cloudprovider.NewNodeClassNotReadyError(stderrors.New(nodeClassReady.Message)) | |||
} | |||
if nodeClassReady.IsUnknown() { | |||
if nodeClassReady.IsUnknown() && ctx.Value("DryRun") == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we care about the dry-run value in that case? If the nodeclass has not been checked yet
@@ -93,7 +94,7 @@ func (c *CloudProvider) Create(ctx context.Context, nodeClaim *karpv1.NodeClaim) | |||
if nodeClassReady.IsFalse() { | |||
return nil, cloudprovider.NewNodeClassNotReadyError(stderrors.New(nodeClassReady.Message)) | |||
} | |||
if nodeClassReady.IsUnknown() { | |||
if nodeClassReady.IsUnknown() && ctx.Value("DryRun") == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, this check feels odd to me too -- it's leaking details into the implementation and I'm wondering if there are ways to inject details into the providers themselves or into the API layer so that we don't have to leak this detail here
@@ -109,6 +110,9 @@ func (c *CloudProvider) Create(ctx context.Context, nodeClaim *karpv1.NodeClaim) | |||
} | |||
instance, err := c.instanceProvider.Create(ctx, nodeClass, nodeClaim, tags, instanceTypes) | |||
if err != nil { | |||
if awserrors.IsUnauthorizedError(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking: Is this here to handle the case where we get unauthorized and that causes us to fail the launch -- is the idea here that eventually the controller will reconcile and should block further launches (similar to what we do with other validation checks, where we expect the reconciliation to eventually catch-up and mark the NodeClass as NotReady)?
errs = append(errs, fmt.Errorf("create: %w", err)) | ||
} | ||
|
||
if err := n.instanceProvider.Delete(ctx, "mock-id"); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we checking delete when it comes to determining whether we can launch an instance?
errs = append(errs, fmt.Errorf("create: %w", err)) | ||
} | ||
|
||
if err := n.instanceProvider.Delete(ctx, "mock-id"); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're injecting dry-run into the context, which feels a little bit "hacky" to me -- stuffing values into context is always going to make the code handling that context a little awkward -- I wonder what happens if you wrap the EC2API with something like EC2DryRunAPI and it implements the EC2API but automatically mutates the call to pass the dry-run flag -- now you can inject this API into the providers and you don't have to mess with the context at all
Co-authored-by: Andrii Omelianenko <[email protected]>
…actions-deps group (aws#7612) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: APICodeGen <[email protected]>
Fixes #N/A
Description
A authorization check that checks for the proper permissions before marking the nodeclass ready
How was this change tested?
make presubmit
Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.