Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BaseboardManagement events and conditions #5

Merged
merged 7 commits into from
May 24, 2022

Conversation

pokearu
Copy link
Contributor

@pokearu pokearu commented May 18, 2022

Description

Modified BaseboardManagement conditions to have a Status .
True represents the condition is active and False indicates the condition was not met and message contains the reason.

Added support for events. If the BaseboardManagement reconciler failed to perform BMC actions, an event is set on the BaseboardManagement object with an error message.

Why is this needed

This is follow up from a discussion on pull/3.
It helps model failed actions as events that can be viewed by the user.

How Has This Been Tested?

Created a BMC user with Readonly permissions and tested failure.

Checklist:

I have:

  • updated the documentation and/or roadmap (if required)
  • added unit or e2e tests
  • provided instructions on how to upgrade

@codecov-commenter
Copy link

codecov-commenter commented May 18, 2022

Codecov Report

Merging #5 (7851441) into main (4a1c506) will increase coverage by 4.02%.
The diff coverage is 66.66%.

❗ Current head 7851441 differs from pull request most recent head d41046b. Consider uploading reports for the commit d41046b to get more accurate results

@@            Coverage Diff             @@
##             main       #5      +/-   ##
==========================================
+ Coverage   54.46%   58.49%   +4.02%     
==========================================
  Files           4        4              
  Lines         112      106       -6     
==========================================
+ Hits           61       62       +1     
+ Misses         42       37       -5     
+ Partials        9        7       -2     
Impacted Files Coverage Δ
controllers/baseboardmanagement_controller.go 72.94% <66.66%> (+5.90%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4a1c506...d41046b. Read the comment docs.

Copy link
Member

@chrisdoherty4 chrisdoherty4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial set of comments. I haven't looked at all the events we're triggering but will circle back with follow up shortly.

@@ -107,7 +110,7 @@ func (r *BaseboardManagementReconciler) Reconcile(ctx context.Context, req ctrl.
return r.reconcile(ctx, baseboardManagement, baseboardManagementPatch, logger)
}

func (r *BaseboardManagementReconciler) reconcile(ctx context.Context, bm *bmcv1alpha1.BaseboardManagement, bmPatch client.Patch, logger logr.Logger) (ctrl.Result, error) {
func (r *BaseboardManagementReconciler) reconcile(ctx context.Context, bm *bmcv1alpha1.BaseboardManagement, bmPatch client.Patch, logger logr.Logger) (_ ctrl.Result, reterr error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can drop the return parameter names - they're unnecessary and can incite side effects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually did this to aggregate the individual field reconciler errors and return a final one 147.

We could have a global error instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want a global error; avoid globals (I can elaborate offline).

We can just define the error variable instead of letting the return parameter declare the variable. This avoids confusion and accidental behavior. In general, avoid named returns. They're only really necessary when you return 2 values of the same type and even then you might consider adjusting your function first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. I just read what I wrote again 🤦
I actually in my head meant a variable in the function scope that has the aggregated errors.
And removed the named returns 👍

reterr = utilerrors.NewAggregate([]error{err, reterr})
}

return result, reterr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return result, reterr
return result, utilerrors.Flatten(reterr)

@@ -119,7 +122,7 @@ func (r *BaseboardManagementReconciler) reconcile(ctx context.Context, bm *bmcv1
bmcClient, err := r.bmcClientFactory(ctx, bm.Spec.Connection.Host, strconv.Itoa(bm.Spec.Connection.Port), username, password)
if err != nil {
logger.Error(err, "BMC connection failed", "host", bm.Spec.Connection.Host)
result, setConditionErr := r.setCondition(ctx, bm, bmPatch, bmcv1alpha1.ConnectionError, err.Error())
result, setConditionErr := r.setCondition(ctx, bm, bmPatch, bmcv1alpha1.Connected, bmcv1alpha1.BaseboardManagementConditionFalse, err.Error())
Copy link
Member

@chrisdoherty4 chrisdoherty4 May 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we break this apart into 2 calls rather than bundling into 1 so it only does 1 thing and is consistent with reconciliation funcs. Right now it seems we're jumping between levels of abstraction.

if err != nil {
  // Mutate the BaseboardManagement type.
  r.setCondition(bm, bmcv1alpha1.Connected, bmcv1alpha1.BaseboardManagementConditionFalse, err.Error())

  // Apply the patch separately.
  if err := r.applyPatch(ctx, bmPatch, bm); err != nil {
     // Handle
  }
}

Its probably useful to define the setCondition on the BaseboardManagement type as a helper method also and just call that.

bm.SetCondition(bmc.v1alpha1.Connected, bmcv1alpha1.BaseboardManadementConditionFalse, err.Error())

And if we go the full mile, this method should have a variadic option set because the message is optional and we're forcing consumers to specify "" at the end if no message is required, which is nasty. Its a little more code, but a better API.

// api/v1alpha1 pkg

type BaseboardManagementSetConditionOption func(*BaseboardManagementCondition)

func (bmc BaseboardManagement) SetCondition(condition BaseboardManagementConditionType, status BaseboardManagementConditionStatus, opt ...BaseboardManagementSetConditionOption) error {
  ...
}

func WithConditionMessage(m string) BaseboardManagementSetConditionOption {
  return func(c *BaseboardManagementCondition) {
    c.Message = m
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good 👍 I like the idea of having SetCondition on the types itself and having the options.
Will follow a similar model for all the types, it makes it simpler.

@@ -165,6 +175,7 @@ func (r *BaseboardManagementReconciler) reconcilePower(ctx context.Context, bm *
// Setting baseboard management to desired power state
_, err = bmcClient.SetPowerState(ctx, string(bm.Spec.Power))
if err != nil {
r.recorder.Event(bm, corev1.EventTypeWarning, "SetPowerState", fmt.Sprintf("failed to set power state: %v", err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can rejig the warnings a little.

Suggested change
r.recorder.Event(bm, corev1.EventTypeWarning, "SetPowerState", fmt.Sprintf("failed to set power state: %v", err))
r.recorder.Eventf(bm, corev1.EventTypeWarning, EventSetPowerStateFailed, "reconciling power: %v", err)

Then define some event constants.

const (
  EventGetPowerStateFailed = "GetPowerStateFailed"
  EventSetPowerStateFailed = "SetPowerStateFailed"
)

@pokearu pokearu force-pushed the bm-events-conditions branch from 6bc2d77 to aa8d7bd Compare May 21, 2022 09:08
Copy link
Member

@chrisdoherty4 chrisdoherty4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Some simplifications and clarifications particularly for the comment about continuously updating the condition.

Comment on lines 118 to 139
currentConditions := bmj.Status.Conditions
for i := range currentConditions {
// If condition exists, update
if currentConditions[i].Type == cType {
bmj.Status.Conditions[i].Status = status
for _, opt := range opts {
opt(&currentConditions[i])
}
return
}
}

// Append new condition to Conditions
condition := BaseboardManagementCondition{
Type: cType,
Status: status,
}
for _, opt := range opts {
opt(&condition)
}

bmj.Status.Conditions = append(bmj.Status.Conditions, condition)
Copy link
Member

@chrisdoherty4 chrisdoherty4 May 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm can be simplified a little.

Suggested change
currentConditions := bmj.Status.Conditions
for i := range currentConditions {
// If condition exists, update
if currentConditions[i].Type == cType {
bmj.Status.Conditions[i].Status = status
for _, opt := range opts {
opt(&currentConditions[i])
}
return
}
}
// Append new condition to Conditions
condition := BaseboardManagementCondition{
Type: cType,
Status: status,
}
for _, opt := range opts {
opt(&condition)
}
bmj.Status.Conditions = append(bmj.Status.Conditions, condition)
var condition *BaseboardManagementCondition
// Check if there's an existing condition.
for i, c := range bmj.Status.Conditions {
if c.Type == cType {
condition = &bmj.Status.Conditions[i]
break
}
}
// We didn't find an existing condition so create a new one and append it.
if condition == nil {
c := BaseboardManagementCondition{Type: cType}
bmj.Status.Conditions = append(bmj.Status.Conditions, c)
condition = &bmj.Status.Conditions[len(bmj.Status.Conditions)-1)
}
condition.Status = status
for _, opt := range opts {
opt(condition)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice! Thank you 👍

api/v1alpha1/baseboardmanagement_types.go Outdated Show resolved Hide resolved
api/v1alpha1/baseboardmanagement_types.go Outdated Show resolved Hide resolved
result, setConditionErr := r.setCondition(ctx, bm, bmPatch, bmcv1alpha1.ConnectionError, err.Error())
if setConditionErr != nil {
return result, utilerrors.NewAggregate([]error{fmt.Errorf("failed to set conditions: %v", setConditionErr), err})
bm.SetCondition(bmcv1alpha1.Connected, bmcv1alpha1.BaseboardManagementConditionFalse, bmcv1alpha1.WithJobConditionMessage(err.Error()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we want to update this every reconciliation call? This means the condition doesn't represent a particular state at a point in time that we've passed through and instead represents the live state. Live state is typically populated directly on the Status field, not as a condition.

@micahhausler

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted offline and concluded conditions are flexible. Some conditions like the Node types DiskPressure condition are continually updated while others like the Pod types Initialized or Ready conditions are updated as the condition becomes true and we pass that point in time.

We're looking to maintain continuous updating of the condition on reconciliation requests and changing the condition name to Contactable to clarify that its stateless in so far as there's no session state maintained from 1 update to the next.

Copy link
Member

@chrisdoherty4 chrisdoherty4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to merge.

@pokearu pokearu mentioned this pull request May 23, 2022
3 tasks
@chrisdoherty4 chrisdoherty4 merged commit 13eef9a into tinkerbell:main May 24, 2022
@pokearu pokearu mentioned this pull request Jun 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants