Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add event recorder to show chaos status. #245

Merged
merged 22 commits into from
Feb 25, 2020

Conversation

AngleNet
Copy link
Contributor

@AngleNet AngleNet commented Feb 16, 2020

What problem does this PR solve?

#162 Add event recorder to show chaos status after we have applied a resource.

What is changed and how does it work?

  • Support event recorder for chaos starting and completion

Check List

Tests

  • Manual test (add detailed scripts or steps below)

Code changes

  • Has Go code change

Side effects

  • No

Related changes

  • Changes of signature of schedule reconciler.

Does this PR introduce a user-facing change?:

Add event recorder for chaos start and completion

@AngleNet AngleNet changed the title [WIP] Add event recorder to show chaos status. [DRAFT][WIP] Add event recorder to show chaos status. Feb 16, 2020
@codecov-io
Copy link

codecov-io commented Feb 16, 2020

Codecov Report

Merging #245 into master will increase coverage by 7.77%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #245      +/-   ##
==========================================
+ Coverage    52.9%   60.67%   +7.77%     
==========================================
  Files          34       36       +2     
  Lines        1603     2014     +411     
==========================================
+ Hits          848     1222     +374     
- Misses        668      698      +30     
- Partials       87       94       +7
Impacted Files Coverage Δ
controllers/timechaos_controller.go 0% <0%> (ø) ⬆️
pkg/chaosfs/server.go 82.25% <0%> (ø)
pkg/chaosfs/hook.go 96.51% <0%> (ø)
controllers/iochaos_controller.go 92.85% <0%> (+2.38%) ⬆️
controllers/podchaos_controller.go 93.1% <0%> (+2.62%) ⬆️
controllers/networkchaos_controller.go 93.1% <0%> (+2.62%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc904ad...db5c56c. Read the comment docs.

@AngleNet
Copy link
Contributor Author

HI. Currently I only add event recorder for the controller-manager. Its events are still missing. I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

@AngleNet
Copy link
Contributor Author

Another problem is that shall we keep track of normal execution of a chaos action? Should the event report the chaos progress?

@zhouqiang-cl
Copy link
Contributor

@AngleNet thank you, @cwen0 @Yisaer PTAL

@zhouqiang-cl
Copy link
Contributor

Another problem is that shall we keep track of normal execution of a chaos action?

I think we should

Should the event report the chaos progress?

I think the scheduler chaos can not show the progress, we did not know when to end

I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

@Yisaer @cwen0 @YangKeao what's your oponion?

@cwen0
Copy link
Member

cwen0 commented Feb 17, 2020

Another problem is that shall we keep track of normal execution of a chaos action? Should the event report the chaos progress?

In my option, I think we can record the event when the chaos starts and ends.

@cwen0
Copy link
Member

cwen0 commented Feb 17, 2020

HI. Currently I only add event recorder for the controller-manager. Its events are still missing. I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

HI. Currently I only add event recorder for the controller-manager. Its events are still missing. I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

About this question, If we gather all of the events around the chaos resource, the number of events will be very large and confusing. In my option, I think we can spread the events to the owned resources.

@AngleNet
Copy link
Contributor Author

HI. Currently I only add event recorder for the controller-manager. Its events are still missing. I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

HI. Currently I only add event recorder for the controller-manager. Its events are still missing. I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

About this question, If we gather all of the events around the chaos resource, the number of events will be very large and confusing. In my option, I think we can spread the events to the owned resources.

Make sense ! How about we gather events about chaos actions for begining and ending around the chaos resource? For the events from resources operated by chaos actions, we should let it untouched and K8s will generate events for them. Users can check them if they want.

@cwen0
Copy link
Member

cwen0 commented Feb 17, 2020

HI. Currently I only add event recorder for the controller-manager. Its events are still missing. I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

HI. Currently I only add event recorder for the controller-manager. Its events are still missing. I wonder whether we should gather all of events around the chaos resource or spread the events to the owned resources. I prefer the previous one.

About this question, If we gather all of the events around the chaos resource, the number of events will be very large and confusing. In my option, I think we can spread the events to the owned resources.

Make sense ! How about we gather events about chaos actions for begining and ending around the chaos resource? For the events from resources operated by chaos actions, we should let it untouched and K8s will generate events for them. Users can check them if they want.

We can record an event when to apply chaos action and record the other one when to cancel the action

@cwen0 cwen0 removed the status/PTAL label Feb 18, 2020
@hound hound bot deleted a comment from zhouqiang-cl Feb 18, 2020
@hound hound bot deleted a comment from AngleNet Feb 18, 2020
@hound hound bot deleted a comment from AngleNet Feb 18, 2020
@hound hound bot deleted a comment from AngleNet Feb 18, 2020
@hound hound bot deleted a comment from zhouqiang-cl Feb 18, 2020
@hound hound bot deleted a comment from cwen0 Feb 18, 2020
@hound hound bot deleted a comment from AngleNet Feb 18, 2020
@Yisaer
Copy link
Contributor

Yisaer commented Feb 20, 2020

As ChaosMesh users needs to know the every timestamp for performing and recovering the chaos action, I suggest that to add event in the certain chaos action controller like following logic:
In podfailure/types.go

// Apply implements the reconciler.InnerReconciler.Apply
func (r *Reconciler) Apply(ctx context.Context, req ctrl.Request, obj reconciler.InnerObject) error {

// Perform chaos action logic ...

r.Recorder.Event(chaos, v1.EventTypeNormal, utils.EventChaosPerformed, "podfaulre chaos perfromed,affect 3 pods")
}

// Recover implements the reconciler.InnerReconciler.Recover
func (r *Reconciler) Recover(ctx context.Context, req ctrl.Request, obj reconciler.InnerObject) error {

// Recovery chaos action logic

r.Recorder.Event(chaos, v1.EventTypeNormal, utils.EventChaosRecovered, "podfaulre chaos recovered, recover 3 pods")
}

In this way, the event would be sent for the podfailure chaos object each time it get performed and recovered without caring where the calling from (schedule chaos or common chaos) I think this would help you. @AngleNet

@AngleNet
Copy link
Contributor Author

Yes. It’s a good place for handling chaos starting and recovering event. However, I think we could still record failing events in the entry reconciler.

@Yisaer
Copy link
Contributor

Yisaer commented Feb 21, 2020

@AngleNet Remaining failure event in the entry reconciler is ok to me.

@zhouqiang-cl
Copy link
Contributor

@AngleNet thank you very much, we will take a look as soon as possible

Copy link
Contributor

@Yisaer Yisaer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest lgtm.

controllers/iochaos/fs/types.go Outdated Show resolved Hide resolved
controllers/iochaos/fs/types.go Outdated Show resolved Hide resolved
Comment on lines +56 to +63
if err != nil {
if !chaos.IsDeleted() {
r.Event(chaos, v1.EventTypeWarning, utils.EventChaosInjectFailed, err.Error())
} else {
r.Event(chaos, v1.EventTypeWarning, utils.EventChaosRecoverFailed, err.Error())
}
}
return result, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic seems weird here. If the reconciliation failed, I suggest send event with reconciliation failed event type and add the error message into event message. It is not necessary to judge which kind of chaos action (inject or recover)

controllers/networkchaos/netem/types.go Outdated Show resolved Hide resolved
@@ -88,6 +94,7 @@ func (r *Reconciler) Apply(ctx context.Context, req ctrl.Request, chaos reconcil
return err
}

r.Event(networkchaos, v1.EventTypeNormal, utils.EventChaosToInject, "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

controllers/podchaos_controller.go Show resolved Hide resolved
@@ -131,6 +127,7 @@ func (r *Reconciler) Recover(ctx context.Context, req ctrl.Request, chaos reconc
return err
}

r.Event(timechaos, v1.EventTypeNormal, utils.EventChaosToRecover, "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

controllers/timechaos/types.go Outdated Show resolved Hide resolved
Comment on lines +58 to +63
if err != nil {
if !chaos.IsDeleted() {
r.Event(chaos, v1.EventTypeWarning, utils.EventChaosInjectFailed, err.Error())
} else {
r.Event(chaos, v1.EventTypeWarning, utils.EventChaosRecoverFailed, err.Error())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

pkg/utils/events.go Show resolved Hide resolved
Copy link
Member

@cwen0 cwen0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhouqiang-cl
Copy link
Contributor

@Yisaer Please take a final look 😝

@zhouqiang-cl zhouqiang-cl merged commit fe22eae into chaos-mesh:master Feb 25, 2020
sjwsl pushed a commit to sjwsl/chaos-mesh that referenced this pull request May 6, 2021
Add event recorder to show chaos status
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants