-
Notifications
You must be signed in to change notification settings - Fork 490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add error counters to every node type #1175
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general framework looks good as well as the nodes you have already done.
So far I like what you have done where if an error count already exists you just double count it, once in the generic error counter and once in the specific counter. Let's continue along that path for now.
alert/HANDLERS.md
Outdated
@@ -208,6 +208,7 @@ Add the following snippet to the `service.go` file. | |||
// Handle takes an event and posts its message to the Foo service chat room. | |||
func (h *handler) Handle(event alert.Event) { | |||
if err := h.s.Alert(h.c.Room, event.State.Message); err != nil { | |||
h.incrementErrorCount() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look right as the handler is not a node and so will not have the incrementErrorCount method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol. May have went into autopilot and added this. Removed now.
node.go
Outdated
@@ -59,6 +60,9 @@ type Node interface { | |||
|
|||
emittedCount() int64 | |||
|
|||
// TODO: Better name | |||
incrementErrorCount() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think the name is fine.
node.go
Outdated
@@ -276,6 +282,11 @@ func (n *node) emittedCount() (count int64) { | |||
return | |||
} | |||
|
|||
// node increment error count increments a nodes error_count stat | |||
func (n *node) incrementErrorCount() { | |||
n.statMap.Add(statErrorCount, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid the map look up each time you should keep the errCountVar
from above on the node struct and then just increment it directly.
node.go
Outdated
@@ -20,6 +20,7 @@ import ( | |||
|
|||
const ( | |||
statAverageExecTime = "avg_exec_time_ns" | |||
statErrorCount = "error_count" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think errors
is a better name for the metric but I don't have a strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like errors
its consistent with the other nodes that have error counts as well.
I think I've hit every node that has The command I'm running to check is
|
I did have a specific question on what to do in this case here |
@nathanielc anything else that's missing from here? (other than the rebase and changelog addition) |
@nathanielc any thoughts on what I should do here 419f612#diff-d410d6dce1320d105297cb9419d065e3R603 |
join.go
Outdated
@@ -599,6 +600,8 @@ BATCH_POINT: | |||
} | |||
b, ok := batch.(models.Batch) | |||
if !ok { | |||
// TODO: determine what to do here. No reference to node is available, but | |||
// a possible error case exists. cant make call to incrementErrorCount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass the reference node down so it can be called.
@nathanielc Should be good to go. Let me know if there's anything else. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few small changes requested.
join.go
Outdated
@@ -570,7 +571,7 @@ func (js *joinset) JoinIntoPoint() (models.Point, bool) { | |||
} | |||
|
|||
// join all batches the set into a single batch | |||
func (js *joinset) JoinIntoBatch() (models.Batch, bool) { | |||
func (js *joinset) JoinIntoBatch(j *JoinNode) (models.Batch, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather the JoinNode get passed into the newJoinset
function and become a field on the joinset type instead of passing it into each function of the joinset type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
k8s_autoscale.go
Outdated
@@ -17,7 +17,7 @@ import ( | |||
const ( | |||
statsK8sIncreaseEventsCount = "increase_events" | |||
statsK8sDecreaseEventsCount = "decrease_events" | |||
statsK8sErrorsCount = "errors" | |||
statsK8sErrorsCount = "k8s_errors" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just remove the k8s_errors stats and use the one counter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, should I do that for the other ones as well? In particular the batch
and eval
nodes?
8d94eb5
to
adcd75d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code is good to go. Can you update the CHANGELOG entry to make note that the batch node and eval node error stat name changed and mark it as a BREAKING change.
Add error count to alert node Each instance of `E!` now increments the error count for the alert node. Add error count to batch node Each instance of `E!` now increments the error count. Add error count to combine node Each instance of `E!` now increments the error count for a combine node. Add error count to where node Each instance of `E!` now increments the error count Add error count to derivative node Each instance of `E!` now increments a the error count. Add error count to eval node Each instance of `E!` now increments the node error count. It should be noted that the eval node already tracks its own errors. Add error count to flatten node Each instance of `E!` now increments the nodes error count. Add error count to influxdb_out node Each instance of `E!` now increments the node's error count. Add error count to influxql node Each instance of `E!` now increments the nodes error count. Add error count to join node Note the TODO item. Add errors count to k8s autoscale node Note here that k8s already tracks it's own error count. Add error count to log node Each instance of `E!` now increments the error count Add error count to stream node Each insance of `E!` now increments the nodes error count Remove incrementErrorCount from handler Accidentally added it here. Change errors to k8s_errors in K8sAutoscaleNode Rename error_count to errors for node error count Add node errors to node struct Doing this prevents an unnecessary map lookup each time the error count is updated. Pass JoinNode pointer to joinset JoinIntoBatch To increment the nodes error count, a reference to the node must be passed down. Add entry to changelog Remove batch errors in favor of global node error Remove k8 errors in favor of global errors Remove JoinNode reference from JoinIntoBatch fn
Required for all non-trivial PRs