-
Notifications
You must be signed in to change notification settings - Fork 490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Add cardinality stat to each node type. #1221
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we set the stat in the node.go base type so that all nodes have the counter but it remains zero by default. Then each node can replace the value with their correct IntFuncGague?
Not sure that is a great idea but I want to make sure nodes are always consistent when it comes to having these standard stats.
expvar/expvar.go
Outdated
@@ -46,6 +46,26 @@ func (v *Int) IntValue() int64 { | |||
return atomic.LoadInt64(&v.i) | |||
} | |||
|
|||
// IntGauge is a 64-bit integer variable that satisfies the expvar.Var interface. | |||
type IntFuncGauge struct { | |||
ValueF func() int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be func() int64
so that the function itself can handle the casting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup
expvar/expvar.go
Outdated
@@ -46,6 +46,26 @@ func (v *Int) IntValue() int64 { | |||
return atomic.LoadInt64(&v.i) | |||
} | |||
|
|||
// IntGauge is a 64-bit integer variable that satisfies the expvar.Var interface. | |||
type IntFuncGauge struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the comment for the name IntFuncGague
?
My only hesitation for putting it on the node is that its not relevant for all nodes. I suppose having a I'll do that |
@nathanielc assuming that latest commit goes green, it should be in a reviewable state. There's a bit of ugliness around setting the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you solve the race in setting ValueF? I have an idea of how it could work. I commented below.
node.go
Outdated
|
||
nodeErrors *kexpvar.Int | ||
nodeErrors *kexpvar.Int | ||
nodeCardinality *kexpvar.IntFuncGauge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that this would not be a struct level field. Rather in the init method you would set a *kexpvar.Int var that is always zero. Then in each node they would call n.statsMap.Set(statCardinalityGague, <create their own IntFuncGague>)
this way you avoid the race but still have a default 0 value for nodes that are always zero.
There is a tiny bit of waste creating the initial kexpvar.Int value since in most cases it is replaced but the GC will take care of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works and avoids the locking nonsense. I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nathanielc fixed.
alert.go
Outdated
@@ -558,7 +571,10 @@ func (a *AlertNode) runAlert([]byte) error { | |||
var highestPoint *models.BatchPoint | |||
|
|||
var currentLevel alert.Level | |||
if state, ok := a.states[b.Group]; ok { | |||
a.cardinalityMu.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is common and references a field on the alert node I think it would be best to wrap this into a small function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this pattern is pretty common throughout, should I add it to the other nodes that have this pattern as well?
combine.go
Outdated
@@ -60,6 +65,14 @@ func (t timeList) Less(i, j int) bool { return t[i].Before(t[j]) } | |||
func (t timeList) Swap(i, j int) { t[i], t[j] = t[j], t[i] } | |||
|
|||
func (n *CombineNode) runCombine([]byte) error { | |||
valueF := func() int64 { | |||
n.cardinalityMu.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we name this expressionsByGroupMu? I have typically named mutexes using the convention Mu so that it is obvious, or just mu
if it is global to the entire struct.
Thoughts? Maybe we leave the name alone but above in the struct place the cardinalityMu next to the things being locked so its clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of naming it after the thing its locking. The reason I named it cardinalityMu
was so that it would be universal across each of the nodes.
sample.go
Outdated
@@ -73,10 +85,14 @@ func (s *SampleNode) shouldKeep(group models.GroupID, t time.Time) bool { | |||
keepTime := t.Truncate(s.duration) | |||
return t.Equal(keepTime) | |||
} else { | |||
s.cardinalityMu.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should just get the write lock for the whole section here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call.
This commit only tracks JoinNode.groups, it might be worthwhile to add stats for the following: // Buffer for caching points that need to be matched with specific points. matchGroupsBuffer map[models.GroupID][]srcPoint // Buffer for caching specific points until their match arrivces. specificGroupsBuffer map[models.GroupID][]srcPoint
We could consider finer grained locking inside of evalExpr, but I figured grabbing a write lock around the whole thing would be fine. + k.cardinalityMu.Lock() newReplicas, err := k.evalExpr(state.current, group, k.k.Replicas, k.replicasExprs, k.replicasScopePool, t, fields, tags) + k.cardinalityMu.Unlock()
alert.go
Outdated
@@ -580,7 +591,8 @@ func (a *AlertNode) runAlert([]byte) error { | |||
var highestPoint *models.BatchPoint | |||
|
|||
var currentLevel alert.Level | |||
if state, ok := a.states[b.Group]; ok { | |||
state, ok := a.getAlertState(b.Group) | |||
if ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can go back inside the if statement now.
alert.go
Outdated
@@ -488,7 +498,8 @@ func (a *AlertNode) runAlert([]byte) error { | |||
return err | |||
} | |||
var currentLevel alert.Level | |||
if state, ok := a.states[p.Group]; ok { | |||
state, ok := a.getAlertState(p.Group) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this one too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
This PR introduces a new stat
cardinality
that tracks the number of groups that exist for a given node for each relevant node type.Required for all non-trivial PRs