Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add cardinality stat to each node type. #1221

Merged
merged 37 commits into from
Mar 6, 2017
Merged

WIP: Add cardinality stat to each node type. #1221

merged 37 commits into from
Mar 6, 2017

Conversation

desa
Copy link
Contributor

@desa desa commented Feb 23, 2017

This PR introduces a new stat cardinality that tracks the number of groups that exist for a given node for each relevant node type.

  • Add new tests for cardinality tracking
  • Fill out k8s cardinality test
Required for all non-trivial PRs
  • Rebased/mergable
  • Tests pass
  • CHANGELOG.md updated

Copy link
Contributor

@nathanielc nathanielc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set the stat in the node.go base type so that all nodes have the counter but it remains zero by default. Then each node can replace the value with their correct IntFuncGague?

Not sure that is a great idea but I want to make sure nodes are always consistent when it comes to having these standard stats.

expvar/expvar.go Outdated
@@ -46,6 +46,26 @@ func (v *Int) IntValue() int64 {
return atomic.LoadInt64(&v.i)
}

// IntGauge is a 64-bit integer variable that satisfies the expvar.Var interface.
type IntFuncGauge struct {
ValueF func() int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be func() int64 so that the function itself can handle the casting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup

expvar/expvar.go Outdated
@@ -46,6 +46,26 @@ func (v *Int) IntValue() int64 {
return atomic.LoadInt64(&v.i)
}

// IntGauge is a 64-bit integer variable that satisfies the expvar.Var interface.
type IntFuncGauge struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the comment for the name IntFuncGague?

@desa
Copy link
Contributor Author

desa commented Feb 23, 2017

My only hesitation for putting it on the node is that its not relevant for all nodes. I suppose having a cardinality=0 for those nodes isn't that bad though.

I'll do that

@desa
Copy link
Contributor Author

desa commented Feb 27, 2017

@nathanielc assuming that latest commit goes green, it should be in a reviewable state.

There's a bit of ugliness around setting the n.nodeCardinality.ValueF because of a race.

@desa desa added review and removed in progress labels Feb 28, 2017
Copy link
Contributor

@nathanielc nathanielc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you solve the race in setting ValueF? I have an idea of how it could work. I commented below.

node.go Outdated

nodeErrors *kexpvar.Int
nodeErrors *kexpvar.Int
nodeCardinality *kexpvar.IntFuncGauge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that this would not be a struct level field. Rather in the init method you would set a *kexpvar.Int var that is always zero. Then in each node they would call n.statsMap.Set(statCardinalityGague, <create their own IntFuncGague>) this way you avoid the race but still have a default 0 value for nodes that are always zero.

There is a tiny bit of waste creating the initial kexpvar.Int value since in most cases it is replaced but the GC will take care of that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works and avoids the locking nonsense. I'll do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nathanielc fixed.

alert.go Outdated
@@ -558,7 +571,10 @@ func (a *AlertNode) runAlert([]byte) error {
var highestPoint *models.BatchPoint

var currentLevel alert.Level
if state, ok := a.states[b.Group]; ok {
a.cardinalityMu.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is common and references a field on the alert node I think it would be best to wrap this into a small function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this pattern is pretty common throughout, should I add it to the other nodes that have this pattern as well?

combine.go Outdated
@@ -60,6 +65,14 @@ func (t timeList) Less(i, j int) bool { return t[i].Before(t[j]) }
func (t timeList) Swap(i, j int) { t[i], t[j] = t[j], t[i] }

func (n *CombineNode) runCombine([]byte) error {
valueF := func() int64 {
n.cardinalityMu.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we name this expressionsByGroupMu? I have typically named mutexes using the convention Mu so that it is obvious, or just mu if it is global to the entire struct.

Thoughts? Maybe we leave the name alone but above in the struct place the cardinalityMu next to the things being locked so its clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of naming it after the thing its locking. The reason I named it cardinalityMu was so that it would be universal across each of the nodes.

sample.go Outdated
@@ -73,10 +85,14 @@ func (s *SampleNode) shouldKeep(group models.GroupID, t time.Time) bool {
keepTime := t.Truncate(s.duration)
return t.Equal(keepTime)
} else {
s.cardinalityMu.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we should just get the write lock for the whole section here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

desa added 21 commits March 2, 2017 15:45
This commit only tracks JoinNode.groups, it might be worthwhile to add
stats for the following:

// Buffer for caching points that need to be matched with specific points.
matchGroupsBuffer map[models.GroupID][]srcPoint
// Buffer for caching specific points until their match arrivces.
specificGroupsBuffer map[models.GroupID][]srcPoint
We could consider finer grained locking inside of evalExpr, but I
figured grabbing a write lock around the whole thing would be
fine.
+	k.cardinalityMu.Lock()
 	newReplicas, err := k.evalExpr(state.current, group, k.k.Replicas, k.replicasExprs, k.replicasScopePool, t, fields, tags)
+	k.cardinalityMu.Unlock()
alert.go Outdated
@@ -580,7 +591,8 @@ func (a *AlertNode) runAlert([]byte) error {
var highestPoint *models.BatchPoint

var currentLevel alert.Level
if state, ok := a.states[b.Group]; ok {
state, ok := a.getAlertState(b.Group)
if ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can go back inside the if statement now.

alert.go Outdated
@@ -488,7 +498,8 @@ func (a *AlertNode) runAlert([]byte) error {
return err
}
var currentLevel alert.Level
if state, ok := a.states[p.Group]; ok {
state, ok := a.getAlertState(p.Group)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@desa desa merged commit 20627b3 into master Mar 6, 2017
@desa desa deleted the md-issue#1173 branch March 6, 2017 16:26
@desa desa removed the review label Mar 6, 2017
@desa desa restored the md-issue#1173 branch March 6, 2017 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants