-
Notifications
You must be signed in to change notification settings - Fork 329
Conversation
6f909bb
to
21ceba1
Compare
@odeke-em @rakyll still waiting on the spec to be merged (census-instrumentation/opencensus-specs#179) but what do you think of the approach? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thank you @Ramonza for working on this! I've posted some comments, PTAL.
stats/doc.go
Outdated
the current context. Tags from the current context are recorded with the | ||
measurements if they are any. | ||
|
||
Recorded measurements are dropped immediately if user is not aggregating | ||
them via views. Users don't necessarily need to conditionally enable/disable | ||
Recorded measurements are dropped immediately if no views are registered for then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/then/them/
but we could improve that sentence by saying
Recorded measurements are dropped immediately if they don't have any associated views.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the current phrasing is better since it is very specific in saying "registered" (which is the name of the function) while "associated" is more vague.
stats/view/aggregation_data.go
Outdated
Max float64 // max value in the distribution | ||
Mean float64 // mean of the distribution | ||
SumOfSquaredDev float64 // sum of the squared deviation from the mean | ||
// Deprecated: Replaced by Count in Buckets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps let's put this field at the bottom and a blank space before it to draw attention to it like this:
bounds []float64 // histogram distribution of the values
Buckets []Bucket
// Deprecated: Replaced by Count in Buckets.
CountPerBucket []int64 // Number of occurances per bucket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of breaking this API, can we just do the following?
CountPerBucket []int64 // number of occurrences per bucket
ExemplarPerBucket []*exemplar.Exemplar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I think this makes sense especially given that most Exporters won't care about exemplars.
stats/view/aggregation_data.go
Outdated
// sampled trace attachment, if neither have a trace attachment, pick the | ||
// one with more attachments. | ||
_, haveTraceID := e.Attachments["TraceID"] | ||
if haveTraceID || len(e.Attachments) >= len(b.Exemplar.Attachments) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we meant to use >
instead of >=
otherwise it'll always create a new Exempler if !haveTraceID && len(e.Attachments) == len(b.Exemplar.Attachments)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the idea is that we replace the existing exemplar with the newer one if they are equally "good" under the assumption that newer is better.
stats/view/aggregation_data.go
Outdated
c.CountPerBucket = counts | ||
buckets := make([]Bucket, len(a.Buckets)) | ||
copy(buckets, a.Buckets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only changes the slices but the underlying ExemplarData per bucket is the same
since the ExemplarData pointer is unchanged so this is a shallow copy. Copying CountsPerBucket copies int64s which are just values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but we don't expect anyone to mutate the map at this stage (it's only mutated when the exemplar is first created) so I think shallow copying in this case is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a type for attachments and added this as a note on the type godoc.
trace/exemplar.go
Outdated
|
||
func attachSpanContext(ctx context.Context, attachments map[string]string) map[string]string { | ||
span := FromContext(ctx) | ||
sc := span.SpanContext() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might result in a nil dereference on the first span(if SpanContext doesn't check that it is nil), but anyways we don't use it if span == nil. Please move it below after we've checked if span == nil
trace/exemplar_test.go
Outdated
if got, want := exemplar.Value, 1.5; got != want { | ||
t.Fatalf("exemplar.Value = %v; got %v", got, want) | ||
} | ||
if _, ok := exemplar.Attachments["TraceID"]; !ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please use ExemplarTraceID and ExemplerSpanID, etc.. as the keys in these tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the keys from the spec. I think since they map is already inside the exemplar, repeating the work "exemplar" doesn't really add anything.
exemplar/extractor.go
Outdated
// For example, a trace exemplar contains key-value pairs representing a | ||
// SpanContext and points to an example span that resulted in a given | ||
// measurement. | ||
type Exemplar struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am curious why we have two very similar structs Exemplar
and ExemplarData
which vary by only one field Value
, but also this definition of Exemplar seems only suited to traces yet we expect them to work for both traces and metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, will try to consolidate. The problem is that before they are aggregated into views each exemplar applies to multiple values (if you record multiple measures at a time) while after they are put into views you need to keep only a single associated value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
stats/view/aggregation_data.go
Outdated
|
||
// ExemplarData is an example data point associated with each bucket. | ||
type ExemplarData struct { | ||
Value float64 // the value that was recorded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I was actually thinking to comment about making this a float64, but you already did it.
|
||
for i, b := range a.bounds { | ||
if val < b { | ||
count = &a.CountPerBucket[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now wonder why bounds, Bucket aren't of type map[int]float64
.
I believe it would save on:
- Potential runtime crashes from out-of-bounds index access with slices
- Having to understand that the zeroth bucket is for the case when we don't have any bounds
- Save on the pointer manipulation
- Save on upfront memory allocations e.g. if we had a program using lots of views with DistributionData
- [index] access would automatically create zero values
To illustrate this point example this function would simply become
func (a *DistributionData) addToBucket(val float64, ex *exemplar.Exemplar) {
valBucketIndex := len(a.bounds)
for i, b := range a.bounds {
if val < b {
valBucketIndex = i
break
}
}
bucket := a.Buckets[valBucketIndex]
bucket.Count += 1
a.CountPerBucket[valBucketIndex] = bucket.Count
bucket.maybeRetainExemplar(val, ex)
}
Although perhaps it is too late to change the types of Count* and Bucket* but it is something to keep in mind for future designs :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is unrelated to the current PR, right?
21ceba1
to
472ee59
Compare
2ff4d8d
to
5828636
Compare
06cb921
to
fac56d1
Compare
exemplar/extractor.go
Outdated
// | ||
// Their purpose it to provide an example of the kind of thing | ||
// (request, RPC, trace span, etc.) that resulted in that measurement. | ||
package exemplar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a public API or should it be internal for now?
What are the public use cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Exemplar type is needed for exporters and the AttachmentExtractor is meant to be an extensible mechanism (see comments on the spec PR about ways to extract attachments from Tracestate).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more questions...
exemplar/extractor.go
Outdated
func AttachmentsFromContext(ctx context.Context) Attachments { | ||
var a Attachments | ||
for _, extractor := range extractors { | ||
a = extractor(ctx, a) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we return once we found something instead of keep looping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may have more than one extractor return valid attachments. For example, if there are tags in the context as well as a sampled trace.
stats/view/aggregation_data.go
Outdated
return | ||
} | ||
func (a *DistributionData) addToBucket(e *exemplar.Exemplar) { | ||
var ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var count *int64, bucket *Bucket
This grouping style is not very common in Go apart from the Docker repo and adds lots of whitespace.
exemplar/extractor.go
Outdated
KeyPrefixTag = "tag:" | ||
) | ||
|
||
// Exemplar is an example data point associated with each bucket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean examplar is only available for distribution aggregation? exemplar package don't have any bucket concepts. We might want to reword this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is only available for distribution aggregation. I will make that more explicit.
exemplar/extractor.go
Outdated
|
||
// Exemplar is an example data point associated with each bucket. | ||
type Exemplar struct { | ||
Value float64 // the value that was recorded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not ValueInt64 and ValueFloat64?
We currently internally cast everything to float64 but we can fix it in the future. It might be good not to expose this in public APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how it is in the metrics proto: https://github.com/census-instrumentation/opencensus-proto/blob/a74b002d33b4471aff7288afdb41393f6ff02ca1/src/opencensus/proto/metrics/v1/metrics.proto
) | ||
|
||
// AggregationData represents an aggregated value from a collection. | ||
// They are reported on the view data during exporting. | ||
// Mosts users won't directly access aggregration data. | ||
type AggregationData interface { | ||
isAggregationData() bool | ||
addSample(v float64) | ||
addSample(e *exemplar.Exemplar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that confuses this API is that exemplars only make sense in distribution aggregations but are represented in the core APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this is a bit odd. The reason for this diff (removing the v argument) is that it is repeated inside the exemplar.
Regarding the presence of exemplars in the core APIs (it being a top-level package), I am open to moving it under "stats" or even "stats/view". But then we would have a direct dependency from the trace package to a package under "stats".
stats/view/aggregation_data.go
Outdated
Max float64 // max value in the distribution | ||
Mean float64 // mean of the distribution | ||
SumOfSquaredDev float64 // sum of the squared deviation from the mean | ||
// Deprecated: Replaced by Count in Buckets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of breaking this API, can we just do the following?
CountPerBucket []int64 // number of occurrences per bucket
ExemplarPerBucket []*exemplar.Exemplar
stats/view/aggregation_data.go
Outdated
// Bucket represents a single bucket value in a distribution. | ||
type Bucket struct { | ||
Count int64 // Number of recorded measurements falling into this bucket. | ||
Exemplar *exemplar.Exemplar // Exemplar of this bucket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect one exemplar per bucket?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
59593cb
to
4769ca7
Compare
See: stats/Exemplars.md in opencensus-specs Fixes: census-instrumentation#873
@rakyll updated, PTAL |
@@ -151,7 +155,12 @@ func (cmd *recordReq) handleCommand(w *worker) { | |||
} | |||
ref := w.getMeasureRef(m.Measure().Name()) | |||
for v := range ref.views { | |||
v.addSample(cmd.tm, m.Value()) | |||
e := &exemplar.Exemplar{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a new exemplar
for every measurement
is very wrong. Going to revert this change.
Fixes: #873