Skip to content
This repository has been archived by the owner on Aug 30, 2019. It is now read-only.

Introduce new writers to API v0.2 #343

Merged
merged 9 commits into from
Jan 11, 2018
Merged

Introduce new writers to API v0.2 #343

merged 9 commits into from
Jan 11, 2018

Conversation

LotharSee
Copy link

@LotharSee LotharSee commented Dec 18, 2017

Split the Writer into 3 different writers, writing to 3 new endpoints:

  • /traces, TraceWriter: new protobuf endpoint. Only contains traces (and soon transactions).
  • /stats, StatsWriter: similar to the previous AgentPayload but without the traces.
  • /services, ServiceWriter: similar to before. But now a writer on its own.

They are split as they differ in terms of encoding, compression, replay logic, ...

To make that work properly, there is some heavy rewiring.

  • The trigger of a flush no longer comes from agent/agent.go. Traces and services are sent to their writer as soon as processed. The concentrator generates and flush StatsBuckets itself to the writer. Then writer are free to send it over the wire when/how they want (allowing future different flush conditions / replay features).
  • Revamp Endpoint API to be much simpler, simply writing bytes with HTTP headers. Also extract the client/proxy configuration.
  • New TracePayload is our first protobuf payload. For now it isn't compressed, we should benchmarks options before picking a compression.
  • Many lines changed around info to make it work with the new metrics / writers.

TODOs before being able to merge it:

  • figure out encoding for the trace endpoint
  • heavy real-world testing with compatible server trace api

@LotharSee LotharSee requested review from AlexJF and talwai December 18, 2017 14:53
Copy link

@AlexJF AlexJF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little less than halfway but leaving some nitpick comments already.

agent/agent.go Outdated
@@ -60,41 +60,62 @@ type Agent struct {
// NewAgent returns a new Agent object, ready to be started
func NewAgent(conf *config.AgentConfig, exit chan struct{}) *Agent {
dynConf := config.NewDynamicConfig()

// inter-component channels
rawTraceChan := make(chan model.Trace, 5000) // about 1000 traces/sec for 5 sec
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we take this opportunity to remove some copies here too (chan *model.Trace)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

model.Trace being a slice it isn't such a big deal.

https://github.com/golang/go/wiki/CodeReviewComments#receiver-type
But it falls under the:

If the receiver is a struct, array or slice and any of its elements is a pointer to something that might be mutating, prefer a pointer receiver, as it will make the intention more clear to the reader.

So it is mostly to make the code more explicit ; I'll leave it as a comment.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True about it just being a slice. Hadn't noticed. Comment was mostly because sampledTraceChan indeed uses *model.Trace and it felt a bit weird to have both styles.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally ; I'm not sure yet about what is the best there / I'd rather take enough time to make the right decision. Left it as a comment, it can be a later task to review these questions across the entire Agent.

}
sort.Strings(c.aggregators)
return &c
}

// Start starts the writer.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starts the concentrator

@@ -148,7 +148,7 @@ func runAgent(exit chan struct{}) {

if opts.info {
if err := info.Info(os.Stdout, agentConf); err != nil {
// need not display the error, Info should do it already
os.Stdout.WriteString(fmt.Sprintf("failed to print info: %s\n", err))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less "should" more action 👍 😄

@@ -71,8 +71,6 @@ type HTTPReceiver struct {
func NewHTTPReceiver(conf *config.AgentConfig, dynConf *config.DynamicConfig) *HTTPReceiver {
// use buffered channels so that handlers are not waiting on downstream processing
return &HTTPReceiver{
traces: make(chan model.Trace, 5000), // about 1000 traces/sec for 5 sec
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be passing these by argument. Technically this is a constructor function so whatever comes out should be working without needing to modify unexported fields (IMO)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point! It both makes the code more explicit and avoid sneaky bugs. Updated.

agent/sampler.go Outdated
// NewScoreEngine creates a new empty sampler ready to be started
func NewScoreEngine(conf *config.AgentConfig) *Sampler {
// NewScoreSampler creates a new empty sampler ready to be started
func NewScoreSampler(conf *config.AgentConfig) *Sampler {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, it kinda feels cleaner to me if we would pass sampled as an argument to the constructors.

agent/sampler.go Outdated
// NewPriorityEngine creates a new empty distributed sampler ready to be started
func NewPriorityEngine(conf *config.AgentConfig, dynConf *config.DynamicConfig) *Sampler {
// NewPrioritySampler creates a new empty distributed sampler ready to be started
func NewPrioritySampler(conf *config.AgentConfig, dynConf *config.DynamicConfig) *Sampler {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

agent/sampler.go Outdated
switch s.engine.(type) {
case *sampler.ScoreEngine:
info.UpdateSamplerInfo(info.SamplerInfo{EngineType: engineType, Stats: stats, State: state})
case *sampler.PriorityEngine:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be cleaner as methods of engine. Let method resolution do its work.


// StatsPayload represents the payload to be flushed to the stats endpoint
type StatsPayload struct {
HostName string `json:"hostnamed"`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo hostname ?

return err
}

// Set API key in the header and issue the request
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i realize this is copy-paste, but the comment disagrees with the code. We still send the api key as a query param rather than through headers.

Since the code is being touched maybe leave a TODO to really move it to HTTP headers. The new backend endpoints should support this as the main way to auth.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LotharSee LotharSee force-pushed the benjamin/new-writers branch 2 times, most recently from e5bbbf1 to 2575975 Compare December 19, 2017 09:53
@LotharSee LotharSee force-pushed the benjamin/new-writers branch from 5ae1eb1 to 1e367ac Compare December 19, 2017 10:33

receiver.traces = rawTraceChan
receiver.services = serviceChan
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant with arguments being passed to constructor.

Copy link

@AlexJF AlexJF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through it all. Besides the nitpick, only other observation is that we seem to have a lot of duplication around the writers (the whole start/stop/run, switch waiting for data, flushing, etc.). Perhaps having a BaseWriter composed into all of the writers and have the BaseWriter encode that common logic could work? Anyway, I can also explore this as part of the logic for the async retry stuff :)

LGTM

uint64 traceID = 1;
repeated Span spans = 2;
int64 startTime = 6;
int64 endTime = 7;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra space

@LotharSee
Copy link
Author

Yes that's the goal: I tried to keep writers as independent/uncoupled as possible ; that way it won't constraint us when we will have more writer-specific code (different buffering, retrying, logging, metrics, ...). After that we will be able to see what's in common and share it! (even if, sometimes, we can't avoid some dupe in Go).

}

headers := map[string]string{
languageHeaderKey: strings.Join(info.Languages(), "|"),
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently updating our backend, as this header will be enough on the stats payload, not needed for service and traces.

@LotharSee LotharSee merged commit 33b1bab into master Jan 11, 2018
@dtilghman dtilghman deleted the benjamin/new-writers branch January 12, 2018 05:18
@palazzem palazzem added this to the 5.21.1 milestone Jan 27, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants