-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fluent Forward Receiver #1173
Fluent Forward Receiver #1173
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start, thanks.
My primary concern is about the usage of Append vs Resize and performance implications of that.
switch ev := event.(type) { | ||
case *protocol.MessageEvent: | ||
lr := convertMessageEvent(ev) | ||
logSlice.Append(lr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Append which calls Resize may be slow if we have large batches. Worth profiling for large batches (e.g. for thousands of events).
Alternatively, is it possible to use Resize upfront? Initially we know that the size is going to be in the ballpark of len(events)
, right?
If you use Resize you can't use Append anymore, you will need to use At() and track the index yourself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed Append to be as efficient as normal slice append
is in Go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is good that Append is fast however convertMessageEvent still returns a pointer, so we make a separate allocation per log record. If we call Resize before the for
loop there will be a single allocation for the entire batch.
Again, I do not think having an Append is a good idea generally for slice creation. It allows/encourages to write code that makes more allocations than the Resize version which forces to combine the allocations. I may be missing something, please correct me if I am wrong.
We could merge this PR as is since Append is likely not going to affect performance a lot in this case but I am worried that the Append-based approach will bite us in the future and we will not be able to remove Append due to compatibility reasons. Because of that unless you have other reasons to keep Append I think it is best to remove it although its impact in this particular case may be negligible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Append is now only used where the size of events cannot be determined upfront. Without Append in those cases, you would have to parse all of the events to some big slice of intermediate data structures and then do the Resize at the end, which is almost certainly going to result in worse performance due to all of the allocations of the intermediate data structures that would get thrown away.
} | ||
|
||
func convertMessageEvent(me *protocol.MessageEvent) *pdata.LogRecord { | ||
lr := pdata.NewLogRecord() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewLogRecord is slow since it does an allocation.
It is faster to accept as a function parameter the LogRecord that is returned by At() in the caller and populate it here instead of creating, returning it and Append-ing it. This is the intended way to use the API. The current usage will result in suboptimal performance and it is enabled by Append, so I would drop Append to avoid provoking such usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new implementation of Append will be roughly the same since it will only call NewLogRecord once now.
Codecov Report
@@ Coverage Diff @@
## master #1173 +/- ##
==========================================
- Coverage 90.50% 90.07% -0.44%
==========================================
Files 219 231 +12
Lines 15521 16033 +512
==========================================
+ Hits 14047 14441 +394
- Misses 1060 1136 +76
- Partials 414 456 +42
Continue to review full report at Codecov.
|
@@ -0,0 +1,25 @@ | |||
package v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think opentelemetry-proto-gen
directory contains generated files only. Can we move this file somewhere else?
|
||
// Configures the receiver to use TLS. | ||
// The default value is nil, which will cause the receiver to not use TLS. | ||
// TLSCredentials *configtls.TLSSetting `mapstructure:"tls, omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this going to be uncommented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, removing it for now. We can add back later if people really want TLS support but given the target use case of localhost traffic it seems excessive.
switch ev := event.(type) { | ||
case *protocol.MessageEvent: | ||
lr := convertMessageEvent(ev) | ||
logSlice.Append(lr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is good that Append is fast however convertMessageEvent still returns a pointer, so we make a separate allocation per log record. If we call Resize before the for
loop there will be a single allocation for the entire batch.
Again, I do not think having an Append is a good idea generally for slice creation. It allows/encourages to write code that makes more allocations than the Resize version which forces to combine the allocations. I may be missing something, please correct me if I am wrong.
We could merge this PR as is since Append is likely not going to affect performance a lot in this case but I am worried that the Append-based approach will bite us in the future and we will not be able to remove Append due to compatibility reasons. Because of that unless you have other reasons to keep Append I think it is best to remove it although its impact in this particular case may be negligible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except the comment about tests.
otlpcommon "go.opentelemetry.io/collector/internal/data/opentelemetry-proto-gen/common/v1" | ||
logsproto "go.opentelemetry.io/collector/internal/data/opentelemetry-proto-gen/logs/v1" | ||
otlpresource "go.opentelemetry.io/collector/internal/data/opentelemetry-proto-gen/resource/v1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not advised to use otlp structs directly in component code or int component tests. OTLP structs may change over time (e.g. by using custom gogoproto data types). We provide wrappers for the purpose of encapsulating and isolating component code from such changes.
I highly advise to rewrite these tests not to use otlp structs at all. Work with pdata only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I added a test util to easily construct logs in a declarative manner using the pdata methods only.
Also @tigrannajaryan, wherever the size of the log records can be predetermined I got rid of the use of Append, but it is still in a few places where the event count cannot be determined upfront due to the msgpack protocol (e.g. packed forward events). |
This can receive logs in the [Fluent Forward Protocol](https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1) via either TCP or Unix sockets. It supports all of the required features of the spec. It currently does not support the optional handshake aspect of the protocol, but will acknowledge chunk options per the spec. All log entries received within a short time window of each other will be put into the same ResourceLog instance before being emitted to the next consumer in the pipline. The Resource object is currently not populated in any way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great job.
Do you feel that more coverage is needed anywhere?
@tigrannajaryan please make sure new code is added with the right coverage. Was a hard effort for me to improve it to this level |
* [demo] Increase Kafka memory limit and bump charts * Apply suggestion * Update examples
This can receive logs in the Fluent Forward Protocol
via either TCP or Unix sockets. It supports all of the required features of
the spec.
It currently does not support the optional handshake aspect of the protocol,
but will acknowledge chunk options per the spec.
All log entries received within a short time window of each other will be put
into the same ResourceLog instance before being emitted to the next consumer in
the pipline. The Resource object is currently not populated in any way.