[filebeat][streaming] - Fix for streaming input handling of invalid or empty websocket messages #42036
… case of a websocket error
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
I think the proposed commit message needs a clearer explanation of what is being fixed; it's not the situation that's being fixed, it's the response to the situation, so describe that and say what the behaviour is and then how that is rectified.
CHANGELOG.next.asciidoc
@@ -195,6 +195,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Rate limiting fixes in the Okta provider of the Entity Analytics input. {issue}40106[40106] {pull}41583[41583]
- Redact authorization headers in HTTPJSON debug logs. {pull}41920[41920]
- Further rate limiting fix in the Okta provider of the Entity Analytics input. {issue}40106[40106] {pull}41977[41977]
- Fixed a scenario in the streaming input where CEL would process invalid/empty messages in case of a websocket error. {pull}42036[42036]
- Fixed a scenario in the streaming input where CEL would process invalid/empty messages in case of a websocket error. {pull}42036[42036]
- Fix streaming input handling of invalid or empty websocket messages. {pull}42036[42036]
@@ -116,6 +116,7 @@ func (s *websocketStream) FollowStream(ctx context.Context) error {
		return ctx.Err()
	default:
		_, message, err := c.ReadMessage()
		s.metrics.receivedBytesTotal.Add(uint64(len(message)))
Why move this up? Is a message valid when err is non-nil?
Shouldn't we still count the received bytes irrespective of whether it's a valid message or not, since this is a metric?
I think this depends on what the understood semantics of the metric are.
As this is not a core part of the change for this PR, I've reverted it for now and will take another look when I do some cleanups.
} else {
	state["response"] = message
	s.log.Debugw("received websocket message", logp.Namespace("websocket"), "msg", string(message))
	err = s.process(ctx, state, s.cursor, s.now().In(time.UTC))
	if err != nil {
		s.metrics.errorsTotal.Inc()
		s.log.Errorw("failed to process and publish data", "error", err)
		return err
	}
} else {
	state["response"] = message
	s.log.Debugw("received websocket message", logp.Namespace("websocket"), "msg", string(message))
	err = s.process(ctx, state, s.cursor, s.now().In(time.UTC))
	if err != nil {
		s.metrics.errorsTotal.Inc()
		s.log.Errorw("failed to process and publish data", "error", err)
		return err
	}
	continue
}
state["response"] = message
s.log.Debugw("received websocket message", logp.Namespace("websocket"), "msg", string(message))
err = s.process(ctx, state, s.cursor, s.now().In(time.UTC))
if err != nil {
	s.metrics.errorsTotal.Inc()
	s.log.Errorw("failed to process and publish data", "error", err)
	return err
}
but why does the error case not return if !isRetryableError(err)?
@efd6, it does return; the issue arises when the error is retryable. In that scenario the message is still passed to CEL during the retry attempts, so if the third attempt succeeds, the first two invalid messages have still been processed by CEL. This leads to downstream errors in integration pipelines because the event itself might be malformed or missing. It is also unnecessary processing that can be avoided.
Also I felt that an else block had better readability in this case since we already have a bunch of returns within the "if".
It does return, the issue is when it is a RetryableError
Sorry, you are correct.
Suggest then
if err != nil {
	s.metrics.errorsTotal.Inc()
	if !isRetryableError(err) {
		s.log.Errorw("failed to read websocket data", "error", err)
		return err
	}
	s.log.Debugw("websocket connection encountered an error, attempting to reconnect...", "error", err)
	// close the old connection and reconnect
	if err := c.Close(); err != nil {
		s.metrics.errorsTotal.Inc()
		s.log.Errorw("encountered an error while closing the websocket connection", "error", err)
	}
	// since c is already a pointer, we can reassign it to the new connection and the defer func will still handle it
	c, resp, err = connectWebSocket(ctx, s.cfg, url, s.log)
	handleConnectionResponse(resp, s.metrics, s.log)
	if err != nil {
		s.metrics.errorsTotal.Inc()
		s.log.Errorw("failed to reconnect websocket connection", "error", err)
		return err
	}
	continue
}
state["response"] = message
s.log.Debugw("received websocket message", logp.Namespace("websocket"), "msg", string(message))
err = s.process(ctx, state, s.cursor, s.now().In(time.UTC))
if err != nil {
	s.metrics.errorsTotal.Inc()
	s.log.Errorw("failed to process and publish data", "error", err)
	return err
}
Note also that the Debugw call with the message allocates unconditionally due to the string conversion. This can be avoided by providing a lazy fmt.Stringer.
I'll keep the fmt.Stringer change for a later cleanup PR then.
The type is
type bytesStringer []byte
func (b bytesStringer) String() string { return string(b) }
with
s.log.Debugw("received websocket message", logp.Namespace("websocket"), "msg", bytesStringer(message))
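To see why the wrapper helps, here is a small standalone sketch. The `levelLogger` type is a hypothetical stand-in for a leveled logger and is not the logp API; the assumption is that the real logger likewise only calls `String` once it has decided to emit the entry. Under that assumption the message bytes are copied into a string only when debug logging is enabled.

```go
package main

import "fmt"

// bytesStringer defers the []byte to string conversion until String is called.
type bytesStringer []byte

func (b bytesStringer) String() string { return string(b) }

// levelLogger is a hypothetical leveled logger used only for illustration.
type levelLogger struct {
	debug bool
	lines []string
}

// Debugw formats its value only when debug logging is enabled.
func (l *levelLogger) Debugw(msg, key string, val fmt.Stringer) {
	if !l.debug {
		return // val.String() is never called, so the bytes are not copied
	}
	l.lines = append(l.lines, fmt.Sprintf("%s %s=%s", msg, key, val))
}

func main() {
	message := []byte(`{"event":"ping"}`)

	quiet := &levelLogger{debug: false}
	quiet.Debugw("received websocket message", "msg", bytesStringer(message))

	verbose := &levelLogger{debug: true}
	verbose.Debugw("received websocket message", "msg", bytesStringer(message))

	fmt.Println(len(quiet.lines), len(verbose.lines))
}
```

By contrast, passing `string(message)` performs the conversion at the call site, before the logger has had a chance to check its level.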
@efd6, updated the PR
…r empty websocket messages (#42036) (#42048) * Fix for streaming input handling of invalid or empty websocket messages (cherry picked from commit d508a40) Co-authored-by: ShourieG <[email protected]>
…ut handling of invalid or empty websocket messages (#42049) * [filebeat][streaming] - Fix for streaming input handling of invalid or empty websocket messages (#42036) * Fix for streaming input handling of invalid or empty websocket messages (cherry picked from commit d508a40) * Update CHANGELOG.next.asciidoc --------- Co-authored-by: ShourieG <[email protected]>
…ut handling of invalid or empty websocket messages (#42047)
Type of change
Proposed commit message
Fix for streaming input handling of invalid or empty websocket messages.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Disruptive User Impact
None