Improve decompression within ParseProtoReader. #3682
This uses the latest weaveworks/common, which includes a fix for the content-length to be set correctly, allowing buffers to be allocated up front instead of letting them grow naturally. I've also implemented an alternative decompression path from a `bytes.Buffer`, which avoids allocating and reading into a new buffer when using httpgrpc, since the underlying reader is already a `bytes.Buffer`. I also took the liberty to improve the validation when using RawSnappy: I'm checking the length using the snappy header, which avoids wasting CPU when the decompressed size is too big, since we know it in advance. I've also added missing tests that cover both the old and new code. I'm planning to use this in Loki, but it should also benefit Cortex, since this method is used within the push and remote_read code.

Signed-off-by: Cyril Tovena <[email protected]>
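The snappy format's header encodes the decompressed length, so an oversized request can be rejected before any decoding work. A minimal illustrative sketch, assuming the github.com/golang/snappy package; the helper name and limit here are hypothetical, not the PR's actual code:

```go
package util

import (
	"fmt"

	"github.com/golang/snappy"
)

// maxSize stands in for the configured message size limit (hypothetical value).
const maxSize = 10 * 1024 * 1024

// decodeBounded rejects a RawSnappy payload whose header already announces
// a decompressed size over the limit, before spending CPU on decoding.
func decodeBounded(compressed []byte) ([]byte, error) {
	size, err := snappy.DecodedLen(compressed)
	if err != nil {
		return nil, err
	}
	if size > maxSize {
		return nil, fmt.Errorf("message expected size larger than max (%d vs %d)", size, maxSize)
	}
	return snappy.Decode(nil, compressed)
}
```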
Should note weaveworks/common#204 (Allow specifying JAEGER_ENDPOINT) in the changelog. Do you think we could drop framed Snappy? It was only used in a small range of Prometheus clients, a long time ago.
We don't use it in Loki, and it does make the code more complex, so I'm in favor of this if you want.
I also realized, while writing tests, that FramedSnappy weighs more.
This was where framed Snappy was removed, before Prometheus 1.7: prometheus/prometheus#2696
I wonder if this is fine to remove? I guess we won't support 1.7 then? Is this fine? I'm not totally sure. @gouthamve @pracucci, any chance you might have a reason to keep this? Do we expect all users to be above Prometheus 1.7?
Yesterday we had a brief discussion in this Slack thread, with an LGTM from me and Goutham. As Bryan said:
As long as the CHANGELOG mentions we're dropping support for Prometheus 1.7, LGTM.
I would have preferred another PR, but oh well, too late.
Not clear why it's too late. I don't mind it being another PR.
Good job! The changes make a lot of sense to me. I left a few comments, but overall LGTM 👏
pkg/util/http.go (Outdated)
```go
sp := opentracing.SpanFromContext(ctx)
if sp != nil {
	sp.LogFields(otlog.String("event", "util.ParseProtoRequest[start reading]"))
	defer func() {
		sp.LogFields(otlog.String("event", "util.ParseProtoRequest[unmarshal]"),
```
I would move this to ParseProtoReader() for clarity.
I actually moved it here for clarity, but I still applied your feedback; it's more or less the same, since you need to push the span down into all code branches. My first implementation only started using the span in decompressRequest.
Review feedback from @pracucci. Signed-off-by: Cyril Tovena <[email protected]>
Good job, LGTM!
```go
	_, err = buf.ReadFrom(io.LimitReader(reader, int64(maxSize)+1))
	body = buf.Bytes()
case RawSnappy:
	_, err = buf.ReadFrom(reader)
```
Is this a possible DoS attack? (I see it is what the old code did.) Seems we could use the same ReadFrom in either case, then check compression after.
Could fix it up later; I don't mind merging this as-is.
Yeah, I see the potential issue; not sure if there's a better option than using a LimitReader in both cases? Decoding the length still leaves us open to hijacked/fake requests. I made the change, let me know what you think.
I think using LimitReader in all paths is correct, if the point is to stop someone blowing up the process. Is 'result bigger than max' actually detected in the NoCompression case now? I suggest just having one ReadFrom call, then checking the len, before we get into a routine named decompress.
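A sketch of that suggestion (helper name hypothetical, not code from this PR): one bounded ReadFrom for every compression type, then a single size check before any decompression happens.

```go
package util

import (
	"bytes"
	"fmt"
	"io"
)

// readBounded reads at most maxSize+1 bytes through io.LimitReader, so a
// reader that exceeds the limit is caught by a single length check and can
// never blow up memory, whatever the compression type turns out to be.
func readBounded(reader io.Reader, maxSize int) ([]byte, error) {
	var buf bytes.Buffer
	if _, err := buf.ReadFrom(io.LimitReader(reader, int64(maxSize)+1)); err != nil {
		return nil, err
	}
	if buf.Len() > maxSize {
		return nil, fmt.Errorf("received message larger than max (%d vs %d)", buf.Len(), maxSize)
	}
	return buf.Bytes(), nil
}
```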
Yes, because we read a bit more; the defer in decompressRequest will return the error.
Thanks!
LGTM.
Nit: Both decompressFromReader and decompressFromBuffer could benefit from using early returns.
pkg/util/http.go (Outdated)
```go
func decompressRequest(ctx context.Context, reader io.Reader, expectedSize, maxSize int, compression CompressionType, sp opentracing.Span) (body []byte, err error) {
```
Nit: ctx is unused, let's remove it.
Yep, this was leftover from the previous span extraction code.
```go
		err = fmt.Errorf(messageSizeLargerErrFmt, len(body), maxSize)
	}
}()
if expectedSize > maxSize {
```
Nit: there is no need to use defer in this function; it just makes the code trickier to follow.
```go
	return nil, fmt.Errorf(messageSizeLargerErrFmt, expectedSize, maxSize)
}
buffer, ok := tryBufferFromReader(reader)
if ok {
```
I think it would be better to check for a non-nil buffer. It would be slightly more robust, as a reader implementing interface{ BytesBuffer() *bytes.Buffer } can still return a nil buffer, which currently leads to a panic.
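A sketch of how that guard could look; the helper name follows the PR's tryBufferFromReader, but this is an illustration of the suggestion, not the merged code:

```go
package util

import (
	"bytes"
	"io"
)

// tryBufferFromReader returns the underlying *bytes.Buffer when the reader
// exposes one, guarding against implementations whose BytesBuffer method
// returns a nil buffer.
func tryBufferFromReader(reader io.Reader) (*bytes.Buffer, bool) {
	if bufReader, ok := reader.(interface{ BytesBuffer() *bytes.Buffer }); ok {
		if buf := bufReader.BytesBuffer(); buf != nil {
			return buf, true
		}
	}
	return nil, false
}
```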
pkg/util/http.go (Outdated)

```go
func decompressRequest(ctx context.Context, reader io.Reader, expectedSize, maxSize int, compression CompressionType, sp opentracing.Span) (body []byte, err error) {
	defer func() {
		if len(body) > maxSize {
```
Nit: this will overwrite an existing err, if one was already set. I can see some code paths where it can happen, but it's probably not important enough to fix.
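A minimal sketch of what this nit suggests (names and the format string are assumed from the diff above, not taken from the merged code): set err in the deferred check only when no earlier error exists, so it cannot be clobbered.

```go
package util

import (
	"bytes"
	"fmt"
	"io"
)

// Assumed format string, mirroring the one used in the diff above.
const messageSizeLargerErrFmt = "received message larger than max (%d vs %d)"

// boundedRead shows the deferred size check rewritten so that it cannot
// overwrite an error already set on an earlier code path.
func boundedRead(reader io.Reader, maxSize int) (body []byte, err error) {
	defer func() {
		// Only report "too large" when no other error is pending.
		if err == nil && len(body) > maxSize {
			err = fmt.Errorf(messageSizeLargerErrFmt, len(body), maxSize)
		}
	}()
	var buf bytes.Buffer
	_, err = buf.ReadFrom(io.LimitReader(reader, int64(maxSize)+1))
	body = buf.Bytes()
	return body, err
}
```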
pkg/util/http.go (Outdated)

```go
// Read from LimitReader with limit max+1. So if the underlying
// reader is over limit, the result will be bigger than max.
```
Move this comment above, where io.LimitReader is used, please.