-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for snappy compression in receivers #5575
Conversation
a84c5a3
to
8417002
Compare
Distributing series between receivers is currently done by sending uncompressed payloads which can lead to high inter-zone egress costs. This commit adds support for using snappy compression for sending data from one receiver to another. Signed-off-by: Filip Petkovski <[email protected]>
8417002
to
3c954c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks for noticing this! 🚀
A tiny suggestion.
cmd/thanos/receive.go
Outdated
@@ -859,6 +868,9 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) { | |||
|
|||
cmd.Flag("receive.replica-header", "HTTP header specifying the replica number of a write request.").Default(receive.DefaultReplicaHeader).StringVar(&rc.replicaHeader) | |||
|
|||
compressionOptions := strings.Join([]string{compressionNone, snappy.Name}, ", ") | |||
cmd.Flag("receive.grpc-compression", "Compression algorithm to use for gRPC requests to other receivers. Must be one of: "+compressionOptions).Default(compressionNone).EnumVar(&rc.compression, compressionNone, snappy.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should enable it by default here? 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree. Enabling it by default makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to suggest that for the v1 release, but even now it would make sense given the favourable tradeoff. Updated to default 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker but maybe we could use s2's (https://github.com/klauspost/compress) implementation of snappy encoding/decoding from the get go? 🤔
Makes sense. How should we go about doing this? We can:
|
I was thinking that (3) should be straightforward since those two libraries have identical/very similar APIs? 🤔 |
394c12b
to
212a5c7
Compare
Signed-off-by: Filip Petkovski <[email protected]>
212a5c7
to
2672762
Compare
d88affc
to
1c163cd
Compare
Got it. We should have a different implementation now with the different library. I suspect the test failure is a flake since I can't reproduce it locally. |
Signed-off-by: Filip Petkovski <[email protected]>
1c163cd
to
0226d44
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, it would be really cool to see if there's any difference in graphs @fpetkovski if you have any free time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think let's move forward with the current version based on https://github.com/klauspost/compress and then if needed in the future we can always switch back
Just additional note on this, since we were scratching our heads a bit until we figured it out - making this default markedly increased our memory consumption, it was noticeable especially on bigger instances (normally using ~35 GB memory) we've noticed that decompressing requests takes at least additional 2-3 GB per replica, so there is a trade off here. It took us a bit until we realized this is on by default in |
Distributing series between receivers is currently done by sending
uncompressed payloads which can lead to high inter-zone egress costs.
This commit adds support for using snappy compression for sending data
from one receiver to another.
Signed-off-by: Filip Petkovski [email protected]
Fixes #5572
Changes
Add support for snappy compression in receivers.
Verification
I rolled this out in one of our environments and saw similar results as the ones reported in the Cortex PR
Bandwidth across all receivers (
sum(rate(container_network_transmit_bytes_total))
) went from 300MiB/s to 130MiB/sCPU usage did go up by a few percent as expected
Results for a single receiver after rolling out snappy compression