
Protocol message compression support #3287

Closed
wants to merge 1 commit into from


@gregtatcam gregtatcam commented Mar 6, 2020

Peers negotiate compression via the HTTP header X-Offer-Compression: lz4. If the client and the server agree on the compression algorithm (only lz4 is currently supported), protocol messages between the peers are compressed when possible. A message is compressed only if it is larger than 70 bytes and its type is one of MANIFESTS, ENDPOINTS, TRANSACTION, GET_LEDGER, LEDGER_DATA, GET_OBJECT, or VALIDATORLIST. If the compressed message is larger than the uncompressed message, the uncompressed message is sent. The compression flag and the compression algorithm type are included in the payload header.
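The eligibility rule in the description can be sketched roughly as follows. This is an illustration only, not the code in this PR: `MessageType`, `isCompressible`, and `sendCompressed` are hypothetical names, and treating an equal-sized result as "send uncompressed" is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the protocol message types; the enumerators
// mirror the names in the description, the last two are examples of types
// that are not in the list.
enum class MessageType {
    Manifests,
    Endpoints,
    Transaction,
    GetLedger,
    LedgerData,
    GetObject,
    ValidatorList,
    Ping,
    Validation
};

// Threshold taken from the description: only messages larger than 70 bytes
// are candidates for compression.
constexpr std::size_t minCompressibleBytes = 70;

// True if a message of this type and size is a candidate for compression.
inline bool isCompressible(MessageType type, std::size_t messageBytes)
{
    if (messageBytes <= minCompressibleBytes)
        return false;
    switch (type)
    {
        case MessageType::Manifests:
        case MessageType::Endpoints:
        case MessageType::Transaction:
        case MessageType::GetLedger:
        case MessageType::LedgerData:
        case MessageType::GetObject:
        case MessageType::ValidatorList:
            return true;
        default:
            return false;
    }
}

// True if the compressed form should go on the wire. The uncompressed
// message is kept whenever compression did not make it smaller
// (assumption: a tie also keeps the uncompressed message).
inline bool sendCompressed(
    MessageType type,
    std::size_t messageBytes,
    std::size_t compressedBytes)
{
    return isCompressible(type, messageBytes) &&
        compressedBytes < messageBytes;
}
```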


codecov-io commented Mar 6, 2020

Codecov Report

Merging #3287 into develop will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop    #3287   +/-   ##
========================================
  Coverage    70.46%   70.46%           
========================================
  Files          678      678           
  Lines        54462    54462           
========================================
  Hits         38375    38375           
  Misses       16087    16087           


Collaborator

@seelabs seelabs left a comment

Nice job on this! I did a pass and left some comments. Mostly nits.

src/ripple/app/consensus/RCLConsensus.cpp (outdated, resolved)
src/ripple/app/misc/ValidatorList.h (outdated, resolved)
src/ripple/basics/CompressionAlgorithms.h (resolved)
src/ripple/basics/CompressionAlgorithms.h (outdated, resolved)
src/ripple/basics/CompressionAlgorithms.h (outdated, resolved)
src/ripple/overlay/impl/Message.cpp (outdated, resolved)
src/ripple/overlay/impl/Message.cpp (outdated, resolved)
src/ripple/overlay/impl/Message.cpp (outdated, resolved)
src/ripple/overlay/impl/ProtocolMessage.h (resolved)
src/ripple/overlay/impl/ProtocolMessage.h (outdated, resolved)
@gregtatcam gregtatcam requested a review from seelabs March 11, 2020 17:55
src/ripple/app/misc/impl/ValidatorSite.cpp (outdated, resolved)
*/
template<typename BufferFactory>
std::pair<void const*, std::size_t>
lz4fCompress(void const * in,
Contributor

This is cool, but maybe a little too cool and it requires all sorts of magic further down. I think that it's better to avoid all of this and encode the size in an expanded binary header, and use a much simpler implementation here, similar to what the nodestore uses with lz4_compress and lz4_decompress.

Collaborator Author

The problem is that the message is sent as one buffer but received into multiple buffers. If we use lz4_compress (block compression) on send, then on receive the multiple buffers have to be copied into one buffer to pass to lz4_decompress, which is undesirable. lz4 frame compression can compress the whole message and decompress one chunk at a time, so the multiple buffers can be passed in as a zero-copy input stream.

The only issue with lz4 frame decompression is that it may not consume all input bytes from a chunk. In this case the remaining bytes have to be copied and concatenated with the next input chunk. I have not seen this happen in my test on 32 million protocol messages captured from the mainnet, but this case still has to be coded.

On the other hand, the captured sample data also shows that 1) 69% of the messages are received into one buffer; 2) 29% of the messages are received into two buffers; 3) 98% of that 29% have a message size of less than 500 bytes. So the copying might not be a substantial problem after all. Perhaps more data should be captured before we decide which compression mode is best to use: block or frame.
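To make the copying cost discussed above concrete, here is a minimal sketch of what block decompression would need on the receive side: a single contiguous input. `Chunk` and `contiguous` are hypothetical names, not rippled or LZ4 APIs. A message that arrives in one buffer is used in place, and only multi-buffer messages pay for a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of one received buffer; not a rippled or Boost type.
struct Chunk
{
    unsigned char const* data;
    std::size_t size;
};

// Returns a pointer to the whole message as one contiguous range, which is
// what a block decompressor would need as input. A single chunk is returned
// as-is (zero copy); several chunks are concatenated into `scratch`.
inline unsigned char const* contiguous(
    std::vector<Chunk> const& chunks,
    std::vector<unsigned char>& scratch,
    std::size_t& totalBytes)
{
    totalBytes = 0;
    for (auto const& c : chunks)
        totalBytes += c.size;

    if (chunks.empty())
        return nullptr;
    if (chunks.size() == 1)
        return chunks.front().data;

    // Multi-buffer case: this is the copy the comment above wants to avoid.
    scratch.clear();
    scratch.reserve(totalBytes);
    for (auto const& c : chunks)
        scratch.insert(scratch.end(), c.data, c.data + c.size);
    assert(scratch.size() == totalBytes);
    return scratch.data();
}
```

With the sample figures quoted above, most messages would take the single-chunk path, and most of the rest would copy fewer than 500 bytes.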

src/ripple/basics/CompressionAlgorithms.h (resolved)
return false;
switch(type)
{
case protocol::mtMANIFESTS:
Contributor

I understand the intention here (to avoid wasting CPU resources on messages that are unlikely to benefit from compression) but I am concerned that this approach is fragile: anytime we add a message type that can benefit from compression we need to remember to add support for it here.

Also, consider a message like TMTransaction which can (depending on the transaction) be rather large and might benefit from compression.

I don't know that I have a better solution; just calling this out.

Collaborator Author

The majority of the messages are not compressible (88% in the sample). The filtering is desirable because compression uses both CPU and memory (a buffer has to be allocated for the compressed message). Perhaps we could mitigate the maintainability concern by adding a compiler warning when a case is missing in the switch?
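One way the compiler-warning idea could look, as a sketch under assumptions: `MessageType` and `typeMayBenefit` are hypothetical, and the real message type enum has many more values. The switch lists every enumerator and has no `default:` label, so GCC and Clang report an unhandled enumerator through `-Wswitch` (part of `-Wall` in GCC) when a new message type is added.

```cpp
#include <cassert>

// Hypothetical, shortened message type enum.
enum class MessageType { Manifests, Endpoints, Transaction, Ping };

// No `default:` label on purpose: a default would silence -Wswitch, and the
// warning is what reminds us to classify a newly added message type.
inline bool typeMayBenefit(MessageType type)
{
    switch (type)
    {
        case MessageType::Manifests:
        case MessageType::Endpoints:
        case MessageType::Transaction:
            return true;
        case MessageType::Ping:
            return false;
    }
    // Only reachable if `type` holds a value outside the enumerators.
    assert(false && "unhandled MessageType");
    return false;
}
```

The cost of this design is that every non-compressible type has to be listed explicitly as well, and the warning only becomes a hard failure if the build treats it as an error.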

return (mBufferCompressed.data() + headerBytes);
});

double ratio = 1.0 - (double)compressedSize / (double)messageBytes;
Contributor

I am allergic to floating-point stuff so just seeing the keyword makes my spidey sense tingle. I get what you're trying to do here: avoid expansion (and potentially enforce a minimum before forcing the other end to do work) however I think that we are better off not doing any of this.

Instead, I'd recommend using LZ4_compress_default which does some of this for you: if the result of the compressed data would have been larger than the uncompressed data, it returns 0 (i.e. "failed") and not worrying about a minimum compression ratio.

Collaborator Author

LZ4_compress_default may return non-zero and still compress into a larger buffer. I changed it to check whether the compressed size is less than the uncompressed size.
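A sketch of the size check without floating point, assuming a compressed size of 0 means the compressor failed. `worthSending` and `savesAtLeast` are hypothetical helpers, and the second one is only relevant if a minimum saving were ever wanted, which this thread decided against.

```cpp
#include <cassert>
#include <cstddef>

// True if the compressed payload should be sent instead of the original.
// Assumption: compressedSize == 0 signals that compression failed.
inline bool worthSending(std::size_t compressedSize, std::size_t messageBytes)
{
    return compressedSize != 0 && compressedSize < messageBytes;
}

// Optional minimum saving, expressed in whole percent and computed with
// integers only:
//   saved / messageBytes >= minPercent / 100
//   <=> saved * 100 >= messageBytes * minPercent
// Protocol messages are far too small for these products to overflow.
inline bool savesAtLeast(
    std::size_t compressedSize,
    std::size_t messageBytes,
    std::size_t minPercent)
{
    assert(minPercent <= 100);
    if (!worthSending(compressedSize, messageBytes))
        return false;
    return (messageBytes - compressedSize) * 100 >= messageBytes * minPercent;
}
```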

@@ -275,6 +275,8 @@ ConnectAttempt::makeRequest (bool crawl) -> request_type
    m.insert ("Connection", "Upgrade");
    m.insert ("Connect-As", "Peer");
    m.insert ("Crawl", crawl ? "public" : "private");
    if (compressionEnabled)
        m.insert("Accept-Encoding", "lz4");
Contributor

I get why you're using Accept-Encoding but since our handling of this header isn't quite standards-compliant, I'm not sure whether using a different header name is better. For example, X-Offer-Compression (both in the request and the response)?

Collaborator Author

Done.


namespace compression_algorithms {

inline void doThrow(const char *message)
Contributor

This function is invoked from multiple places and I don't know that they're all from within a try/catch block, or that the catch is appropriately scoped. We need to be careful when throwing to ensure that not only will the exception be caught but that continuing after the handler doesn't cause weirdness.

Collaborator Author

Added try/catch in Compression.h module.
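The containment pattern being discussed might look roughly like this. It is a sketch, not the contents of Compression.h: `tryCompress` and the callable it takes are hypothetical. The exception is caught at the boundary, no partial output is left behind, and the caller sees 0 and falls back to the uncompressed message.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Runs `compressor` (any callable taking the input bytes and returning the
// compressed bytes, possibly throwing) and converts a failure into a return
// value of 0, which the caller treats as "send uncompressed".
template <class Compressor>
std::size_t tryCompress(
    std::vector<unsigned char> const& in,
    std::vector<unsigned char>& out,
    Compressor&& compressor)
{
    try
    {
        out = compressor(in);
        return out.size();
    }
    catch (std::exception const&)
    {
        out.clear();  // leave no partial output behind
        return 0;
    }
}
```

Keeping the try block this narrow addresses the scoping concern: nothing after the handler depends on state the failed call may have half-written.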

{
bool res = in.Next(&compressedChunk, &chunkSize);

if (!res && buffer.size() == 0)
Contributor

I think that !buffer.empty() reads better both here and on line 167.

Collaborator Author

updated

if (compressed == Compressed::Off)
return mBuffer;

if (!mCompressedRequested)
Contributor

Minor nit, but given the usage, this is probably better named: alreadyCompressed_.

Also I'm concerned about thread-safety here; I'm not sure it's a problem, but we need to triple-check that it's not. I'd almost feel better if we just did the compression during construction of the message, so that it never became an issue.

Collaborator Author

Changed to alreadyRequested_. The message may or may not be compressed.
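For the thread-safety concern, one possible shape of lazy compression guarded by a mutex is sketched below. `LazyMessage` is hypothetical and its `compress()` is a placeholder that merely keeps half of the bytes; it is not the PR's Message class. The flag is set whether or not compression helps, which matches the point that the message may or may not end up compressed.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical message class; only the locking pattern is the point here.
class LazyMessage
{
public:
    explicit LazyMessage(std::vector<unsigned char> payload)
        : buffer_(std::move(payload))
    {
    }

    // Returns the compressed buffer when requested and available, otherwise
    // the uncompressed one. Compression is attempted at most once, even if
    // several threads ask at the same time.
    std::vector<unsigned char> const& getBuffer(bool wantCompressed)
    {
        if (!wantCompressed)
            return buffer_;

        std::lock_guard<std::mutex> lock(mutex_);
        if (!alreadyRequested_)
        {
            alreadyRequested_ = true;  // set even if compression does not help
            compress();
        }
        return bufferCompressed_.empty() ? buffer_ : bufferCompressed_;
    }

    int compressCalls() const
    {
        return compressCalls_;
    }

private:
    // Placeholder for the real compressor: it keeps half of the bytes so the
    // example has something smaller to return.
    void compress()
    {
        ++compressCalls_;
        if (buffer_.size() > 70)
            bufferCompressed_.assign(
                buffer_.begin(), buffer_.begin() + buffer_.size() / 2);
    }

    std::vector<unsigned char> buffer_;
    std::vector<unsigned char> bufferCompressed_;
    std::mutex mutex_;
    bool alreadyRequested_ = false;
    int compressCalls_ = 0;
};
```

Compressing in the constructor, as suggested above, would remove the lock entirely at the price of compressing messages that are only ever sent to peers without compression.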

@@ -0,0 +1,374 @@
//------------------------------------------------------------------------------
Contributor

👍 for the unit tests here.

src/ripple/overlay/Message.h (outdated, resolved)
src/ripple/overlay/Message.h (outdated, resolved)
src/ripple/overlay/impl/Message.cpp (outdated, resolved)
src/ripple/overlay/impl/ProtocolMessage.h (outdated, resolved)
src/ripple/overlay/Compression.h (outdated, resolved)
src/ripple/overlay/Message.h (outdated, resolved)
src/ripple/overlay/impl/Message.cpp (outdated, resolved)
src/ripple/overlay/Compression.h (outdated, resolved)
@@ -239,6 +239,8 @@ class ValidatorList
@param hashRouter HashRouter object which will determine which
peers not to send to

@param compressionEnabled is true if compression is enabled in rippled.cfg

Contributor

Looks like this documentation got lost.

Collaborator Author

It's no longer needed. Will delete.

@@ -86,6 +86,7 @@ PeerImp::PeerImp (Application& app, id_t id,
, slot_ (slot)
, request_(std::move(request))
, headers_(request_)
, compressionEnabled_(headers_["X-Offer-Compression"] == "lz4" ? Compressed::On : Compressed::Off)
Contributor

Does this need && app_.config().COMPRESSION as was done in the other PeerImp constructor?

Collaborator Author

This is the constructor for a peer making an outbound connection. The headers will not contain X-Offer-Compression if compression is not enabled, so there is no need to check COMPRESSION. The other constructor is for the inbound peer. We are not going to send compressed messages to that peer if compression is not enabled, so we check COMPRESSION.
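The two decisions described in this reply can be sketched as two small functions. Everything here is illustrative: `Headers`, `fromHeaderOnly`, and `fromHeaderAndConfig` are hypothetical names, and a plain `std::map` is a simplification because real HTTP field names are case-insensitive.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class Compressed { On, Off };

// Simplified header container (case-sensitive, unlike real HTTP fields).
using Headers = std::map<std::string, std::string>;

inline bool offersLz4(Headers const& headers)
{
    auto const it = headers.find("X-Offer-Compression");
    return it != headers.end() && it->second == "lz4";
}

// Path where the header alone decides: it is assumed to be present only
// when compression is enabled, so no separate config check is made.
inline Compressed fromHeaderOnly(Headers const& headers)
{
    return offersLz4(headers) ? Compressed::On : Compressed::Off;
}

// Path where the local setting is checked as well: the peer may offer lz4,
// but nothing is compressed unless compression is enabled locally.
inline Compressed fromHeaderAndConfig(
    Headers const& headers,
    bool compressionConfigured)
{
    return (offersLz4(headers) && compressionConfigured) ? Compressed::On
                                                         : Compressed::Off;
}
```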

seelabs previously approved these changes Mar 25, 2020
Collaborator

@seelabs seelabs left a comment

👍 Nice job on this!

Collaborator Author

gregtatcam commented Mar 25, 2020 via email

HowardHinnant previously approved these changes Mar 26, 2020
* Peers negotiate compression via HTTP Header "X-Offer-Compression: lz4"
* Messages greater than 70 bytes and protocol type messages MANIFESTS,
  ENDPOINTS, TRANSACTION, GET_LEDGER, LEDGER_DATA, GET_OBJECT,
  and VALIDATORLIST are compressed
* If the compressed message is larger than the uncompressed message
  then the uncompressed message is sent
* Compression flag and the compression algorithm type are included
  in the message header
* Only LZ4 block compression is currently supported
@carlhua carlhua added the Documentation README changes, code comments, etc. label Mar 27, 2020
Collaborator

@seelabs seelabs left a comment

👍

@HowardHinnant HowardHinnant self-requested a review March 27, 2020 18:58
@carlhua carlhua requested a review from nbougalis April 3, 2020 13:31
@carlhua carlhua added the Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. label Apr 3, 2020
This was referenced Apr 7, 2020
@manojsdoshi manojsdoshi mentioned this pull request Apr 8, 2020
intelliot referenced this pull request Aug 15, 2023
@intelliot
Collaborator

Released in 1.6.0

intelliot pushed a commit that referenced this pull request Oct 6, 2023
P2P link compression is a feature added in 1.6.0 by #3287.

https://xrpl.org/enable-link-compression.html

If the default changes in the future - for example, as currently
proposed by #4387 - the comment will be updated at that time.

Fix #4656
sophiax851 pushed a commit to sophiax851/rippled that referenced this pull request Jun 12, 2024
P2P link compression is a feature added in 1.6.0 by XRPLF#3287.

https://xrpl.org/enable-link-compression.html

If the default changes in the future - for example, as currently
proposed by XRPLF#4387 - the comment will be updated at that time.

Fix XRPLF#4656
Labels
Documentation README changes, code comments, etc. Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required.

7 participants