
P2P: Prioritize blocks, votes, and all messages over trxs #1090

Merged · 11 commits into main · Jan 3, 2025

Conversation

heifner
Member

@heifner heifner commented Jan 2, 2025

  • Send all messages, including blocks and votes, ahead of transactions. This prevents a large trx, or a large number of trxs, from delaying block or vote propagation.
  • Includes a fix so that buffer_queue.out_callback() is called before clear_out_queue(). This was a long-standing bug: the callbacks were never invoked because clear_out_queue() removed them first.
  • Do not post async_write to the connection strand, as the call is already on the connection strand.
  • Move serialization of votes off the main thread.
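The priority scheme in the first bullet can be sketched as three FIFO queues drained in order, so blocks, votes, and other messages are never stuck behind a backlog of transactions. This is an illustrative standalone sketch; the type and member names are hypothetical, not the actual net_plugin types.

```cpp
#include <cstdint>
#include <deque>
#include <string>

// Hypothetical message classes, ordered by send priority.
enum class msg_class : uint32_t { sync, general, trx };

class write_queues {
   std::deque<std::string> sync_q;  // sync blocks: sent first
   std::deque<std::string> gen_q;   // blocks, votes, all other messages
   std::deque<std::string> trx_q;   // packed transactions: sent last
public:
   void add(msg_class c, std::string payload) {
      switch (c) {
         case msg_class::sync:    sync_q.push_back(std::move(payload)); break;
         case msg_class::general: gen_q.push_back(std::move(payload));  break;
         case msg_class::trx:     trx_q.push_back(std::move(payload));  break;
      }
   }
   // Pop the next message to write; higher-priority queues drain first,
   // so transactions go out only when no block or vote is pending.
   bool next(std::string& out) {
      for (auto* q : {&sync_q, &gen_q, &trx_q}) {
         if (!q->empty()) { out = std::move(q->front()); q->pop_front(); return true; }
      }
      return false;
   }
};
```

Even if transactions were enqueued first, a later block or vote is written before them.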

Resolves #1071

@heifner heifner requested review from greg7mdp and linh2931 January 2, 2025 15:00
@heifner heifner added the OCI Work exclusive to OCI team label Jan 2, 2025
Contributor

@greg7mdp greg7mdp left a comment

Move serialization of votes off the main thread.

Where is that done?

@@ -692,7 +698,7 @@ namespace eosio {
mutable fc::mutex _mtx;
uint32_t _write_queue_size GUARDED_BY(_mtx) {0};
deque<queued_write> _write_queue GUARDED_BY(_mtx);
deque<queued_write> _sync_write_queue GUARDED_BY(_mtx); // sync_write_queue will be sent first
deque<queued_write> _trx_write_queue GUARDED_BY(_mtx); // trx_write_queue will be sent last
Contributor

indentation

Member Author

@@ -638,10 +645,10 @@ namespace eosio {
       // @param callback must not callback into queued_buffer
       bool add_write_queue( const std::shared_ptr<vector<char>>& buff,
                             std::function<void( boost::system::error_code, std::size_t )> callback,
-                            bool to_sync_queue ) {
+                            uint32_t net_message_which ) {
Contributor

@greg7mdp greg7mdp Jan 2, 2025


I feel this uint32_t net_message_which could be replaced with msg_type_t msg_type, where msg_type_t is defined as:

   enum class msg_type_t : uint32_t {
      signed_block       = fc::get_index<net_message, signed_block>(),
      packed_transaction = fc::get_index<net_message, packed_transaction>(),
      vote_message       = fc::get_index<net_message, vote_message>()
   };

Member Author

Comment on lines 1574 to 1575
boost::asio::async_write( *c->socket, bufs,
boost::asio::bind_executor( c->strand, [c, socket=c->socket]( boost::system::error_code ec, std::size_t w ) {
Contributor

I think it could be slightly simplified to:

Suggested change
-            boost::asio::async_write( *c->socket, bufs,
-               boost::asio::bind_executor( c->strand, [c, socket=c->socket]( boost::system::error_code ec, std::size_t w ) {
+            boost::asio::async_write( *socket, bufs,
+               boost::asio::bind_executor( strand, [c, socket]( boost::system::error_code ec, std::size_t w ) {

Member Author

_write_queue_size = 0;
}

-      void clear_out_queue() {
+      void clear_out_queue(boost::system::error_code ec, std::size_t w) {
Member

What does parameter w mean?

Member Author

Replaced with number_of_bytes_written

Member Author

@@ -3928,15 +3936,18 @@ namespace eosio {
}

    void net_plugin_impl::bcast_vote_message( uint32_t exclude_peer, const chain::vote_message_ptr& msg ) {
-      buffer_factory buff_factory;
-      auto send_buffer = buff_factory.get_send_buffer( *msg );
Member Author

Move the serialization of vote_message off of main thread and move it down below into the net thread pool.

-      my_impl->dispatcher.bcast_vote_msg( exclude_peer, std::move(msg) );
+      boost::asio::post( my_impl->thread_pool.get_executor(), [exclude_peer, msg]() mutable {
+         buffer_factory buff_factory;
+         auto send_buffer = buff_factory.get_send_buffer( *msg );
Contributor

@greg7mdp greg7mdp Jan 2, 2025


Accesses the shared_ptr send_buffer in buffer_factory from multiple threads without synchronization. It is not even clear to me what this variable is for.

Never mind, I see that the buffer_factory is recreated each time.

Contributor

I'm not really seeing why we have the caching in buffer_factory (and pass a buffer_factory to the posted lambdas) rather than just passing the serialized buffer to the posted lambdas.

Member Author

In this particular case the vote_message_ptr is passed to the lambda so it can be serialized in the net thread pool instead of on the calling thread. The buffer_factory caching is not being used for votes; it just provides a convenient interface.
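The pattern being described, capturing the shared pointer and serializing inside the posted task, can be sketched with the standard library alone. Here std::async stands in for posting to the net thread pool, and the message type and serializer are hypothetical stand-ins, not the actual net_plugin code.

```cpp
#include <cstdint>
#include <future>
#include <memory>
#include <string>
#include <vector>

// Hypothetical vote type and serializer (stand-in for buffer_factory).
struct vote_message { std::string producer; uint32_t slot; };
using vote_message_ptr = std::shared_ptr<const vote_message>;

std::vector<char> serialize(const vote_message& msg) {
   std::string s = msg.producer + ":" + std::to_string(msg.slot);
   return {s.begin(), s.end()};
}

// The caller only pays the cost of copying a shared_ptr; the expensive
// serialization runs on the worker thread inside the task.
std::future<std::vector<char>> bcast_vote(vote_message_ptr msg) {
   return std::async(std::launch::async, [msg]() {
      return serialize(*msg);  // off the calling thread
   });
}
```

The shared_ptr capture keeps the message alive until the worker is done with it, which is the same reason the real code captures msg by value in the posted lambda.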

Contributor

I was looking at dispatch_manager::bcast_block() too. What is the benefit of using the buffer_factory there, instead of just calling get_send_buffer before the call to for_each_block_connection and capturing the send_buffer in the lambda instead of the buffer_factory?

Member Author

For bcast_block the benefit is that the node does not have to serialize the block at all if all peers have already received it.
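The caching benefit described here can be sketched as a factory that serializes at most once, and not at all if no peer ends up needing the buffer. This is an illustrative stand-in with hypothetical names, not the actual buffer_factory.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical block type.
struct signed_block { std::string payload; };

// Stand-in for buffer_factory: serialization is deferred until the first
// peer that actually needs the buffer asks for it, and the cached result
// is reused for every subsequent peer in the same broadcast.
class lazy_buffer_factory {
   std::shared_ptr<std::vector<char>> cached;
public:
   int serialize_count = 0;  // instrumentation for this example only
   const std::shared_ptr<std::vector<char>>& get_send_buffer(const signed_block& b) {
      if (!cached) {
         ++serialize_count;  // expensive serialization happens once
         cached = std::make_shared<std::vector<char>>(b.payload.begin(),
                                                      b.payload.end());
      }
      return cached;
   }
};
```

If the per-connection loop skips every peer (all already have the block), get_send_buffer is never called and the serialization cost is avoided entirely.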

@ericpassmore
Contributor

Note:start
category: System Stability
component: P2P
summary: Update net-plugin to prioritize blocks, votes, and other messages to ensure timely arrival of critical information.
Note:end

@heifner heifner merged commit d89dc9f into main Jan 3, 2025
36 checks passed
@heifner heifner deleted the GH-1071-p2p-prioritize-blocks branch January 3, 2025 12:17
Labels: OCI Work exclusive to OCI team

Successfully merging this pull request may close these issues:
P2P: Prioritize blocks over trxs in connection buffer queue

4 participants