Summary
All messages from a peer are processed by a single receiver thread. These include consensus messages, mempool tx messages, evidence messages and blockchain messages. Among them, it is especially important to receive proposal messages (a kind of consensus message) quickly, but if the messages queued ahead of a proposal take a long time to process, the proposal is received late. Messages that do not depend on each other should be processed asynchronously in separate threads.
Problem Definition
In performance tests with 100 validators, round failures (progressing to the next round after a consensus failure) were identified as the first performance bottleneck. One cause of a round failure is that some validator receives the proposal too late. A validator that enters the new round late may naturally receive the proposal late, but we found validators receiving the proposal late even though they had entered the new round early. In particular, we observed that a validator did not receive the proposal immediately after a peer had sent it, but several seconds later.
I spent several days investigating this delay of several seconds between sending and receiving the proposal and found that it is caused by the way the message receive routine works.
The receive routine is defined in MConnection.recvRoutine(). All messages from a peer are processed in this infinite loop.
Each message has a channel ID and is dispatched to the corresponding reactor according to that channel ID. All of this happens in one thread.
For example, if there are hundreds of tx messages in the receive buffer followed by a proposal message, the tx messages in front must be processed before the proposal can be read. The problem is that the mempool reactor must acquire a lock to handle a tx message and can wait hundreds of milliseconds for that lock, so processing the tx messages may take a long time.
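The effect described above can be reproduced with a minimal sketch. This is not the actual MConnection code: the msg type, the channel ID values, and the serialProposalDelay function are hypothetical names made up for this illustration, and a sleep stands in for the time the mempool reactor spends waiting for its lock.

```go
package main

import (
	"fmt"
	"time"
)

// msg is a simplified stand-in for a message read from a peer connection.
type msg struct {
	chID    byte // channel ID that selects the reactor
	payload string
}

// Hypothetical channel IDs, chosen only for this illustration.
const (
	mempoolCh   byte = 1
	consensusCh byte = 2
)

// serialProposalDelay queues nTx mempool messages followed by one proposal
// and processes them in a single loop, the way one receiver thread would.
// It returns how long the proposal waited before its handler ran.
func serialProposalDelay(nTx int, txCost time.Duration) time.Duration {
	buf := make([]msg, 0, nTx+1)
	for i := 0; i < nTx; i++ {
		buf = append(buf, msg{chID: mempoolCh, payload: "tx"})
	}
	buf = append(buf, msg{chID: consensusCh, payload: "proposal"})

	var delay time.Duration
	start := time.Now()
	handlers := map[byte]func(msg){
		// Assumption: the sleep models waiting for the mempool lock.
		mempoolCh: func(msg) { time.Sleep(txCost) },
		// The consensus handler only records when the proposal was seen.
		consensusCh: func(msg) { delay = time.Since(start) },
	}

	// Single receive loop: each handler must return before the next
	// message is looked at, so slow tx handling delays the proposal.
	for _, m := range buf {
		handlers[m.chID](m)
	}
	return delay
}

func main() {
	d := serialProposalDelay(100, 2*time.Millisecond)
	fmt.Println("proposal handled after", d)
}
```

In this sketch the proposal's delay grows linearly with the number of tx messages ahead of it, even though the proposal handler itself does almost no work.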
To fix this, I propose that each reactor process its messages asynchronously in a separate thread.
The receiver thread would only read each message and put it into a channel owned by the target reactor. Each reactor would run its own receive routine that reads from its channel and processes the messages.
This requires four more threads per peer, but I don't think that will be a big problem because those threads are idle when messages are infrequent.
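A minimal sketch of this design, assuming goroutines and buffered Go channels as the per-reactor queues. The names reactorQueue, startReactors, dispatch, and run, the channel ID values, and the buffer sizes are all hypothetical and not taken from the existing code. As in the problem description, a sleep stands in for the mempool lock wait.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// msg is a simplified stand-in for a message read from a peer connection.
type msg struct {
	chID    byte
	payload string
}

// Hypothetical channel IDs, chosen only for this illustration.
const (
	mempoolCh   byte = 1
	consensusCh byte = 2
)

// reactorQueue pairs a buffered Go channel with the handler that drains it.
type reactorQueue struct {
	in     chan msg
	handle func(msg)
}

// startReactors launches one receive goroutine per reactor. The returned
// WaitGroup completes once every queue has been closed and drained.
func startReactors(queues map[byte]*reactorQueue) *sync.WaitGroup {
	var wg sync.WaitGroup
	for _, q := range queues {
		wg.Add(1)
		go func(q *reactorQueue) {
			defer wg.Done()
			for m := range q.in {
				q.handle(m)
			}
		}(q)
	}
	return &wg
}

// dispatch is the receiver thread's job in this design: it only enqueues
// each message for its reactor and never runs a handler itself.
func dispatch(buf []msg, queues map[byte]*reactorQueue) {
	for _, m := range buf {
		queues[m.chID].in <- m
	}
	for _, q := range queues {
		close(q.in)
	}
}

// run sends nTx slow tx messages followed by one proposal and reports how
// many txs had been handled when the proposal was handled, and in total.
func run(nTx int, txCost time.Duration) (before, total int64) {
	var txDone int64
	seen := int64(-1)
	queues := map[byte]*reactorQueue{
		mempoolCh: {in: make(chan msg, nTx), handle: func(msg) {
			time.Sleep(txCost) // assumption: models the mempool lock wait
			atomic.AddInt64(&txDone, 1)
		}},
		consensusCh: {in: make(chan msg, 1), handle: func(msg) {
			atomic.StoreInt64(&seen, atomic.LoadInt64(&txDone))
		}},
	}
	wg := startReactors(queues)
	buf := make([]msg, 0, nTx+1)
	for i := 0; i < nTx; i++ {
		buf = append(buf, msg{chID: mempoolCh, payload: "tx"})
	}
	buf = append(buf, msg{chID: consensusCh, payload: "proposal"})
	dispatch(buf, queues)
	wg.Wait()
	return atomic.LoadInt64(&seen), atomic.LoadInt64(&txDone)
}

func main() {
	before, total := run(20, 10*time.Millisecond)
	fmt.Printf("proposal handled after %d of %d txs\n", before, total)
}
```

One design point to note: if a reactor's queue fills up, the send in dispatch blocks and the receiver stalls again, so the queue capacity (or a policy for full queues) would need to be chosen deliberately.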
Proposal
Create a receive routine thread for each reactor so that messages for different reactors are processed asynchronously.
For Admin Use
Not duplicate issue
Appropriate labels applied
Appropriate contributors tagged
Contributor assigned/self-assigned