-
Notifications
You must be signed in to change notification settings - Fork 130
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -26,6 +26,7 @@ | |||
private static final int EXPECTED_DEFAULT_EPOCH_LENGTH = 30_000; | |||
private static final int EXPECTED_DEFAULT_BLOCK_PERIOD = 1; | |||
private static final int EXPECTED_DEFAULT_REQUEST_TIMEOUT = 1; | |||
private static final int EXPECTED_DEFAULT_MESSAGE_BUFFER_SIZE = 10_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these oddly named? I.e. what does "EXPECTED_" indicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are bit oddly named I guess. The "EXPECTED_" indicates this expected values for the assertion. These values are used to assert the default values are correct in the tests. It's existing test code so I just followed the existing pattern.
@@ -158,7 +159,8 @@ public TestContext build() { | |||
// Use a stubbed version of the multicaster, to prevent creating PeerConnections etc. | |||
final StubValidatorMulticaster multicaster = new StubValidatorMulticaster(); | |||
|
|||
final UniqueMessageMulticaster uniqueMulticaster = new UniqueMessageMulticaster(multicaster); | |||
final UniqueMessageMulticaster uniqueMulticaster = | |||
new UniqueMessageMulticaster(multicaster, MESSAGE_BUFFER_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should Multicaster be updated as part of this change? Or is that a separate problem?
Is the size of the multicaster buffer related to the incoming event queue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The multicaster should be updated to have a limited size buffer. Would be better to have a seperate config option though.
@@ -64,7 +64,7 @@ | |||
private final BlockTimer blockTimer; | |||
private final IbftMessageTransmitter transmitter; | |||
private final MessageFactory messageFactory; | |||
private final Map<Integer, RoundState> futureRoundStateBuffer = Maps.newHashMap(); | |||
private final Map<Integer, RoundState> futureRoundStateBuffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this related to the incoming event queue size?
@@ -39,4 +41,12 @@ public int getBlockPeriodSeconds() { | |||
public int getRequestTimeoutSeconds() { | |||
return ibftConfigRoot.getInteger("requesttimeoutseconds", DEFAULT_ROUND_EXPIRY_SECONDS); | |||
} | |||
|
|||
public int getSeenMessageQueueSize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/getSeenMessageQueueSize/getSeenMessageQueueLimit
QueueSize could be mistaken as being the current queue size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like limit better. Still don't like these config names though regardless of size or limit suffix. So if have any suggestions...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SeenMessages
is less obvious, I'll have to think on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use size
when you mean limit
. Consider common use of size
in places such as the Collection classes
Intent was to be the max size |
return ibftConfigRoot.getInteger("seenmessagequeuesize", DEFAULT_SEEN_MESSAGES_QUEUE_SIZE); | ||
} | ||
|
||
public int getEventQueueSize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more appropriate name for this variable would be messageQueueLimit
, as it is pretty much the standard use for a MQ. This should make sense / easy inference by enterprise users of the product.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. changed eventQueueSize -> messageQueueLimit and seenMessageQueueSize -> gossipedHistoryLimit
@@ -21,6 +21,8 @@ | |||
private static final long DEFAULT_EPOCH_LENGTH = 30_000; | |||
private static final int DEFAULT_BLOCK_PERIOD_SECONDS = 1; | |||
private static final int DEFAULT_ROUND_EXPIRY_SECONDS = 1; | |||
private static final int DEFAULT_GOSSIPED_HISTORY_LIMIT = 10_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this FEELS like a huge number ... in a good system, should this not be about ... 100?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10,000 is probably large than it needs to be. It will however only take a small amount of memory as we are only storing the hashes. So I don't think we need to be too exact and can be a little generous.
Perhaps 1000 is a more appropriate default? I think this would handle to tracking the message hashes of few rounds. And a size of 100 would only just cover the total number of messages for the current round if we going to use 20 validators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now this may be disingenuous - in an ideal situation we only need to buffer enough to prevent re-gossiping gossiped packets.
Which probably means the last 'set' of packets i.e. once we've received a few Prepares, we SHOULDN'T receive any more gossiped proposals (and prior round msgs are discarded out of hand) ... 100 may be plenty....
Happy to go as high as 1000 to handle spurious behaviours from peers .... but I think we should really understand how this works in both normal and byzantine circumstances before we throw in a safe number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth thinking about if there's some metrics you could capture to help find the right value for this setting (even if you need to add some expensive ones temporarily). ie every time you find a message you've already gossiped, record how many messages are recorded after it in your history. That will likely have to be a temporary thing because you'll need to use a less efficient structure to hold the history (worst case a list you iterate from most recent to least recent), but it would give you some concrete data to drive decisions from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had some further discussion and settled on 1000 for now as this will cover the current height with several rounds.
Would be great to get some metrics on this. And would also allow users to tune this setting if they wanted to.
private static final Logger LOG = LogManager.getLogger(); | ||
private final int maxQueueSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened to using queueLimit
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still there. I missed updating the existing field name in the IbftEventQueue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to make it consistent with the new config names
PR description
Add a hard cap to the number unprocessed messages in the event queue and the seen messages tracked.
Fixed Issue(s)