Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Slow dispute coordinator #5359

Closed
eskimor opened this issue Apr 21, 2022 · 8 comments
Closed

Slow dispute coordinator #5359

eskimor opened this issue Apr 21, 2022 · 8 comments
Labels
T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes.

Comments

@eskimor
Copy link
Member

eskimor commented Apr 21, 2022

As can be seen on ToF boards, the dispute coordinator is quite slow (>5s for handling some messages).

Investigation:

Amount of messages

On versi we do have a message rate in the ballpark of 60 to <100/s.

Interestingly, we handle them at approximately the same rate, which results in the channel almost never getting filled, better shown here.

So where do those long ToFs come from? Can they even be correct?

@sandreim
Copy link
Contributor

Interestingly, we handle them at approximately the same rate, which results in the channel almost never getting filled, better shown here.

I think we can't be really sure how often it gets fully filled, as the metrics are sampled once every 30s. The rate you mention is averaged over 1 hour, so it can really be bursty and we just never see that because of the same polling interval .

If we zoom in we can see it can go to 150/s (2m average).
Screenshot 2022-04-21 at 18 52 22

@eskimor
Copy link
Member Author

eskimor commented Apr 21, 2022

Yeah, I checked shorter polling intervals as well, in any case receives seem to keep up with sends, which is interesting given the rate and the ToF - adding some metrics.

@eskimor
Copy link
Member Author

eskimor commented Apr 25, 2022

Some performance debugging: #4793

@eskimor
Copy link
Member Author

eskimor commented Apr 25, 2022

Summary: db access is most likely the culprit, both read & write.

@eskimor
Copy link
Member Author

eskimor commented May 6, 2022

One strategy I was thinking of, was batch vote imports together. Especially in approval voting there are many votes for the same candidate. If we collect votes per candidate before doing the actual import for those collected votes, instead of each one individually, that could speed things up by a factor (almost 30).

@eskimor
Copy link
Member Author

eskimor commented May 7, 2022

Longer term, we should also see whether this preemptive vote import in the current form is really necessary. A node could for example only keep track of its own votes in the normal case and only distribute them to other nodes via dispute-distribution in case of an actual dispute. Like instead of doing nothing if already voted in backing/approval - just send your backing/approval vote and participate in dispute-distribution.

We might or might not keep the vote import from chain (chain scraping) as this should be rather efficient.

I specifically marked this longer term, as this is no change to be done lightly as there are some details and security implications that must be taken into account, but at the same time it would be nice if the disputes system won't cause any noticeable load if there are no disputes happening and definitely it should not be an obstacle in scaling.

@burdges
Copy link
Contributor

burdges commented May 7, 2022

I've forgotten how the pre-emptive import works, so can you remind me?

Are you sure batching votes together speed them up? It'd save 40% of the signature verification time if we pre-batch enough Schnorr signatures, so 45 μs becomes 30 μs, but not sure what the extra complexity costs.

We cannot afaik pre-batch disputes, which may or may not matter, but even batching valid votes costs latency, and maybe adds no-shows.

@eskimor
Copy link
Member Author

eskimor commented May 7, 2022

I've forgotten how the pre-emptive import works, so can you remind me?

Both backing and approval voting are sending all votes to the dispute coordinator all the time, so it can keep track of them on disk. This way it has backing and approval votes ready when an actual dispute happens for the candidate. This is quite costly, considering that most candidates will never get disputed.

Are you sure batching votes together speed them up? It'd save 40% of the signature verification time if we pre-batch enough Schnorr signatures, so 45 μs becomes 30 μs, but not sure what the extra complexity costs.

Yes batching up will be a noticeable speedup, but it is another kind of batching than you have in mind. Right now when importing votes in the dispute coordinator (which does not involve any signature checking) we are reading all existing votes, adding the one imported vote and then writing all the votes again - this is quite costly. If we batch up, it would be like: Read all votes, add 30, write back.

We cannot afaik pre-batch disputes, which may or may not matter, but even batching valid votes costs latency, and maybe adds no-shows.

The batching only concerns the import in the dispute coordinator, so it would have no effect on no-shows.

@eskimor eskimor closed this as completed Aug 16, 2022
@ordian ordian moved this to Progress in Parachains-core Aug 16, 2022
@ordian ordian added the T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. label Aug 16, 2022
@ordian ordian moved this from In progress to Done in Parachains-core Aug 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes.
Projects
No open projects
Status: Done
Development

No branches or pull requests

4 participants