-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - syncv2: pairwise set reconciliation #6350
Conversation
This implements pairwise set reconciliation protocol based on "Range-Based Set Reconciliation" by Aljoscha Meyer https://arxiv.org/pdf/2212.13567.pdf
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #6350 +/- ##
=========================================
- Coverage 81.7% 81.6% -0.1%
=========================================
Files 313 328 +15
Lines 34972 36317 +1345
=========================================
+ Hits 28573 29652 +1079
- Misses 4557 4742 +185
- Partials 1842 1923 +81 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
Also, add a test that verifies that not adding received items to the set immediately doesn't affect set reconciliation operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @ivan4th. You've spent a great deal of effort on this and it is noticeable. I spent a couple of hours trying to review this and have put in some comments. In general a few issues stand out:
- there's a lot of panics in the code which I'm not sure if they're planned to be fixed or left as is. If left as-is, they could pose as attack surfaces by using crafted messages. Leaving so many of them at this early stage could mean a lot of signature changes and all sorts of edge cases occuring due to (so-far) unhandled errors becoming handled.
- there's a few significant files that have no noteworthy documentation.
- to me, the protocol stands out in its complexity - a p2p protocol that has more than a few messages becomes difficult to reason about in various terms. it would be useful, at least, to understand the different messages, their relationships and possible state transitions. a simple sequence diagram(s) will do.
- while some general concepts of the design of the protocol are described in the
doc.go
file, there's no mentioning of how the different components interact. i believe that this would be useful for the review process, and for posterity - due to complexity and importance of some of the undocumented files, i could not review the whole thing (as i'd just be rubber-stamping this). once these issues are discussed/addressed, i could continue this.
Thanks!
return sb.String() | ||
} | ||
|
||
type sender struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why does a message need to be aware of a sender? In the context of this file, I'd expect it to handle stuff related to messages only - instantiation/(de)serialization, why "mix joy with joy" and blend in stuff related to message sending?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sender
is a convenience message constructor. As every message, once constructed, is immediately sent and nothing else is ever done with it, it's no point in separating send call from that constructor.
Might try to think about better name for it...
I could also remove sender
completely, but that will make the related code somewhat less readable IMO, in particular due to type casting and "compact hash" construction.
The panics used in the PR are assertions.
As far as I understand, that primarily concerns
Will add diagrams as part of a separate
Will try to make things more clear in the comments/doc.
As noted above, I described some of the undocumented code briefly (implementations of documented |
So here I partially agree. If an attacker can craft a message that can make a remote node panic - we have a serious problem (as the entire network can just be shut-down by any half-baked script-kiddy). Now, since I am not particularly familiar with the call-stack depth in which these dependencies will be used, or the likelihood of an attacker-crafted data reaching a point in the code where there's absolutely no chance of the code panicking (simply because it is already checked/type translated in a way which mitigates the risk completely), then consider my comment as retracted. Generally, inconsistent data received from one peer should never cause a node to panic (unless we really screwed up, which sometimes happens - nil pointer exceptions and out of bound exceptions, etc, however, they are almost unlikely and are invoked by the std lib). |
82d74a8
to
2e8e59c
Compare
2e8e59c
to
c22a7c4
Compare
Use black box testing Add mising KeyBytes and Fingerprint tests Document dumbSet and related stuff Document wireConduit
Of course the code should never panic if the user-provided data is invalid, that would be too amateur of an approach ;) |
Protocol description added: https://github.com/spacemeshos/go-spacemesh/blob/syncv2/pairwise/dev-docs/sync2-set-reconciliation.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have a few more comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for your patience :)
bors merge |
## Motivation Existing sync is causing a lot of resource usage, including memory consumption and network traffic, and can't replace gossip which can also be unreliable at times.
bors cancel |
Canceled. |
bf3d6da
to
8c50bd4
Compare
gzipped excalidraw files in the docs before merging to reduce LoC footprint |
bors merge |
## Motivation Existing sync is causing a lot of resource usage, including memory consumption and network traffic, and can't replace gossip which can also be unreliable at times.
Pull request successfully merged into develop. Build succeeded: |
Motivation
Existing sync is causing a lot of resource usage, including memory consumption and network traffic, and can't replace gossip which can also be unreliable at times.
Description
This implements pairwise set reconciliation protocol based on "Range-Based Set Reconciliation" by Aljoscha Meyer https://arxiv.org/pdf/2212.13567.pdf
This supersedes a part of #5769 and does not contain the optimized data structures for range fingerprinting and does not implement multipeer reconciliation algorithm.
One part of this change that may see further improvement: the reconciled IDs themselves are sent as "chunks" contained in SyncMessages. This might be suboptimal but makes implementation of the interactive reconciliation protocol somewhat simpler. It will be replaced by non-chunked streaming after some optimizations of the range fingerprinting data structures, so that they can be copied in O(1) time and the ID stream can be consumed immediately not waiting for the current messages to be processed.
See package documentation in
doc.go
and fundamental sync types insync2/types/types.go
for the description of the protocol and its fundamental primitives.Prerequisite: spacemeshos/go-scale#98