-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bolt2: add go_on
to quiescence spec
#14
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||
---|---|---|---|---|---|---|---|---|
|
@@ -507,7 +507,7 @@ channel by indicating that "SomeThing Fundamental is Underway". | |||||||
* [`channel_id`:`channel_id`] | ||||||||
* [`u8`:`initiator`] | ||||||||
|
||||||||
### Requirements | ||||||||
#### Requirements | ||||||||
|
||||||||
The sender of `stfu`: | ||||||||
- MUST NOT send `stfu` unless `option_quiesce` is negotiated. | ||||||||
|
@@ -532,7 +532,7 @@ The receiver of `stfu`: | |||||||
Upon disconnection: | ||||||||
- the channel is no longer considered quiescent. | ||||||||
|
||||||||
### Rationale | ||||||||
#### Rationale | ||||||||
|
||||||||
The normal use would be to cease sending updates, then wait for all | ||||||||
the current updates to be acknowledged by both peers, then start | ||||||||
|
@@ -545,6 +545,42 @@ considered to be the channel funder (the sender of `open_channel`). | |||||||
The quiescence effect is exactly the same as if one had replied to the | ||||||||
other. | ||||||||
|
||||||||
### `go_on` | ||||||||
|
||||||||
1. type: TODO (`go_on`) | ||||||||
2. data: | ||||||||
* [`channel_id`:`channel_id`] | ||||||||
* **NOTE FOR REVIEWERS** do we need to include initiator here? I think not. | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think so either, since the only valid flow is:
Any other flow would be an error, that should trigger a disconnection. |
||||||||
|
||||||||
#### Requirements | ||||||||
|
||||||||
The sender of `go_on`: | ||||||||
- MUST NOT send `go_on` for a channel that is NOT quiescent. | ||||||||
- MUST NOT send `go_on` if it is not the quiescence "initiator". | ||||||||
- MUST NOT send `go_on` twice without an interceding full quiescence | ||||||||
negotiation. | ||||||||
- MUST send `go_on` once any dependent protocol has completed. | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe this should be:
Suggested change
Say if dependent protocol A completes, we might not want to end the quiescent state then because maybe we want to begin another dependent protocol session that requires a quiescent state. |
||||||||
- MAY send `update_*` messages afterwards. | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't you wait to receive your peer's There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can tighten the requirement here, but I'm trying to figure out what problem it solves. This only really impacts the initiator so we can examine that. If I'm the initiator, and I send I think it's tempting to think that these messages are "commands" for the peer, but since the whole protocol is asynchronous we can't really think of things that way. I view There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's simply cleaner from a state machine design point of view (which is why we're adding a Eclair uses a strict finite state machine design for most of its components, which shows this design weirdness. But if you feel strongly about it I don't really mind, because your current requirements also allow us to wait for the responding There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we can as also follow what the parent PR did and write, "must now consider this channel < whatever the opposite of quiescing is >" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah I can see what you're saying where if you go for the strict FSM design and the only valid exit of quiescence is send/receive. I have preferred not to think about it as a strict FSM due to the asynchronous nature of the rest of the protocol. While we have a total ordering of all messages on each half of the connection, we don't have a total ordering of all messages exchanged since there are many valid zips. As a result, as the responder, I have no way to guarantee you've received my I actually don't feel strongly about it! And int the process of responding to this comment I'm finding myself convinced by this line of reasoning. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Sure, but I have the guarantee that if you receive my future
Yes exactly, or the fact that if you resume sending
I don't feel too strongly about this either, as we still have the option of waiting for our peer's There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll update as I do believe your suggestion makes it more likely that people will implement with predictable behavior. |
||||||||
|
||||||||
The receiver of `go_on`: | ||||||||
- if there is any incomplete downstream protocol state: | ||||||||
- SHOULD treat downstream protocol state as it would if the peer | ||||||||
disconnected. | ||||||||
Comment on lines
+567
to
+568
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this would only make it worse with regards to the inner protocol. This indicates a buggy peer, so it would be safer to simply disconnect (which also guarantees the inner protocol will be aborted on both sides), wouldn't it?
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see the interpretation as a buggy peer, however I see it differently. I see it as a forced abort of the inner protocol. I don't think things need to be buggy in order for you to behave that way. Indeed, the very fact that you are suggesting disconnecting as a means of handling this implies that the only sensible way to handle this partially executed inner protocol is the same. So it depends on if we want to bless the process of aborting the inner protocol using this message or not. If you don't, then we can consider it a bug to arrive early and then it would make perfect sense to hang up. However, we could also just truncate the inner protocol state (which is what we do anyway) and then not have to reestablish the connection. I am sympathetic to using a disconnect here though as it may be the simplest path to resetting the inner protocol state and there may not be much to be gained by having this abort mechanism. I think either approach is well-defined, and I think the particulars of how the implementation was done may make a "live truncation" of inner protocol state varying levels of difficult. If you think it drive you nuts to do this in Eclair we can just ban it and accept the suggestion you lay out here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't think that's correct. In this commit we're adding an explicit message to terminate the quiescence state to ensure that its state machine is well-defined independently of the inner protocols that may run once quiescent. By the same reasoning, every inner protocol must define its own clean termination mechanism as well. In the case described in this requirement, the sender of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, I don't think the semantics are different. If you send an out-of-turn
Any inner protocol like this will have volatile state and committed state. The spec should delineate between the two as it is required for any disconnect tolerant protocol. As such, you know exactly what state they'll be in and can just revert any volatile state, which is exactly what you do during reconnect. Sending an unexpected Does this deviate from Eclair's implementation strategy? I understand that it may be annoying to go this route and as such I'm willing to commit your suggestion, but I just want to make sure that we're on the same page that this is an effort/practicality discussion not a question of whether a solution exists. EDIT: One thing we can also do is add "MAY send There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't think so: the node who receives an out-of-turn As a rule of thumb, whenever we detect buggy behavior from our peer, I believe we should disconnect, because we have no way of knowing how messed up their current state is, and restarting "from scratch" is our best bet at fixing it, since we have mechanisms in
That would work for me 👍, as eclair will disconnect in that case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. TL;DR: I'll add the language around "MAY disconnect" and that should unify concerns. If you want to make a case for why this is YAGNI, I'm open to it. I haven't thought about the strength-to-weight ratio of such a feature. I was just trying to write as permissive of a spec as possible.
The receiver only needs to know that the protocol has been aborted.
Good catch. I'll admit I should probably add the same requirement to the sender to truncate volatile state to be clear.
Buggy implies that doing a forced abort can only result from incorrect implementation, and that is not a valid assumption in context right now. If you want to make the claim that we should never do a forced abort of some negotiation and that any protocol that supports an abort operation MUST specify one, then we can. However, I'm aware of no such requirement up until this point. I think we have to be a little more careful about what behavior we classify as "buggy". Given that the spec language I propose explicitly allows it, it -- by definition -- would not be buggy to send those messages out-of-turn. I'm operating with the understanding that a bug is when the software either doesn't implement the spec, or if the spec violates an assumed property. In this case, I'm not sure what assumed property would be violated. Because any property that would be violated by an out-of-turn Beyond that, I consider it a layer violation to make the validity of the messages defined in quiescence depend on the in-this-scope-unknowable state of the downstream protocol. So I would consider it incoherent to put the following statement into the spec:
...or something to that effect. This is because while quiescence is aware that there may exist downstream protocols, it shouldn't be allowed to know any of the particulars of what they are and how they work. I contrast this with the current language where we make a recommendation for how the downstream protocol handles this case since the awareness of protocol state flows from dependency to dependent AND NOT the other way around.
I wholeheartedly agree with this, modulo my philosophy on "buggy" behavior from above. |
||||||||
- MUST handle subsequent `update_*` messages from the peer. | ||||||||
- if it has not sent a `go_on` since it sent `stfu`: | ||||||||
- MUST reply with a `go_on` message. | ||||||||
|
||||||||
#### Rationale | ||||||||
|
||||||||
If we do not include a message that explicitly resumes update traffic then we | ||||||||
must depend on the the downstream protocol to mark when update traffic may | ||||||||
resume. This is problematic because it means we cannot implement this protocol | ||||||||
in isolation with all of its resumption semantics. | ||||||||
|
||||||||
Only the initiator of the quiescence is allowed to commence with `go_on` as | ||||||||
it is that node that can determine when the quiescence purpose has been | ||||||||
accomplished. | ||||||||
|
||||||||
## Channel Close | ||||||||
|
||||||||
Nodes can negotiate a mutual close of the connection, which unlike a | ||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope @cdecker can find us a better name for that one ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stands for "global operation is over now" 😁
But yeah I'm very open to improvement here 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
carry_on
🧐There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Channel Active Renegotiation Realized. Yes it's Over Now