-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gcoap: fixes around empty ACKs #18429
Conversation
if (coap_get_type(&pdu) == COAP_TYPE_CON) { | ||
/* we might run into this if an ACK to a sender got lost | ||
* => best to ACK (again) */ | ||
messagelayer_emptyresponse_type = COAP_TYPE_ACK; | ||
DEBUG("gcoap: Answering unknown CON response with ACK to " | ||
"shut up sender\n"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A reset should be send:
In case a message carrying a response is unexpected (the client is
not waiting for a response from the identified endpoint, at the
endpoint addressed, and/or with the given token), the response is
rejected (Sections 4.2 and 4.3).Implementation Note: A client that receives a response in a CON
message may want to clean up the message state right after sending
the ACK. If that ACK is lost and the server retransmits the CON,
the client may no longer have any state to which to correlate this
response, making the retransmission an unexpected message; the
client will likely send a Reset message so it does not receive any
more retransmissions. This behavior is normal and not an
indication of an error. (Clients that are not aggressively
optimized in their state memory usage will still have message
state that will identify the second CON as a retransmission.
Clients that actually expect more messages from the server
[OBSERVE] will have to keep state in any case.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, an RST is the intended behavior. (I thought we already did this, at least that's what I remember of last time touching the empty-ACK code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, an RST is the intended behavior. (I thought we already did this, at least that's what I remember of last time touching the empty-ACK code).
I saw a server keep sending transmissions even though the CON was already ACK'd. I did not see any RST messages.
If the RFC says RST should be used, I use RST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: 2cd7857. From a quick search through the gcoap code, I am not sure, if RST is handled at all oO. The only other occurence of COAP_TYPE_RST
is for the handling of "CoAP pings".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I just failed to pay attention to the "else /* no memo */" branch, thanks for the fix.
1636723
to
df29505
Compare
Ad testing: For aiocoap I do have run a few test cases with a "non-CoAP" peer that sends crafted packets that no CoAP stack would ever send. Maybe we should have something like that, but demanding it here would be excessive. Did you do any tests? I don't have any idea on how to test this easily. Do we have any code that sets a NULL response handler at all, currently? Ad 2: Expiring the memo has the side effect of sending an RST rather than an ACK on the actual response. That's not completely unreasonable and can also happen regularly (eg. when the ACK gets lost, the response is handled, and consequently the second response CON hits case 1), even when the server didn't retransmit (packet could have been duplicated), but it's an information sent to the server which we should be aware of. I think it's fine. |
I ran this in my experiments, however, only with the ACK variant in OP. I can do some runs with the RST variant tomorrow. |
|
Fixed: Two problems still existed: expiring a memo does not delete the timer => doing that now, also I implementing the handling of RST message, by deleting the timeout and expiring the memo. |
1895996
to
f11c9e8
Compare
And squashed. |
To test I ran 2 clients that speak via a proxy with #18386 to a server with ~5 requests per second. This causes enough loss to both create scenarios (especially with POST request which are not cached or when the cache is not present in the proxy) where 1. the request may get lost en route between proxy and server (i.e. the server empty-ACK'd the request to the client and uses CON responses to send the server response back to the client) and 2. the ACK for the CON response by the proxy gets lost. Before the proxy will just continue sending CON responses until it times out, if the ACK gets lost. With these changes this does not happen anymore. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK. Code change looks good and I trust your testing.
Backport provided in #18436 |
Contribution description
This adds two fixes around the handling and sending of empty ACKs:
memo
that matches this response.The 1. fix may be disputable. In my opinion it makes more sense to send a few short empty ACKs more than having some server sending a potentially large CON response multiple times just because the server missed a previous ACK, but I can remove this if this is for some reason not a good idea.
Testing procedure
Not sure how to test this... I saw it while running some experiments with #18386.
Issues/PRs references
None, but improves behavior in #18386.