-
Notifications
You must be signed in to change notification settings - Fork 772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Undefined behavior for using WAMP features not agreed during feature negotiation #1605
Comments
Best would be to switch to a WAMP router that support CANCEL https://wamp-proto.org/wamp_latest_ietf.html#name-call-canceling
Autobahn must announce all features it supports https://wamp-proto.org/wamp_latest_ietf.html#section-4.1.1-10 and the router must reply with a subset of the client announced feature, a subset further limited by the features the router supports or wants to support for that client once features are negotiated, any use of a feature that wasn't agreed ... yeah, well: either should be a protocol error or should be ignored ... but which? tbh, I'm not sure we have appropriate spec text, and I'm not sure what Autobahn does in specific cases (like call canceling) :( marking this a "bug" for further investigation ... even though it is unclear what exactly is and should happen ... |
Thanks for the quick response. I don't have control of the server side but I can pass that information back to the third party as a suggestion. AFAIK this is written in C++ on minimal hardware so that might be why they're running a more minimal implementation and not handling things like call cancellation. Possibly also a concern around code footprint as well because I think we're the first external users of this API. Could you clarify what happens when an Autobahn client has a timeout via the call normally (eg the first block in my first post) but the server doesn't support it? Does the client just ignore it and carry on waiting? Or does it throw a protocol error like is happening with the asyncio.wait_for call? And does the client receive enough information that it can say what the protocol error was on? Right now I just get a bad protocol exception and no further information as to what caused it. That exception also gets thrown on the following call rather than getting thrown ahead of, or as part of, the |
by deault, autobahn will announce autobahn-python/autobahn/wamp/protocol.py Line 110 in a99ff5d
support for call cancelling autobahn-python/autobahn/wamp/role.py Line 288 in a99ff5d
but then use the feature without qualification (regardless of weather it was indeed negotiated as active) autobahn-python/autobahn/wamp/protocol.py Line 1730 in a99ff5d
the correct behavior for autobahn might be: if a feature wasn't actually negotiated as active during negotiation with the router, then throw a client-side exception without sending anything to the server when the feature is used in this case it would mean: you get an exception when calling "cancel" on the client side.
|
Well... The
In this case — the client side immediately receives the error, the wamp connection remains intact. |
the problem is: the use of the specific subfeature CANCEL happens when the original CALL feature already is in use IOW: the client cancel is a function on the deferred/future for the already issued call.
ok! that means: if we fail the client side of the call when the cancel fails, then what about the result we'll receive later?
so you argue for 1., right? |
autobahn(python) currently only implements router-side call canceling - not (pure or additional) client-side call canceling. callee side call canceling (the 3rd of the 3 possible sides) is callee (and router) business. (and that should be possible in autobahn autobahn-python/autobahn/wamp/protocol.py Line 1203 in a99ff5d
"cancel" is done in autobahn via autobahn-python/autobahn/wamp/protocol.py Line 1730 in a99ff5d
which uses which (for twisted) points to https://docs.twistedmatrix.com/en/stable/api/twisted.internet.defer.Deferred.html#cancel which doesn't take arguments. so for an autobahn implementation, this would mean expanding txaio to allow to eg do autobahn-python/autobahn/wamp/types.py Line 1007 in a99ff5d
neither is the case currently. |
O, thanks for description how it works under the hood
Well, I think both points are legitimate and just complement each other. The client wants to cancel the call. It just issues What if the client issues call canceling, and we return an error as a feature not supported — okay. What's then? Client still need to wait for the results? Or we will receive results sometime in the future and fail the whole connection because there is no related call? I think that's not so convenient for end user and there is no reason to fail the whole connection. |
Thanks for your thoughts - makes sense.
Yes, I agree, that what always what we aimed for in WAMP: make network and deployment details disappear from the application layer. It's just a glue between distributed app code.
I agree, raising "not supported" and "still need to wait" or "fail the whole connection" when the client explicitly triggered "cancel" and clearly wanted to go on with different tasks seems unhelpful. From app dev point of view. We should aim to do what is most helpful for the app developer. Trying to summarize the desired behavior:
I have marked the 3 remaining details to discuss for nailing everything IMO: (A): feature announcement in WAMP is hence to be understood purely in an "indicative", not "binding" manner. the peer is just announcing it is not actively opposing/denying a feature X. when a peer announces feature X, that doesn't necessarily mean it is supported all the time => if so, I guess we should add some text to the spec? (B): autobahn will raise an app code exception from a call that was canceled where the remote-call-cancel worked technically. should the exception raised when remote-call-cancel didn't work technically (but still is canceled client side) be eg a subclass of the former current one? => we'll need that for the code in autobahn (C): autobahn currently rigorously matches incoming messages with "expected return slots" like for an issued but still ongoing call. if no slot is there, a message received for that slots ID is a protocol violation. I definitely want to preserve that. the problem then is: a canceled call is removed from the slots, and when the peer then sends an reply, this will run into "protocol error" ... if the slots is kept open, for how long? otherwise it's a memory leak ... mmh |
I don't know enough about WAMP/Autobahn to really weigh in here on your design discussion but I was curious about why I receive the error on the follow up call and not when the previous call is cancelled? I'd expect to have something like this: sequenceDiagram
Client->>+Server: Call RPC A
Note over Server: Server blocks
Note over Client: Client times out the call of RPC A
Client->>+Server: Cancel RPC A
Server-->>+Client: Return bad protocol for unsupported cancellation
Note over Client: Client throws bad protocol error
but it feels more like this: sequenceDiagram
Client->>+Server: Call RPC A
Note over Server: Server blocks
Note over Client: Client times out the call of RPC A
Client->>+Server: Cancel RPC A
Client->>+Server: Call RPC B
Server-->>+Client: Return bad protocol for unsupported cancellation
Note over Client: Client throws bad protocol error
|
I have a client talking to a WAMP Basic Profile server (ran by a third party). This doesn't support call cancellation so I think a call made like this:
would ignore the timeout option and carry on running past the timeout?
I didn't actually notice the timeout option on the call method on my first pass so just wrapped it in
asyncio.wait_for
with a timeout like this:But that's throwing the occasional bad_protocol error. I didn't know what the exact issue is but our third party gave me access to the server logs which showed a 49 error code relating to the cancel call and it does always align with being thrown on the call after a 20s timed out call.
What's the best way to handle timeouts in this situation? I don't want to catch the bad_protocol exception as is in case the client or server changes in a way that this hides another issue. Can I at least see why the bad_protocol exception is being thrown in the exception somehow? I also don't want to allow the client to block indefinitely on the server because we previously saw consistent failures where the client just hangs forever waiting for a response from the server that never arrives.
Should Autobahn be detecting this lack of support for call cancellation and just silently dropping the task if cancelled in this way? Or throw another exception in some way that can be handled?
I noticed #1127 but while I think there's maybe some similarity in the reply in that issue I don't really see the similarity in the opening post.
The text was updated successfully, but these errors were encountered: