Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Racy cancel from transport does not always cancel the message body #2369

Merged
merged 11 commits into from
Oct 4, 2022

Conversation

idelpivnitskiy
Copy link
Member

@idelpivnitskiy idelpivnitskiy commented Sep 23, 2022

Motivation:

When cancel from the server's transport races with HTTP message completion, we can receive cancel after Single<HttpResponseMetaData> completes but before we subscribe to the messageBody publisher. In this case, we won't subscribe to the messageBody because we use concat operator. This does not allow BeforeFinallyHttpOperator to observe a terminal event. The race is possible when offloading is enabled. cancel comes from an IO thread, HTTP meta-data completes on the offloaded thread.

Modifications:

  • For the server-side control flow, use the recently added concatPropagateCancel operator to guarantee propagation of the cancel for the entire flat HTTP stream;

Result:

HttpLifecycleObserverTest.testClientCancelsRequestBeforeResponse() does not race anymore.

Motivation:

When `cancel` from the transport races with HTTP message completion, we
can receive `cancel` after `Single<HttpMetaData>` completes but before
we subscribed to the `messageBody` publisher. In this case, we won't
subscribe to the `messageBody` because we use `concat` operator. This
does not allow `BeforeFinallyHttpOperator` to observe a terminal event.
Race is possible when offloading is enabled. `cancel` comes from an IO
thread, HTTP meta-data completes on offloaded thread.

Modifications:

- For every case when we concat a completed source with the message body
add `afterFinally` to guarantee propagation of the cancel for the entire
flat HTTP stream;

Result:

`HttpLifecycleObserverTest.testClientCancelsRequestBeforeResponse()`
does not race anymore.
@idelpivnitskiy
Copy link
Member Author

@Scottmitch rebased after your changes, ready for review

Copy link
Member

@Scottmitch Scottmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the test failure related at all?

https://github.com/apple/servicetalk/runs/8686381887

HttpTransportObserverTest.serverHandlerError(HttpProtocol) [1] protocol=HTTP_1

Wanted but not invoked:
serverWriteObserver.itemFlushed();
-> at io.servicetalk.http.netty.HttpTransportObserverTest.testRequestResponse(HttpTransportObserverTest.java:243)

However, there were exactly 4 interactions with this mock:
serverWriteObserver.requestedToWrite(64L);
-> at io.servicetalk.transport.api.CatchAllTransportObserver$CatchAllWriteObserver.lambda$requestedToWrite$0(CatchAllTransportObserver.java:255)

serverWriteObserver.itemReceived(
    HTTP/1.1 500 Internal Server Error
);
-> at io.servicetalk.transport.api.CatchAllTransportObserver$CatchAllWriteObserver.lambda$itemReceived$1(CatchAllTransportObserver.java:260)

serverWriteObserver.onFlushRequest();
-> at io.servicetalk.transport.api.CatchAllTransportObserver.safeReport(CatchAllTransportObserver.java:306)

serverWriteObserver.itemWritten(
    HTTP/1.1 500 Internal Server Error
);
-> at io.servicetalk.transport.api.CatchAllTransportObserver$CatchAllWriteObserver.lambda$itemWritten$2(CatchAllTransportObserver.java:270)


	at app//io.servicetalk.http.netty.HttpTransportObserverTest.testRequestResponse(HttpTransportObserverTest.java:243)
	at app//io.servicetalk.http.netty.HttpTransportObserverTest.serverHandlerError(HttpTransportObserverTest.java:213)
	

// with non-replayable messageBody to retry.
if (emptyMessageBody(request, messageBody)) {
flatRequest = flatEmptyMessage(connectionContext().protocol(), request, messageBody,
/* propagateCancel */ false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: /* propagateCancel */ -> do we need this? I think it's safe to assume folks will be able to follow the code in IDE or preferred method to understand parameter meanings (e.g. expectation for java code). The more interesting part is explaining why (e.g. client doesn't need it, but server does).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't hurt to have it either. I frequently read code on GitHub without IDE and share a direct link to interesting places.

@idelpivnitskiy
Copy link
Member Author

The test failyre is unrelated, it's known flaky test #2040

@idelpivnitskiy idelpivnitskiy merged commit abc0d75 into apple:main Oct 4, 2022
@idelpivnitskiy idelpivnitskiy deleted the cancel branch October 4, 2022 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants