fix: Handle cancel in ReleasingClientCall and rethrow the exception in start #1221

Merged: 6 commits, Jan 11, 2023

mutianf commented Jan 6, 2023

We received a customer issue in which the stack trace didn't show where the call was cancelled (b/263968439).

While trying to reproduce the error, we noticed inconsistent behavior when the stream is cancelled in onStart():

client.readRowsCallable().call(Query.create("t"), new ResponseObserver<Row>() {
    @Override
    public void onStart(StreamController streamController) {
        streamController.cancel();
    }
    @Override
    public void onResponse(Row row) {
    }
    @Override
    public void onError(Throwable throwable) {
    }
    @Override
    public void onComplete() {
    }
});

When the above code runs, we get this exception:

Caused by: java.util.concurrent.CancellationException: User cancelled stream
	at com.google.api.gax.rpc.ServerStreamingAttemptCallable.onCancel(ServerStreamingAttemptCallable.java:309)

However, if we make another call before it:

Iterator<Row> stream = client.readRowsCallable().call(Query.create("t")).iterator();
client.readRowsCallable().call(Query.create("t"), new ResponseObserver<Row>() {
    @Override
    public void onStart(StreamController streamController) {
        streamController.cancel();
    }
    ...
});

We get a different exception:

java.lang.IllegalStateException: Not started
	at com.google.common.base.Preconditions.checkState(Preconditions.java:502)
	at io.grpc.internal.ClientCallImpl.sendMessageInternal(ClientCallImpl.java:511)
	at io.grpc.internal.ClientCallImpl.sendMessage(ClientCallImpl.java:504)

This is very difficult to debug, because the stack trace no longer points to the cancellation.

I think there are two problems in ReleasingClientCall:

  1. It doesn't track cancellation status: after the call is cancelled, it still proceeds into the start() logic and eventually calls ClientCallImpl#startInternal https://github.com/grpc/grpc-java/blob/v1.51.1/core/src/main/java/io/grpc/internal/ClientCallImpl.java#L199, where cancelCalled is true and the state check fails. In this case we never see the stack trace of the code that actually cancelled the stream.
  2. It doesn't rethrow the error in onStart(): the IllegalStateException thrown from ClientCallImpl#startInternal is caught, so GrpcDirectStreamController moves on to ClientCallImpl#sendMessage and eventually fails when checking whether the stream has started https://github.com/grpc/grpc-java/blob/v1.51.1/core/src/main/java/io/grpc/internal/ClientCallImpl.java#L514.
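
To make the two problems concrete, here is a minimal, self-contained sketch of the intended behavior. It is a toy model, not the diff in this PR: InnerCall and TrackingCall are hypothetical stand-ins for ClientCallImpl and ReleasingClientCall, and the exception messages are assumptions chosen for illustration. The wrapper records the cancellation when cancel() is called, refuses to start afterwards, and lets any exception from the inner start() propagate instead of swallowing it.

```java
import java.util.concurrent.CancellationException;

/** Toy model only: hypothetical stand-ins, not the real gRPC or Bigtable classes. */
public class ReleasingCallSketch {

  /** Stand-in for the underlying call, with state checks like the ones described above. */
  static class InnerCall {
    private boolean started;
    private boolean cancelCalled;

    void start() {
      if (cancelCalled) {
        throw new IllegalStateException("call was cancelled");
      }
      started = true;
    }

    void cancel() {
      cancelCalled = true;
    }

    void sendMessage(String message) {
      if (!started) {
        throw new IllegalStateException("Not started");
      }
    }
  }

  /** Wrapper that remembers the cancellation and surfaces it from start(). */
  static class TrackingCall {
    private final InnerCall inner = new InnerCall();
    private CancellationException cancellation; // null until cancel() is called

    void cancel(String message, Throwable cause) {
      if (cancellation == null) {
        // Creating the exception here captures the stack trace of the canceller.
        cancellation = new CancellationException(message);
        if (cause != null) {
          cancellation.initCause(cause);
        }
      }
      inner.cancel();
    }

    void start() {
      if (cancellation != null) {
        // Problem 1: fail fast and attach the original cancellation as the cause.
        throw new IllegalStateException("Call is already cancelled", cancellation);
      }
      // Problem 2: no try/catch here, so a failure in the inner start() reaches
      // the caller instead of surfacing later as "Not started" in sendMessage().
      inner.start();
    }

    void sendMessage(String message) {
      inner.sendMessage(message);
    }
  }

  public static void main(String[] args) {
    TrackingCall call = new TrackingCall();
    call.cancel("User cancelled stream", null);
    try {
      call.start();
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage() + " <- " + e.getCause().getMessage());
    }
  }
}
```

With this shape, the exception seen by the caller carries the cancellation as its cause, so the stack trace points at the code that cancelled the stream rather than at a later state check.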

mutianf requested a review from a team as a code owner January 6, 2023 18:14

igorbernstein2 left a comment

lgtm after nits

igorbernstein2 added the owlbot:run label Jan 6, 2023
gcf-owl-bot bot removed the owlbot:run label Jan 6, 2023

mutianf commented Jan 6, 2023

@burkedavison Do you think it's ok to merge this PR? Do you have any other concerns? Thanks!

burkedavison commented

> @burkedavison Do you think it's ok to merge this PR? Do you have any other concerns? Thanks!

Hey @mutianf ,
LGTM, but I'd like to request approval from someone more familiar with gax-java development. @blakeli0 / @meltsufin ?

sonarqubecloud commented

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 1 (rating A)

Coverage: 87.5%
Duplication: 0.0%

igorbernstein2 added the automerge and owlbot:run labels Jan 11, 2023
gcf-owl-bot bot removed the owlbot:run label Jan 11, 2023
igorbernstein2 merged commit 8a61249 into googleapis:main Jan 11, 2023
gcf-merge-on-green bot removed the automerge label Jan 11, 2023
mutianf deleted the client-call branch January 11, 2023 15:50
lqiu96 pushed a commit that referenced this pull request Jan 11, 2023
…n start (#1221)

* fix: Handle cancel in ReleasingClientCall and rethrow the exception in start

* address comments

4 participants