Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash during Channel with subscription is destroying #328

Merged

Conversation

slavabulgakov
Copy link
Contributor

Fix for #327

The goal was to fix crash during Channel (which has subscription to connectivity state changes) is destroying

@MrMage
Copy link
Collaborator

MrMage commented Nov 5, 2018

Thank you! Would it be possible to add a test case for this as well that would fail without your fixes?

(A proper review will follow, I'm on the go right now.)

@slavabulgakov
Copy link
Contributor Author

slavabulgakov commented Nov 6, 2018

I've added a test case in ChannelCrashTests.swift. You need just comment my changes in Channel.swift to reproduce the crash. Thank you for quick respond!

@slavabulgakov slavabulgakov force-pushed the Fix-crash-during-Channel-destroying branch from b5598b8 to 557aa4e Compare November 6, 2018 18:56
Copy link
Collaborator

@MrMage MrMage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for these fixes! My suspicion is that this crash could still happen in some rare conditions due to a race condition where the channel is shut down in between the hasBeenShutdown check and the cgrpc_channel_check_connectivity_state call. I.e. if the channel is closed from a different thread while the loop is somewhere between line 166 and 172, this could still crash.

I don't have a good solution for this off the top of my head, but at least this should also happen much more rarely with your fixes.

import SwiftGRPC

class ChannelCrashProvider: ChannelCrash_ChannelCrashProvider {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please remove this newline.

import Foundation
import SwiftGRPC

class ChannelCrashProvider: ChannelCrash_ChannelCrashProvider {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you pull this entire class into ChannelCrashTests.swift?

@@ -0,0 +1,108 @@
//
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use the existing generated code for EchoService instead, and move ChannelCrashTests.swift into the SwiftGRPCTests folder?

@@ -151,11 +152,20 @@ private extension Channel {

spinloopThreadQueue.async {
while true {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please remove this newline

Copy link
Collaborator

@MrMage MrMage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I got distracted by all the new files and forgot to review the test case itself :-(

}

class ChannelCrashTests: XCTestCase {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for noticing this only now, but could you please make this a subclass of https://github.com/grpc/grpc-swift/blob/master/Tests/SwiftGRPCTests/BasicEchoTestCase.swift? That should spare you from all the client/server setup work. Also, please remove this extra newline.

}

private func receive(call: Echo_EchoExpandCall) {
try? call.receive { result in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use try! and move this into the test method itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest to move call "try! call.receive {...}" inside "testChannelCrash" method?
If yes then I cannot recursively call method "private func receive(call: Echo_EchoExpandCall)" as I do it now

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you instead use the blocking, synchronous variant of receive? Or leave out the receive part completely, given that it should not be relevant to the crash, anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it is not relevant to the crash. I will remove it.

var server: ServiceServer!
var client: Echo_EchoServiceClient?

func testChannelCrash() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Rename to testDanglingConnectivityObserversDontCrash?

func expand(request: Echo_EchoRequest, session: Echo_EchoExpandSession) throws -> ServerStatus? {
let parts = request.text.components(separatedBy: " ")
for (i, part) in parts.enumerated() {
usleep(500000)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Thread.sleep instead?

receive(call: call)

DispatchQueue.main.asyncAfter(deadline: .now() + .seconds(1)) {
self.client = nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment that this is to ensure that the client gets deallocated. Also, can all the timeouts be shortened quite a bit, to avoid increasing the test runtime so much?

@@ -0,0 +1,83 @@
//
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this comment block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to replace it with your block "Copyright 2018, gRPC Authors All ..." which used in other files?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be best, thanks for spotting!

@@ -0,0 +1,34 @@
// Copyright (c) 2015, Google Inc.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete this file.

@@ -178,6 +187,7 @@ private extension Channel {
}

func shutdown() {
hasBeenShutdown = true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned before, please synchronize accesses to this variable.

Copy link
Collaborator

@MrMage MrMage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for all these changes! We are almost there, just three small changes left.

@testable import SwiftGRPC

class ChannelCrashTests: BasicEchoTestCase {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please remove this newline.

@@ -151,11 +153,19 @@ private extension Channel {

spinloopThreadQueue.async {
while true {
guard !self.hasBeenShutdown else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also wrap these two checks in a channelMutex.synchronize call, e.g.

guard (channelMutex.synchronize { !self.hasBeenShutdown } ...

@@ -132,6 +132,8 @@ private extension Channel {
private let underlyingCompletionQueue: UnsafeMutableRawPointer
private let callback: (ConnectivityState) -> Void
private var lastState: ConnectivityState
private var hasBeenShutdown = false
private let channelMutex: Mutex = Mutex()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: stateMutex

Copy link
Collaborator

@MrMage MrMage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again, Slava!

@MrMage MrMage closed this Nov 7, 2018
@MrMage MrMage reopened this Nov 7, 2018
@MrMage MrMage merged commit 2fd06dc into grpc:master Nov 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants