Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing race in FSTWriteStream #510

Merged
merged 7 commits into from
Dec 1, 2017
Merged

Fixing race in FSTWriteStream #510

merged 7 commits into from
Dec 1, 2017

Conversation

schmidt-sebastian
Copy link
Contributor

@schmidt-sebastian schmidt-sebastian commented Nov 30, 2017

This should address https://b.corp.google.com/issues/69701162

There are two races that this PR addresses:

  1. The one I found during local testing is that we call notifyStreamInterruptedWithError in FSTStream before we clear the stream delegate. It is then possible that FSTRemoteStore re-opens the write stream right away (synchronously). The assert in startWithDelegate that verifies that no delegate is set triggers in this case.

  2. The other one was root-caused by banging my head on the keyboard a couple times. I believe that it is possible to corrupt the stream state if the idle callback gets schedule while other operations are active. The following sequence should trigger this:

  • Stream goes idle, we schedule the idle callback for execution in 60s
  • 59.9 seconds later, the backend closes the stream for whatever reason. writeStreamFinished is invoked and scheduleshandleStreamClose on the dispatch queue.
  • The idle callback triggers and sets the stream state to Initial
  • handleStreamClose runs and asserts since the stream is no longer active.

I tried to find docs (or sources) for Apple's dispatch queue to verify how events are queued. I was hoping that the guarantees where similar to JS: setImmediate() always get priority over setTimeout(). This is however not the case - using a small test app I was able to verify that iOS does not prioritize existing dispatch_async over dispatch_after(now).

Fixes: #490

@schmidt-sebastian schmidt-sebastian changed the title Mrschmidt crash Fixing race in FSTWriteStream Nov 30, 2017
@schmidt-sebastian schmidt-sebastian force-pushed the mrschmidt-crash branch 2 times, most recently from e244bf2 to d8f624b Compare November 30, 2017 01:22
@schmidt-sebastian schmidt-sebastian force-pushed the mrschmidt-crash branch 3 times, most recently from 33ef982 to 56fadd6 Compare November 30, 2017 05:25
@@ -347,6 +347,10 @@ - (void)closeWithFinalState:(FSTStreamState)finalState error:(nullable NSError *
[self.workerDispatchQueue verifyIsCurrentQueue];
[self cancelIdleCheck];

FSTAssert(self.delegate,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this move up to live alongside the other assertion, just to group all the sanity checking together?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -58,7 +58,10 @@ - (void)dispatchAsyncAllowingSameQueue:(void (^)(void))block {

- (void)dispatchAfterDelay:(NSTimeInterval)delay block:(void (^)(void))block {
dispatch_time_t delayNs = dispatch_time(DISPATCH_TIME_NOW, (int64_t)(delay * NSEC_PER_SEC));
dispatch_after(delayNs, self.queue, block);
dispatch_after(delayNs, self.queue, ^() {
// Make sure that we prioritize tasks that are already queued for immediate execution.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't entirely understand why this double dispatch actually fixes the problem you're describing. There's still a race condition even after this, but it's now just much smaller.

dispatch_after still executes on the same serial queue so this block cannot be called concurrently with any other executing block. It doesn't really matter which block executes the close, does it? We still need to deal with the fact after introducing idle timeouts there are now multiple paths to closure and so the handleStreamClose assertion can't be correct.

In short: either the server or the client can close the connection and the event handler that closes it could be either handleStreamClose or handleIdleCloseTimer. The latter already guards against the possibility that something already closed the stream--we just need to do something similar in handleStreamClose.

We need to allow for the possibility that between the point where we get a callback from gRPC indicating close and the point where our own dispatch queue actually handles the server close event handleIdleCloseTimer could have run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, you convinced me - there might be a slim possibility if the idle callback runs while another operation queues a depended block on the queue. So yes, it is better to just let handleStreamClose deal with this directly. I originally wanted to keep the state machine cleaner and avoid this extra check.

Copy link
Contributor

@wilhuff wilhuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wilhuff
Copy link
Contributor

wilhuff commented Dec 1, 2017

Please add a CHANGELOG entry

@antonioallen
Copy link

Any idea when this fix will be added to the Production Firebase Distribution for iOS? Or which pod do I need to update from this repo to get the fix?

@wilhuff
Copy link
Contributor

wilhuff commented Dec 11, 2017

This will land in the next release, which we're starting to work on assembling now. I can't promise dates because unforeseen circumstances can cause us to miss them, but this should be soon.

If you want to try to pull the pod from source you'll need the pod described in FirebaseFirestore.podspec.

Something like this in your Podfile might get you there (though YMMV since I don't actually test this way):

pod 'FirebaseFirestore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'master'

Substitute :commit => '7c7da198eeda7392803e812b0062eb3e9119cade' if you want this fix but none of the other unreleased progress.

@wilhuff
Copy link
Contributor

wilhuff commented Dec 11, 2017

If you're going to mention specific commits, you probably also want the fix in #522 too: :commit => '4da1d25ac51ca7236227987b7320fd3cd0f17ab1'

@antonioallen
Copy link

Thanks for the reply! So I tried adding the specific commit and I keep getting this error (see image). pod 'FirebaseFirestore', :git => 'https://github.com:firebase/firebase-ios-sdk.git', :commit => '4da1d25ac51ca7236227987b7320fd3cd0f17ab1'
error

@antonioallen
Copy link

Do I need to do something else before I run pod install?

@morganchen12
Copy link
Contributor

@antonioallen, you're now in the realm of unsupported experimentation, so our ability to assist you will be limited.

To start debugging your issue I encourage you use pod install --verbose to get a more complete error message and then start googling around. The CocoaPods docs are fairly comprehensive and they have a long and extensive list of issues that people have run into in the past on their GitHub, both of which will likely be of use to you.

Good luck!

@antonioallen
Copy link

Ok, will do thanks! Also, here is a more complete error message (see image). If you have any ideas I would greatly appreciate it! Error: Port number ended with 'f'
error-verbose

@wilhuff
Copy link
Contributor

wilhuff commented Dec 12, 2017

Sorry, that was a typo in the URL. It should have been:

https://github.com/firebase/firebase-ios-sdk.git

I've updated the other comment too so you can see it in context.

Let us know if this works out. We believe our recent changes have made this kind of interoperation between open-source pods and the released ones possible, but this isn't yet a fully released (or supported) method yet.

@antonioallen
Copy link

Aha! That was a sneaky one. However, now the error is a bit more cryptic. Is there a way to do this manually?
error-verbose-2

@morganchen12
Copy link
Contributor

There is. CocoaPods' verbose option lists all the commands it runs before running them, so you can just copy paste those commands into your terminal.

@antonioallen
Copy link

If anyone else encounters this is issue make sure you are on CocoaPods 1.4.0 as specified in the documentation. You can install pre-release versions of Pods by running this command: sudo gem install cocoapods --pre

minafarid pushed a commit to minafarid/firebase-ios-sdk that referenced this pull request Jun 6, 2018
@firebase firebase locked and limited conversation to collaborators Nov 10, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Firestore crash: Can't handle server close in non-started state.
5 participants