Fixing race in FSTWriteStream #510

schmidt-sebastian · 2017-11-30T01:17:08Z

This should address https://b.corp.google.com/issues/69701162

There are two races that this PR addresses:

The one I found during local testing is that we call notifyStreamInterruptedWithError in FSTStream before we clear the stream delegate. It is then possible that FSTRemoteStore re-opens the write stream right away (synchronously). The assert in startWithDelegate that verifies that no delegate is set triggers in this case.
The other one was root-caused by banging my head on the keyboard a couple times. I believe that it is possible to corrupt the stream state if the idle callback gets schedule while other operations are active. The following sequence should trigger this:

Stream goes idle, we schedule the idle callback for execution in 60s
59.9 seconds later, the backend closes the stream for whatever reason. writeStreamFinished is invoked and scheduleshandleStreamClose on the dispatch queue.
The idle callback triggers and sets the stream state to Initial
handleStreamClose runs and asserts since the stream is no longer active.

I tried to find docs (or sources) for Apple's dispatch queue to verify how events are queued. I was hoping that the guarantees where similar to JS: setImmediate() always get priority over setTimeout(). This is however not the case - using a small test app I was able to verify that iOS does not prioritize existing dispatch_async over dispatch_after(now).

Fixes: #490

wilhuff · 2017-11-30T18:29:15Z

Firestore/Source/Remote/FSTStream.m

@@ -347,6 +347,10 @@ - (void)closeWithFinalState:(FSTStreamState)finalState error:(nullable NSError *
  [self.workerDispatchQueue verifyIsCurrentQueue];
  [self cancelIdleCheck];

+  FSTAssert(self.delegate,


Could this move up to live alongside the other assertion, just to group all the sanity checking together?

wilhuff · 2017-11-30T18:42:15Z

Firestore/Source/Util/FSTDispatchQueue.m

@@ -58,7 +58,10 @@ - (void)dispatchAsyncAllowingSameQueue:(void (^)(void))block {

 - (void)dispatchAfterDelay:(NSTimeInterval)delay block:(void (^)(void))block {
  dispatch_time_t delayNs = dispatch_time(DISPATCH_TIME_NOW, (int64_t)(delay * NSEC_PER_SEC));
-  dispatch_after(delayNs, self.queue, block);
+  dispatch_after(delayNs, self.queue, ^() {
+    // Make sure that we prioritize tasks that are already queued for immediate execution.


I don't entirely understand why this double dispatch actually fixes the problem you're describing. There's still a race condition even after this, but it's now just much smaller.

dispatch_after still executes on the same serial queue so this block cannot be called concurrently with any other executing block. It doesn't really matter which block executes the close, does it? We still need to deal with the fact after introducing idle timeouts there are now multiple paths to closure and so the handleStreamClose assertion can't be correct.

In short: either the server or the client can close the connection and the event handler that closes it could be either handleStreamClose or handleIdleCloseTimer. The latter already guards against the possibility that something already closed the stream--we just need to do something similar in handleStreamClose.

We need to allow for the possibility that between the point where we get a callback from gRPC indicating close and the point where our own dispatch queue actually handles the server close event handleIdleCloseTimer could have run.

Hm, you convinced me - there might be a slim possibility if the idle callback runs while another operation queues a depended block on the queue. So yes, it is better to just let handleStreamClose deal with this directly. I originally wanted to keep the state machine cleaner and avoid this extra check.

wilhuff

LGTM

wilhuff · 2017-12-01T18:24:11Z

Please add a CHANGELOG entry

antonioallen · 2017-12-11T20:49:09Z

Any idea when this fix will be added to the Production Firebase Distribution for iOS? Or which pod do I need to update from this repo to get the fix?

wilhuff · 2017-12-11T21:57:21Z

This will land in the next release, which we're starting to work on assembling now. I can't promise dates because unforeseen circumstances can cause us to miss them, but this should be soon.

If you want to try to pull the pod from source you'll need the pod described in FirebaseFirestore.podspec.

Something like this in your Podfile might get you there (though YMMV since I don't actually test this way):

pod 'FirebaseFirestore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'master'

Substitute :commit => '7c7da198eeda7392803e812b0062eb3e9119cade' if you want this fix but none of the other unreleased progress.

wilhuff · 2017-12-11T21:58:53Z

If you're going to mention specific commits, you probably also want the fix in #522 too: :commit => '4da1d25ac51ca7236227987b7320fd3cd0f17ab1'

antonioallen · 2017-12-12T02:01:32Z

Thanks for the reply! So I tried adding the specific commit and I keep getting this error (see image). pod 'FirebaseFirestore', :git => 'https://github.com:firebase/firebase-ios-sdk.git', :commit => '4da1d25ac51ca7236227987b7320fd3cd0f17ab1'

antonioallen · 2017-12-12T02:02:40Z

Do I need to do something else before I run pod install?

morganchen12 · 2017-12-12T02:06:12Z

@antonioallen, you're now in the realm of unsupported experimentation, so our ability to assist you will be limited.

To start debugging your issue I encourage you use pod install --verbose to get a more complete error message and then start googling around. The CocoaPods docs are fairly comprehensive and they have a long and extensive list of issues that people have run into in the past on their GitHub, both of which will likely be of use to you.

Good luck!

antonioallen · 2017-12-12T02:43:12Z

Ok, will do thanks! Also, here is a more complete error message (see image). If you have any ideas I would greatly appreciate it! Error: Port number ended with 'f'

wilhuff · 2017-12-12T03:00:53Z

Sorry, that was a typo in the URL. It should have been:

https://github.com/firebase/firebase-ios-sdk.git

I've updated the other comment too so you can see it in context.

Let us know if this works out. We believe our recent changes have made this kind of interoperation between open-source pods and the released ones possible, but this isn't yet a fully released (or supported) method yet.

antonioallen · 2017-12-12T03:11:28Z

Aha! That was a sneaky one. However, now the error is a bit more cryptic. Is there a way to do this manually?

morganchen12 · 2017-12-12T03:15:37Z

There is. CocoaPods' verbose option lists all the commands it runs before running them, so you can just copy paste those commands into your terminal.

antonioallen · 2017-12-21T05:08:35Z

If anyone else encounters this is issue make sure you are on CocoaPods 1.4.0 as specified in the documentation. You can install pre-release versions of Pods by running this command: sudo gem install cocoapods --pre

Fixing race in FSTWriteStream

Merge branch 'master' of https://github.com/Firebase/firebase-ios-sdk

65d47c7

googlebot added the cla: yes label Nov 30, 2017

schmidt-sebastian changed the title ~~Mrschmidt crash~~ Fixing race in FSTWriteStream Nov 30, 2017

schmidt-sebastian assigned wilhuff Nov 30, 2017

schmidt-sebastian requested a review from wilhuff November 30, 2017 01:20

schmidt-sebastian force-pushed the mrschmidt-crash branch 2 times, most recently from e244bf2 to d8f624b Compare November 30, 2017 01:22

schmidt-sebastian added 3 commits November 29, 2017 17:23

Fixing race in FSTWriteStream

5db88f1

Merge branch 'master' of https://github.com/Firebase/firebase-ios-sdk

2f5ab0f

Merge branch 'master' into mrschmidt-crash

e0dbc12

schmidt-sebastian force-pushed the mrschmidt-crash branch 3 times, most recently from 33ef982 to 56fadd6 Compare November 30, 2017 05:25

Fixing tests

df76451

schmidt-sebastian force-pushed the mrschmidt-crash branch from 56fadd6 to df76451 Compare November 30, 2017 05:27

wilhuff suggested changes Nov 30, 2017

View reviewed changes

wilhuff assigned schmidt-sebastian and unassigned wilhuff Nov 30, 2017

wilhuff added the api: firestore label Nov 30, 2017

Checking for stream close in handleStreamClose

d200fe0

schmidt-sebastian force-pushed the mrschmidt-crash branch from 0c5e3b2 to d200fe0 Compare November 30, 2017 21:29

schmidt-sebastian assigned wilhuff and unassigned schmidt-sebastian Dec 1, 2017

wilhuff approved these changes Dec 1, 2017

View reviewed changes

Adding Changelog entry

47653ce

schmidt-sebastian merged commit 7c7da19 into master Dec 1, 2017

schmidt-sebastian mentioned this pull request Dec 1, 2017

Ignoring stream messages for closed streams #522

Merged

schmidt-sebastian mentioned this pull request Dec 12, 2017

Firestore crash on invalid permissions #536

Closed

mikelehen mentioned this pull request Jan 10, 2018

FIRESTORE INTERNAL ASSERTION FAILED: closeWithFinalState #596

Closed

minafarid pushed a commit to minafarid/firebase-ios-sdk that referenced this pull request Jun 6, 2018

Merge pull request firebase#510 from firebase/mrschmidt-crash

cd4c4c3

Fixing race in FSTWriteStream

firebase locked and limited conversation to collaborators Nov 10, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixing race in FSTWriteStream #510

Fixing race in FSTWriteStream #510

schmidt-sebastian commented Nov 30, 2017 •

edited by wilhuff

Loading

wilhuff Nov 30, 2017

schmidt-sebastian Nov 30, 2017

wilhuff Nov 30, 2017

schmidt-sebastian Nov 30, 2017

wilhuff left a comment

wilhuff commented Dec 1, 2017

antonioallen commented Dec 11, 2017

wilhuff commented Dec 11, 2017 •

edited

Loading

wilhuff commented Dec 11, 2017

antonioallen commented Dec 12, 2017

antonioallen commented Dec 12, 2017

morganchen12 commented Dec 12, 2017

antonioallen commented Dec 12, 2017

wilhuff commented Dec 12, 2017 •

edited

Loading

antonioallen commented Dec 12, 2017

morganchen12 commented Dec 12, 2017

antonioallen commented Dec 21, 2017

Fixing race in FSTWriteStream #510

Fixing race in FSTWriteStream #510

Conversation

schmidt-sebastian commented Nov 30, 2017 • edited by wilhuff Loading

wilhuff Nov 30, 2017

Choose a reason for hiding this comment

schmidt-sebastian Nov 30, 2017

Choose a reason for hiding this comment

wilhuff Nov 30, 2017

Choose a reason for hiding this comment

schmidt-sebastian Nov 30, 2017

Choose a reason for hiding this comment

wilhuff left a comment

Choose a reason for hiding this comment

wilhuff commented Dec 1, 2017

antonioallen commented Dec 11, 2017

wilhuff commented Dec 11, 2017 • edited Loading

wilhuff commented Dec 11, 2017

antonioallen commented Dec 12, 2017

antonioallen commented Dec 12, 2017

morganchen12 commented Dec 12, 2017

antonioallen commented Dec 12, 2017

wilhuff commented Dec 12, 2017 • edited Loading

antonioallen commented Dec 12, 2017

morganchen12 commented Dec 12, 2017

antonioallen commented Dec 21, 2017

schmidt-sebastian commented Nov 30, 2017 •

edited by wilhuff

Loading

wilhuff commented Dec 11, 2017 •

edited

Loading

wilhuff commented Dec 12, 2017 •

edited

Loading