-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing race in FSTWriteStream #510
Conversation
e244bf2
to
d8f624b
Compare
33ef982
to
56fadd6
Compare
56fadd6
to
df76451
Compare
Firestore/Source/Remote/FSTStream.m
Outdated
@@ -347,6 +347,10 @@ - (void)closeWithFinalState:(FSTStreamState)finalState error:(nullable NSError * | |||
[self.workerDispatchQueue verifyIsCurrentQueue]; | |||
[self cancelIdleCheck]; | |||
|
|||
FSTAssert(self.delegate, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this move up to live alongside the other assertion, just to group all the sanity checking together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -58,7 +58,10 @@ - (void)dispatchAsyncAllowingSameQueue:(void (^)(void))block { | |||
|
|||
- (void)dispatchAfterDelay:(NSTimeInterval)delay block:(void (^)(void))block { | |||
dispatch_time_t delayNs = dispatch_time(DISPATCH_TIME_NOW, (int64_t)(delay * NSEC_PER_SEC)); | |||
dispatch_after(delayNs, self.queue, block); | |||
dispatch_after(delayNs, self.queue, ^() { | |||
// Make sure that we prioritize tasks that are already queued for immediate execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't entirely understand why this double dispatch actually fixes the problem you're describing. There's still a race condition even after this, but it's now just much smaller.
dispatch_after
still executes on the same serial queue so this block cannot be called concurrently with any other executing block. It doesn't really matter which block executes the close, does it? We still need to deal with the fact after introducing idle timeouts there are now multiple paths to closure and so the handleStreamClose
assertion can't be correct.
In short: either the server or the client can close the connection and the event handler that closes it could be either handleStreamClose
or handleIdleCloseTimer
. The latter already guards against the possibility that something already closed the stream--we just need to do something similar in handleStreamClose
.
We need to allow for the possibility that between the point where we get a callback from gRPC indicating close and the point where our own dispatch queue actually handles the server close event handleIdleCloseTimer
could have run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, you convinced me - there might be a slim possibility if the idle callback runs while another operation queues a depended block on the queue. So yes, it is better to just let handleStreamClose
deal with this directly. I originally wanted to keep the state machine cleaner and avoid this extra check.
0c5e3b2
to
d200fe0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Please add a CHANGELOG entry |
Any idea when this fix will be added to the Production Firebase Distribution for iOS? Or which pod do I need to update from this repo to get the fix? |
This will land in the next release, which we're starting to work on assembling now. I can't promise dates because unforeseen circumstances can cause us to miss them, but this should be soon. If you want to try to pull the pod from source you'll need the pod described in Something like this in your Podfile might get you there (though YMMV since I don't actually test this way):
Substitute |
If you're going to mention specific commits, you probably also want the fix in #522 too: |
Do I need to do something else before I run pod install? |
@antonioallen, you're now in the realm of unsupported experimentation, so our ability to assist you will be limited. To start debugging your issue I encourage you use Good luck! |
Sorry, that was a typo in the URL. It should have been:
I've updated the other comment too so you can see it in context. Let us know if this works out. We believe our recent changes have made this kind of interoperation between open-source pods and the released ones possible, but this isn't yet a fully released (or supported) method yet. |
There is. CocoaPods' verbose option lists all the commands it runs before running them, so you can just copy paste those commands into your terminal. |
If anyone else encounters this is issue make sure you are on CocoaPods 1.4.0 as specified in the documentation. You can install pre-release versions of Pods by running this command: sudo gem install cocoapods --pre |
Fixing race in FSTWriteStream
This should address https://b.corp.google.com/issues/69701162
There are two races that this PR addresses:
The one I found during local testing is that we call
notifyStreamInterruptedWithError
in FSTStream before we clear the stream delegate. It is then possible that FSTRemoteStore re-opens the write stream right away (synchronously). The assert instartWithDelegate
that verifies that no delegate is set triggers in this case.The other one was root-caused by banging my head on the keyboard a couple times. I believe that it is possible to corrupt the stream state if the idle callback gets schedule while other operations are active. The following sequence should trigger this:
writeStreamFinished
is invoked and scheduleshandleStreamClose
on the dispatch queue.Initial
handleStreamClose
runs and asserts since the stream is no longer active.I tried to find docs (or sources) for Apple's dispatch queue to verify how events are queued. I was hoping that the guarantees where similar to JS:
setImmediate()
always get priority oversetTimeout()
. This is however not the case - using a small test app I was able to verify that iOS does not prioritize existingdispatch_async
overdispatch_after(now)
.Fixes: #490