Watch instances not getting garbage collected #985
Comments
Thanks for the very detailed reproduction; I can confirm I see the same issue. It looks like the leak is happening in the shared api_core code. I opened an issue and a potential fix upstream. After those are through, we may want to do some cleanup on the Firestore side as well.
Great to hear! Thank you for looking into that. If there are no downsides to it, I'd throw my vote in for clearing doc_watch._snapshot_callback, even though that's not the direct cause of this particular issue. In my code I'm pointing the Watch callback at a bound method of an object which itself holds a ref to the Watch, so I get a reference cycle. I'm breaking this cycle by clearing my Watch ref after unsubscribing, but having the Watch clear its own callback ref might help keep cleanup deterministic by default in more cases, instead of having to wait for the cyclic garbage collector.
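For illustration, roughly what my setup looks like (simplified; the Listener class here is just a stand-in, not my actual code — only on_snapshot()/unsubscribe() are the library calls):

```python
# Illustrative sketch of the reference cycle described above; the Listener
# class is hypothetical.
from google.cloud import firestore


class Listener:
    def __init__(self, doc_ref: firestore.DocumentReference):
        # The Watch holds a reference to the bound method self._on_snapshot,
        # and self holds a reference back to the Watch: a reference cycle.
        self._watch = doc_ref.on_snapshot(self._on_snapshot)

    def _on_snapshot(self, docs, changes, read_time):
        pass  # handle the snapshot

    def stop(self):
        self._watch.unsubscribe()
        # Break the cycle by hand so cleanup doesn't have to wait for the
        # cyclic garbage collector.
        self._watch = None
```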
I opened a PR to address this while we wait on the upstream fix. Thanks for your patience!
Hi; just a quick follow-up on this: I'm now on google-cloud-firestore 2.20.0 with the workaround (though it seems the underlying bidi fix in google-api-core has not come through yet). I'm running tons of listeners and everything seems to be behaving well. However, I have seen a small number of these errors in my logs:
It looks like there's some rare condition where data is coming through after the close has cleared the _on_response callback. Should this be another minor fix on the bidi end to watch out for the None case or whatnot?
Hmm, ok, thanks for letting me know. It seems like maybe we should add a null check before calling the callback, but I'd like to find the root cause if we can. I wonder why the thread would not be exiting... (maybe it's just slow to exit?) We did discuss the possibility of streams being re-opened, but I don't think that should be relevant here.
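Something roughly like this (paraphrasing the consumer loop, not the actual google.api_core.bidi source):

```python
# Paraphrased consumer loop; not the actual google.api_core.bidi code.
def _thread_main(self):
    while self._bidi_rpc.is_active:
        response = self._bidi_rpc.recv()
        on_response = self._on_response
        if on_response is None:
            # close() has already cleared the callback; stop instead of
            # attempting to call None.
            break
        on_response(response)
```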
In my case I actually pretty much always get the 'WARNING:google.api_core.bidi:Background thread did not exit.' log. You can see in my original post above that it appears in my repro case output. So I guess that would explain how that error case is possible. Should I poke around and try to figure out what is causing that thread to fail to exit on my end?
Environment details
google-cloud-firestore version: 2.19.0

I'm working on a Python server app which creates/destroys a decent number of document listeners as part of its operation, and I have noticed some of my objects being kept alive longer than I'd expect. I have traced this down to Firestore Watch objects hanging around holding onto callbacks for a long time after I've unsubscribed and released all references to them.
I've put together a minimal repro case which demonstrates a reluctant-to-die Watch object, and also a workaround example showing how clearing a few internal fields after an unsubscribe call seems to break cycles or whatnot and allow it to go down immediately.
Steps to reproduce
The test_watch_cleanup() call in the code below takes a Client instance, creates a Watch, unsubscribes the Watch a few seconds later, and finally spins off a thread holding only a weak ref and waits for the Watch to be garbage collected. In my case this results in the Watch living on indefinitely, or at least for a long while.

Code example
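Roughly along these lines (a simplified sketch; the document path, timings, and the fields cleared under WORKAROUND are placeholders, not the exact original code):

```python
# Minimal repro sketch; paths, timings, and the WORKAROUND field names are
# illustrative assumptions, not the exact code from the original report.
import gc
import threading
import time
import weakref

from google.cloud import firestore

WORKAROUND = False  # set True to clear internal Watch fields after unsubscribing


def test_watch_cleanup(db: firestore.Client) -> None:
    doc_ref = db.collection("test-collection").document("test-doc")

    def on_snapshot(docs, changes, read_time):
        print("got snapshot")

    watch = doc_ref.on_snapshot(on_snapshot)
    time.sleep(3.0)

    watch.unsubscribe()

    if WORKAROUND:
        # Clear internal references so the Watch isn't kept alive by cycles.
        # (These field names are assumptions based on the discussion above.)
        watch._snapshot_callback = None
        watch._rpc = None
        watch._consumer = None

    watch_ref = weakref.ref(watch)
    del watch, doc_ref, on_snapshot

    def wait_for_gc():
        start = time.monotonic()
        while watch_ref() is not None:
            gc.collect()
            print(f"Watch still alive after {time.monotonic() - start:.0f}s")
            time.sleep(1.0)
        print("Watch was garbage collected")

    threading.Thread(target=wait_for_gc, daemon=True).start()
```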
Results
With WORKAROUND=False I get:
With WORKAROUND=True I get:
Curious if others get the same results.
If so, would it make sense to add something similar to my workaround code as part of unsubscribe() or whatnot to allow these things to go down more smoothly?
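For instance, something roughly like this (a sketch only, assuming unsubscribe() delegates to close(); not the current library code):

```python
# Hypothetical tweak to Watch.unsubscribe(); not the current library code.
def unsubscribe(self):
    self.close()
    # Drop the user-supplied callback so the Watch no longer pins user
    # objects (and any cycles through them) once the listener is stopped.
    self._snapshot_callback = None
```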