-
Notifications
You must be signed in to change notification settings - Fork 253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix broken thread safety by widening critical section on ConnectionPool #447
Fix broken thread safety by widening critical section on ConnectionPool #447
Conversation
Previously, the equality between cache.key and new_key was evaluated outside the critical section of `caches.synchronize`. When these 2 keys are evaluated as different by concurrent processes, caches[cache.key] is deleted more than once and `nil` is assigned to caches[new_key] after 1 thread passed critical section. This patch fix the issue by evaluating equality inside the critical section of `caches.synchronize`.
Generated by 🚫 Danger |
7549c6b
to
918cfd8
Compare
Hi @aeroastro thanks for the contribution. It looks like a totally reasonable change and I'm sorry you had (I expect a tricky) bug to diagnose. Thank you for taking the time. With a racy piece of code like this, where the placement of that check looks quite innocuous, I would strongly prefer a test for that, but I also sympathise with how difficult simulating racy conditions can be in general. @mattbrictson @will-in-wi how do you feel about it ? |
This looks great! Thanks for working on such a tricky bug. I'd love a test, but such tests are so hard to write that I'd be okay merging this without one. I'm not sure how I would even approach such a test… |
The only way I could imagine testing for this is with very invasive runtime patching to force a race condition, I'd prefer just to merge as-is. @mattbrictson if you disagree, we can swing around and try and write a test for this somehow? |
Thank you for merging this ❤️ |
Update ruby-sshkit package to 1.20.0. ## [1.20.0][] (2019-08-03) * [#468](capistrano/sshkit#468): Make `upload!` take a `:verbosity` option like `exec` does - [@grosser](https://github.com/grosser) ## [1.19.1][] (2019-07-02) * [#465](capistrano/sshkit#456): Fix a regression in 1.19.0 that prevented `~` from being used in Capistrano paths, e.g. `:deploy_to`, etc. - [@grosser](https://github.com/grosser) ## [1.19.0][] (2019-06-30) * [#455](capistrano/sshkit#455): Ensure UUID of commands are stable in logging - [@lazyatom](https://github.com/lazyatom) * [#453](capistrano/sshkit#453): `as` and `within` now properly escape their user/group/path arguments, and the command nested within an `as` block is now properly escaped before passing to `sh -c`. In the unlikely case that you were manually escaping commands passed to SSHKit as a workaround, you will no longer need to do this. See [#458](capistrano/sshkit#458) for examples of what has been fixed. - [@grosser](https://github.com/grosser) * [#460](capistrano/sshkit#460): Handle IPv6 addresses without port - [@will-in-wi](https://github.com/will-in-wi) ## [1.18.2][] (2019-02-03) * [#448](capistrano/sshkit#448): Fix misbehaving connection eviction loop when disabling connection pooling - [Sebastian Cohnen](https://github.com/tisba) ## [1.18.1][] (2019-01-26) * [#447](capistrano/sshkit#447): Fix broken thread safety by widening critical section - [Takumasa Ochi](https://github.com/aeroastro)
Bug Detail
Previously, the equality between cache.key and new_key
was evaluated outside the critical section of
caches.synchronize
.When these 2 keys are evaluated as different by concurrent processes,
caches[cache.key] is deleted more than once and
nil
is assigned tocaches[new_key] after 1 thread passed critical section.
When
SSHKit::Backend::ConnectionPool#close_connections
is invoked at the end of deployment, this pooled
nil
connection eventuallyresults in
NoMethodError
whenclear
is called.sshkit/lib/sshkit/backends/connection_pool.rb
Line 78 in 749337b
How to Fix
This patch fix the issue by evaluating equality
inside the critical section of
caches.synchronize
.Test
In my environment this patch have fixed the problem.
Although I believe existent tests here are sufficient to prove the validity of this patch,
I will consider adding test for simulating race condition if really required.