Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better deal with "lost" connections for async Redis #2999

Merged
merged 4 commits into from
Oct 16, 2023

Conversation

kristjanvalur
Copy link
Contributor

@kristjanvalur kristjanvalur commented Oct 12, 2023

ConnectionPool keeps a WeakSet of in_use connections, allowing lost ones to be collected. Collection produces a warning and closes the underlying transport.

Pull Request check-list

Please make sure to review and check all of these items:

  • Do tests and lints pass with this change?
  • Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • Is the new or changed code fully tested?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?
  • Is there an example added to the examples folder (if applicable)?
  • Was the change added to CHANGES file?

NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.

Description of change

Recent changes, which removed the del handling of lost connections, have illustrated how fragile many applications
are. This PR attempts to improve on this, by better tracking "lost" connections.
See #2995 for various manifestations of this problem.

  • ConnectionPool objects now keep a WeakSet of in_use_connections. This allows the garbage collector to find such
    objects if they are not returned to the pool for some reason.
  • Connection objects now have a __del__ method which outputs a ResourceWarning in the same way as other IO classes
    in the standard library. For some reason, the underlying "stream" objects do not get garbage collected promptly. The
    Connection will perform a non-async close() call on these connections. This may or may not work, since there may or may
    not be an active event loop in place, but in that case, an error is emitted. In both cases, the developer should become aware
    of there being a problem.
  • Redis objects call _close() on their connection objects if they get garbage collected.

NOTE: To see the ResourceWarnings, they need to be enabled. By default they are suppressed. Add, for instance
the -W default::ResourceWarning to the python command line.

@kristjanvalur kristjanvalur force-pushed the kristjan/async-resource branch 3 times, most recently from 5a848e9 to eb6c632 Compare October 12, 2023 14:32
@codecov-commenter
Copy link

codecov-commenter commented Oct 12, 2023

Codecov Report

Attention: 15 lines in your changes are missing coverage. Please review.

Comparison is base (5391c5f) 91.36% compared to head (c497789) 91.35%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2999      +/-   ##
==========================================
- Coverage   91.36%   91.35%   -0.02%     
==========================================
  Files         126      126              
  Lines       32617    32652      +35     
==========================================
+ Hits        29802    29829      +27     
- Misses       2815     2823       +8     
Files Coverage Δ
redis/asyncio/client.py 91.99% <100.00%> (-0.15%) ⬇️
redis/asyncio/connection.py 83.90% <90.00%> (+0.24%) ⬆️
tests/test_asyncio/test_connection.py 89.45% <46.15%> (-4.55%) ⬇️

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ConnectionPool keeps a WeakSet of in_use connections, allowing lost ones to be collected.
Collection produces a warning and closes the underlying transport.
@kristjanvalur kristjanvalur force-pushed the kristjan/async-resource branch from eb6c632 to 8a6a114 Compare October 12, 2023 14:50
@kristjanvalur kristjanvalur marked this pull request as ready for review October 12, 2023 15:59
@kristjanvalur kristjanvalur marked this pull request as draft October 13, 2023 10:11
@kristjanvalur kristjanvalur force-pushed the kristjan/async-resource branch from 2ad2364 to e86bdce Compare October 13, 2023 14:56
@kristjanvalur kristjanvalur force-pushed the kristjan/async-resource branch from aed6e26 to c497789 Compare October 13, 2023 16:17
@kristjanvalur kristjanvalur marked this pull request as ready for review October 13, 2023 16:23
@dvora-h dvora-h added the maintenance Maintenance (CI, Releases, etc) label Oct 16, 2023
@dvora-h dvora-h merged commit 14be2da into redis:master Oct 16, 2023
@kristjanvalur kristjanvalur deleted the kristjan/async-resource branch November 10, 2023 14:19
dvora-h pushed a commit that referenced this pull request Feb 25, 2024
* Allow tracking/reporting and closing of "lost" connections.
ConnectionPool keeps a WeakSet of in_use connections, allowing lost ones to be collected.
Collection produces a warning and closes the underlying transport.

* Add tests for the __del__ handlers of async Redis and Connection objects

* capture expected warnings in the test

* lint
dvora-h added a commit that referenced this pull request Feb 26, 2024
* Bump actions/checkout from 3 to 4 (#2969)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rojopolis/spellcheck-github-actions from 0.33.1 to 0.34.0 (#2970)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix type hint (#2963)

Co-authored-by: d184230 <[email protected]>

* Don't perform blocking connect inside the BlockingConnectionQueue Condition variable. (#2997)

* Creating CODEOWNERS for the examples (#2993)

* Close various objects created during asyncio tests (#3005)

* Close various objects created during asyncio tests

* Fix resource leake in test_cwe_404.py
Need to wait for individual handler tasks when shutting down server.

* Linking to Redis resources (#3006)

* Add GEOSHAPE field type for index creation of RediSearch (#2957)

* first pass of geoshape index type

* first attempt at test, but demonstrates the initial commit is broken

* fix new field + fix tests

* work on linter

* more linter

* try to mark test with correct fixture

* fix linter

* Better deal with "lost" connections for async Redis (#2999)

* Allow tracking/reporting and closing of "lost" connections.
ConnectionPool keeps a WeakSet of in_use connections, allowing lost ones to be collected.
Collection produces a warning and closes the underlying transport.

* Add tests for the __del__ handlers of async Redis and Connection objects

* capture expected warnings in the test

* lint

* Update client.py sleep_time typing for run_in_thread function (#2977)

Changed from 
`sleep_time: int = 0`
to
`sleep_time: float = 0.0`
To avoid Pylance complaining: 
`Argument of type "float" cannot be assigned to parameter "sleep_time" of type "int" in function "run_in_thread"
  "float" is incompatible with "int"`

* Fix BlockingConnectionPool.from_url parsing of timeout in query args #2983 (#2984)

Co-authored-by: Romain Fliedel <[email protected]>

* Fix parsing resp3 dicts (#2982)

* Update ocsp.py (#3022)

* Bump rojopolis/spellcheck-github-actions from 0.34.0 to 0.35.0 (#3060)

Bumps [rojopolis/spellcheck-github-actions](https://github.com/rojopolis/spellcheck-github-actions) from 0.34.0 to 0.35.0.
- [Release notes](https://github.com/rojopolis/spellcheck-github-actions/releases)
- [Changelog](https://github.com/rojopolis/spellcheck-github-actions/blob/master/CHANGELOG.md)
- [Commits](rojopolis/spellcheck-github-actions@0.34.0...0.35.0)

---
updated-dependencies:
- dependency-name: rojopolis/spellcheck-github-actions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Use `disable_decoding` in async `read_response`. (#3042)

* Add "sum" to DUPLICATE_POLICY documentation of TS.CREATE, TS.ADD and TS.ALTER (#3027)

* Update advanced_features.rst (#3019)

* Fix typos. (#3016)

* Fix parsing of `FT.PROFILE` result (#3063)

* Fix parsing of ft.profile result

* test

* Make the connection callback methods public again, add documentation (#2980)

* Fix reported version of deprecations in asyncio.client (#2968)

* Allow the parsing of the asking command to forward original options (#3012)

Co-authored-by: dvora-h <[email protected]>

* Fix Specifying Target Nodes broken hyperlink (#3072)

The typo cause hyperlinks to fail.

* Fix return types in json commands (#3071)

* Fix return types in JSONCommands class

* Update CHANGES

* fix acl_genpass with bits (#3062)

* Bump github/codeql-action from 2 to 3 (#3096)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2 to 3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@v2...v3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump actions/upload-artifact from 3 to 4 (#3097)

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3 to 4.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v3...v4)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump actions/setup-python from 4 to 5 (#3095)

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4 to 5.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Always sending codecov (#3101)

* filter commits for main branch (#3036)

* fix(docs): organize cluster mode part of lua scripting (#3073)

* Fix typing for `HashCommand.hdel` (#3029)

Co-authored-by: dvora-h <[email protected]>

* Adding lock_name to LockError (#3023)

* Adding lock_name to LockError

* Adding lock_name to LockError

---------

Co-authored-by: dvora-h <[email protected]>

* Fix objlen type hint (#2966)

* Fix objlen type hint

* Update redis/commands/json/commands.py

Co-authored-by: dvora-h <[email protected]>

* linters

---------

Co-authored-by: dvora-h <[email protected]>

* Fix type hint of arbitrary argument lists (#2908)

* Fix: hset unexpectedly mutates the list passed to items (#3103)

* Fix possible pipeline connections leak (#3104)

* Update cluster.py

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released.

* Update cluster.py

* Update cluster.py

release connection move to "try...finally"

* Update cluster.py

 fix the linters

* fix problems of code review

* Add modules support to async RedisCluster (#3115)

* Fix grammer in BlockingConnectionPool class documentation (#3120)

Co-authored-by: ahmedabdou14 <root@xps>

* release already acquired connections on ClusterPipeline, when get_connection raises an exception (#3133)

Signed-off-by: zach.lee <[email protected]>

* Bump actions/stale from 3 to 9 (#3132)

Bumps [actions/stale](https://github.com/actions/stale) from 3 to 9.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v3...v9)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump codecov/codecov-action from 3 to 4 (#3131)

Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 3 to 4.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v3...v4)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Allow to control the minimum SSL version (#3127)

* Allow to control the minimum SSL version

It's useful for applications that has strict security requirements.

* Add tests for minimum SSL version

The commit updates test_tcp_ssl_connect for both sync and async
connections. Now it sets the minimum SSL version. The test is ran with
both TLSv1.2 and TLSv1.3 (if supported).

A new test case is test_tcp_ssl_version_mismatch. The test added for
both sync and async connections. It uses TLS 1.3 on the client side,
and TLS 1.2 on the server side. It expects a connection error. The
test is skipped if TLS 1.3 is not supported.

* Add example of using a minimum TLS version

* docs: Add timeout parameter for get_message example (#3129)

The `get_message()` method in asyncio PubSub has a `timeout` argument that defaults to 0.0, causing it to immediately return. This can cause high CPU usage with the given code example and should not be recommended. By setting `timeout=None`, it works with much more efficient resource usage.

* Revert stale isuue version update (#3142)

* Update connection.py (#3149)

Exception ignored in: <function Redis.__del__ at ...>
Traceback ....
TypeError: 'NoneType' object cannot be interpreted as an integer.

This happens when closing the connection within a spawned Process (multiprocess).

* Fix retry logic for pubsub and pipeline (#3134)

* Fix retry logic for pubsub and pipeline

Extend the fix from bea7299 to apply to
pipeline and pubsub as well.

Fixes #2973

* fix isort

---------

Co-authored-by: dvora-h <[email protected]>

* Update install_requires (#3109)

---------

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: zach.lee <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Artem Diubkov <[email protected]>
Co-authored-by: d184230 <[email protected]>
Co-authored-by: Kristján Valur Jónsson <[email protected]>
Co-authored-by: Chayim <[email protected]>
Co-authored-by: Shaya Potter <[email protected]>
Co-authored-by: Bosheng (Daniel) Zhang <[email protected]>
Co-authored-by: Romain <[email protected]>
Co-authored-by: Romain Fliedel <[email protected]>
Co-authored-by: Aniket Patil <[email protected]>
Co-authored-by: Stanisław Denkowski <[email protected]>
Co-authored-by: Pedram Parsian <[email protected]>
Co-authored-by: BackflipPenguin <[email protected]>
Co-authored-by: AYMEN Mohammed <[email protected]>
Co-authored-by: Zachary Ware <[email protected]>
Co-authored-by: Tyler Bream (Event pipeline) <[email protected]>
Co-authored-by: Binbin <[email protected]>
Co-authored-by: Pármenas Haniel <[email protected]>
Co-authored-by: Wei-Hsiang (Matt) Wang <[email protected]>
Co-authored-by: Dmitry Kulazhenko <[email protected]>
Co-authored-by: Dan Lousqui <[email protected]>
Co-authored-by: trkwyk <[email protected]>
Co-authored-by: SwordHeart <[email protected]>
Co-authored-by: Jason <[email protected]>
Co-authored-by: Ahmed Ashraf <[email protected]>
Co-authored-by: ahmedabdou14 <root@xps>
Co-authored-by: Dongkeun Lee <[email protected]>
Co-authored-by: poiuj <[email protected]>
Co-authored-by: Qiangning Hong <[email protected]>
Co-authored-by: wKollendorf <[email protected]>
Co-authored-by: Will Miller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Maintenance (CI, Releases, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants