[BUG] Salt exit with bound method Client.__del__ of <etcd.client.Client object at ignored #57196
Comments
@pankajghadge thank you for the report. Any details about your configuration related to etcd?
@DmitryKuzmenko I think this issue is not related to the etcd service.
When I run the same command directly from the minion, it always works.
By the way, what etcd configuration do you need? Specify the command if possible.
One strange thing I have observed: when I restart salt-master, the error doesn't appear for 1-2 days, then it starts coming back.
One more update:
Error: this command doesn't show the error, but the minions change on every run
@pankajghadge thank you for the additional details.
@DmitryKuzmenko
cache: etcd
etcd.host: remote_host
Remote etcd server version:
etcd systemctl.service
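For reference, a minimal master configuration enabling the etcd minion data cache might look like the sketch below; only cache: etcd and etcd.host appear in this thread, while etcd.port and etcd.protocol are assumptions matching python-etcd's usual defaults, and remote_host is a placeholder.

cache: etcd
etcd.host: remote_host   # address of the external etcd server (placeholder)
etcd.port: 2379          # assumed default
etcd.protocol: http      # assumed default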
Hello @DmitryKuzmenko, I work with Pankaj. We use etcd as the backend for the minion data cache to synchronize mine data between all masters and syndics.
@DmitryKuzmenko Any findings or progress regarding this issue?
Sorry, I stopped at setting up the environment and switched to another task. Now I'm back on this.
I still cannot reproduce this. But I have found an upstream bug in
@pankajghadge are you able to try it?
@DmitryKuzmenko
I've just re-read our whole discussion. Could you please detail a few more things?
This log is from
I think the
In every salt-master configuration we have an entry for the salt master of masters (multi-master setup). We have in total:
Detailed trace log:
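For context on the multi-master/syndic layout described above, the usual Salt topology points each syndic at the master of masters; a minimal sketch of the relevant config keys (hostnames are placeholders, not taken from this thread):

# /etc/salt/master on the master of masters
order_masters: True

# /etc/salt/master on each syndic node
syndic_master: master-of-masters.example.com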
@pankajghadge thank you for the information. I still cannot reproduce it on my side.
@pankajghadge, maybe a stupid question, but did you recently switch from Python 2 to Python 3?
@DmitryKuzmenko, this is a scoping issue and is based on Python's GC, so it won't be easy to reproduce unless your interpreter's garbage collection is aggressive. You might be able to reproduce it by explicitly deleting the etcd client once it has gone out of scope, and then calling
You can likely consider it an upstream bug, though, as python-etcd should not be trying to access anything in the object while it is being deleted, because members of said object aren't guaranteed to exist at deletion time (an antipattern).
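A minimal synthetic sketch of the antipattern described above (not Salt or python-etcd code): an attribute used in __del__ is not guaranteed to exist when the object is finalized, for example during garbage collection at interpreter shutdown, and CPython reports the failure as "Exception ignored in: <bound method Client.__del__ of <... object at 0x...>>", matching the message in this issue's title.

import io

class Client:
    def __init__(self):
        self.http = io.BytesIO()   # stand-in for a connection/pool object

    def __del__(self):
        # Assumes self.http still exists at finalization time -- it may not.
        self.http.close()

c = Client()
del c.http   # simulate the attribute being torn down before the object itself
del c        # __del__ raises AttributeError; CPython prints it and moves on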
@arizvisa thank you for the comment. I understand the possible reasons for the issue, and I've done a number of tests trying to reproduce it, including small synthetic tests playing around with destruction of the etcd client. The best way to fix it is to rewrite the etcd client to be a context manager, but that's a big enough change in the upstream project. On the other hand, there is an etcd3 client that is already written this way. We could port our code to it, which looks better from my point of view, but it is not a two-line patch because the etcd3 API is different.
Agreed. It should definitely be a context manager, or even persisted. Yeah, actually, I've been trying to get my rewrite of the etcd returner merged into master since saltstack was split up into the "develop" branch like 2 years ago (#51363, with another addition in PR #55186). Right now it's stuck in the "Merge ports into master" project, but once that finally gets merged into the "master" branch, I'd be willing to re-implement the logic for the etcd returner to use the etcd3 API.
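As an illustration of the context-manager idea discussed above, a thin wrapper could bound the client's lifetime explicitly; this is only a sketch that assumes nothing about python-etcd beyond the etcd.Client constructor and calls no close() method.

import contextlib

import etcd

@contextlib.contextmanager
def etcd_client(**kwargs):
    # Build a client for the duration of the with-block; because no module
    # global keeps it alive, it can be collected shortly after use instead
    # of at interpreter shutdown.
    client = etcd.Client(**kwargs)
    try:
        yield client
    finally:
        del client   # drop the wrapper's reference deterministically

# hypothetical usage:
# with etcd_client(host='127.0.0.1', port=2379) as c:
#     c.read('/some/key')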
No, we didn't switch to Python 3.
@DmitryKuzmenko I don't know how to patch the etcd client Python package.
Hey @DmitryKuzmenko, do you think it would be safe to assign the connection globally (in the module) so that it doesn't go out of scope, or is there some way to attach the etcd connection object somewhere so it doesn't get gc'd? Or do you think that it would just be safer to explicitly call
@arizvisa the connection is already global in the cache module. Maybe that's the issue. We can probably try to move it out of the global scope, but this will produce a performance regression. It's probably worth trying to keep the client in the
@pankajghadge could you try to patch salt on the master node where you run the salt command with this patch and try to reproduce the issue again?
diff --git a/salt/cache/etcd_cache.py b/salt/cache/etcd_cache.py
index 97d43edd82..da7a33d399 100644
--- a/salt/cache/etcd_cache.py
+++ b/salt/cache/etcd_cache.py
@@ -67,7 +67,7 @@ if HAS_ETCD:
etcd._log.setLevel(logging.INFO) # pylint: disable=W0212
log = logging.getLogger(__name__)
-client = None
+etcd_kwargs = None
path_prefix = None
# Module properties
@@ -90,9 +90,10 @@ def __virtual__():
def _init_client():
'''Setup client and init datastore.
'''
- global client, path_prefix
- if client is not None:
- return
+ global etcd_kwargs, path_prefix
+
+ if etcd_kwargs is not None:
+ return etcd.Client(**etcd_kwargs)
etcd_kwargs = {
'host': __opts__.get('etcd.host', '127.0.0.1'),
@@ -117,13 +118,14 @@ def _init_client():
except etcd.EtcdKeyNotFound:
log.info("etcd: Creating dir %r", path_prefix)
client.write(path_prefix, None, dir=True)
+ return client
def store(bank, key, data):
'''
Store a key value.
'''
- _init_client()
+ client = _init_client()
etcd_key = '{0}/{1}/{2}'.format(path_prefix, bank, key)
try:
value = __context__['serial'].dumps(data)
@@ -138,7 +140,7 @@ def fetch(bank, key):
'''
Fetch a key value.
'''
- _init_client()
+ client = _init_client()
etcd_key = '{0}/{1}/{2}'.format(path_prefix, bank, key)
try:
value = client.read(etcd_key).value
@@ -157,7 +159,7 @@ def flush(bank, key=None):
'''
Remove the key from the cache bank with all the key content.
'''
- _init_client()
+ client = _init_client()
if key is None:
etcd_key = '{0}/{1}'.format(path_prefix, bank)
else:
@@ -195,7 +197,7 @@ def ls(bank):
Return an iterable object containing all entries stored in the specified
bank.
'''
- _init_client()
+ client = _init_client()
path = '{0}/{1}'.format(path_prefix, bank)
try:
return _walk(client.read(path))
@@ -211,7 +213,7 @@ def contains(bank, key):
'''
Checks if the specified bank contains the specified key.
'''
- _init_client()
+ client = _init_client()
etcd_key = '{0}/{1}/{2}'.format(path_prefix, bank, key)
try:
r = client.read(etcd_key)
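In short, the patch caches only etcd_kwargs at module scope and builds a fresh etcd.Client for each cache operation, so no long-lived client object is left for __del__ to run on during interpreter shutdown; the trade-off is re-creating the client on every call, which is the performance concern mentioned earlier.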
Hello @DmitryKuzmenko, I just applied your patch to a master and that solves the problem.
With the patch:
@malletp thank you! Just one thing: to test the change I have to fully cover the module with tests =) I'll do it some day. |
@DmitryKuzmenko: Also, swap memory is 0 on the salt master and minions.
Description
After upgrading the salt master and minion to 2019.2.4, we started getting an error.
We upgraded due to a CVE issue.
This error occurs after several executions of salt runs.
When we restart the salt-master, the error goes away for a while, but after a few hours it starts appearing again.
Setup
(Please provide relevant configs and/or SLS files (be sure to remove sensitive info).)
Steps to Reproduce the behavior
(Include debug logs if possible and relevant)
salt 'target' cmd.run 'ls -l /var' -l debug
Warning in the minion log:
For a simple salt run, it occurs after 10-15 continuous salt calls from the master.
Expected behavior
The error must not appear; instead it should print the expected output.
Screenshots
If applicable, add screenshots to help explain your problem.
Versions Report
salt --versions-report
(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)
Additional context
Add any other context about the problem here.