Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Salt python client "get_full_returns" seems return data from incorrect jid #50927

Closed
Yukinoshita-Yukino opened this issue Dec 20, 2018 · 9 comments
Assignees
Labels
Bug broken, incorrect, or confusing behavior Confirmed Salt engineer has confirmed bug/feature - often including a MCVE fixed-pls-verify fix is linked, bug author to confirm fix
Milestone

Comments

@Yukinoshita-Yukino
Copy link

Versions:

salt-master: 2018.3.2
salt-minion: 2015.5.10 (not sure if this old version of minion is causing this problem)
python salt client: 2018.3.0 (pip install salt==2018.3.0)

Problem:

Seems like when I call "get_full_returns" when sending out async cmd(cmd_async) I got some return data not from the jid I was given by the cmd_async call.

Steps:

Let's say I have two Python program running individually, using different Python venv, program A and program B.
They are both sending cmd to three servers, server 1, 2, and 3.
Program A sends out cmd by using async cmd

# Program A
client = salt.client.LocalClient()
jid = client.cmd_async(target, 'cmd.run', [cmd, 'shell="/bin/bash"', 'runas="{}"'.format(runas)])
rv = client.get_full_returns(jid, [target], timeout=180)

Program B just sends out cmd using list mode

# Program B
client = salt.client.LocalClient()
ret = client.cmd(target_list, 'cmd.run', [cmd, 'shell="/bin/bash"', 'runas="root"'], tgt_type='list')

Assuming program A is running job on server 1 with jid 100, program B is running job on server 1, 2, 3.
When I call something like rv = client.get_full_returns(100, ['server 1'], timeout=180) in program A, I got return from job triggered by program B on server 1.

Any idea is this due to the old version of salt-minion or misuse of the client call or is it a bug?
I have checked the release note from 2018.3.1 to 2018.3.3 and did't find something that seems related to this problem.

@Ch3LL
Copy link
Contributor

Ch3LL commented Dec 20, 2018

is the program B producing the same JID as program A?

@Ch3LL Ch3LL added the info-needed waiting for more info label Dec 20, 2018
@Ch3LL Ch3LL added this to the Blocked milestone Dec 20, 2018
@Yukinoshita-Yukino
Copy link
Author

Yukinoshita-Yukino commented Dec 20, 2018

is the program B producing the same JID as program A?

I checked the JID by looking into the debug log from salt and I believe no, they don't have the same JID.

This problem occurs quite often in my situation, as program B is sending out cmd at the frequence around 90calls/2min. In program A each time I call get_full_returns there's around (or maybe higher) 50% chance I got some return from the job created by program B. That's quite a abnormally high chance for 10+digits JIDs I think.

BTW, there's also a program C, which is also sending out cmds

# Program C
client = salt.client.LocalClient()
ret = client.cmd(target_list, 'cmd.run', [cmd, 'shell="/bin/bash"', 'runas="root"'])

And program C works just fine. Therefore this problem seems only occur when using cmd_async or get_full_returns I guess.

@Ch3LL
Copy link
Contributor

Ch3LL commented Dec 21, 2018

thanks for clearing that up. I was able to replicate this and here was my test case:

➜  salt git:(ad0d83c6a8) ✗ cat ~/programa.py 
import salt.client
client = salt.client.LocalClient()
target = 'the_cake_is_a_lie'
cmd = 'echo testa'
jid = client.cmd_async(target, 'cmd.run', [cmd, 'shell="/bin/bash"'])
rv = client.get_full_returns(jid, [target], timeout=180)
print(rv)
➜  salt git:(ad0d83c6a8) ✗ cat ~/programb.py 
import salt.client
target_list = ['the_cake_is_a_lie']
cmd = 'echo testb'
client = salt.client.LocalClient()
ret = client.cmd(target_list, 'cmd.run', [cmd, 'shell="/bin/bash"', 'runas="root"'], tgt_type='list')
print(ret)

I then ran programb like so:
for i in {1..100}; do sudo python2 ~/programb.py; done
and then ran programa and sometimes got programb's return.

I believe I was able to narrow it down to get_event_iter_returns here: https://github.com/saltstack/salt/blob/v2018.3.3/salt/client/__init__.py#L1664

the only thing it does with the jid is check if it exists here:https://github.com/saltstack/salt/blob/v2018.3.3/salt/client/__init__.py#L1677

and then we just try to get the event off of the event bus, which would make sense as to why you sometimes see the return from program b, because i'm not seeing any sort of jid filtering here. raw does have the jid information in it so I'm not sure as to why we aren't filtering to make sure we are getting hte correct event.

ping @saltstack/team-core does anyone know why we wouldn't try to filter on jid here? If not i'll submit a PR to fix this issue.

@Ch3LL Ch3LL added Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged and removed info-needed waiting for more info labels Dec 21, 2018
@Ch3LL Ch3LL self-assigned this Dec 21, 2018
@thatch45
Copy link
Contributor

Yes, we should filter on the jid, that is at least how I remember originally writing it

@stale
Copy link

stale bot commented Jan 9, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Jan 9, 2020
@waynew waynew added Bug broken, incorrect, or confusing behavior Confirmed Salt engineer has confirmed bug/feature - often including a MCVE fixed-pls-verify fix is linked, bug author to confirm fix and removed Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged labels Jan 15, 2020
@stale
Copy link

stale bot commented Jan 15, 2020

Thank you for updating this issue. It is no longer marked as stale.

3 similar comments
@stale
Copy link

stale bot commented Jan 15, 2020

Thank you for updating this issue. It is no longer marked as stale.

@stale
Copy link

stale bot commented Jan 15, 2020

Thank you for updating this issue. It is no longer marked as stale.

@stale
Copy link

stale bot commented Jan 15, 2020

Thank you for updating this issue. It is no longer marked as stale.

@stale stale bot removed the stale label Jan 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Confirmed Salt engineer has confirmed bug/feature - often including a MCVE fixed-pls-verify fix is linked, bug author to confirm fix
Projects
None yet
Development

No branches or pull requests

5 participants