-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consume job_explanation
from runner, fix error reporting error
#13482
Conversation
This should go in along-side of ansible/ansible-runner#1186 |
What I originally tested this against was the runner change: @@ -68,6 +73,7 @@ class Worker(object):
_output = sys.stdout.buffer
self._input = _input
self._output = _output
+ self._output.write(b'surprise from alan!\n')
self.kwargs = kwargs
self.job_kwargs = None But I've realized there is a shortcoming for a different class of errors. Consider: diff --git a/ansible_runner/streaming.py b/ansible_runner/streaming.py
index 608b972..df14bfc 100644
--- a/ansible_runner/streaming.py
+++ b/ansible_runner/streaming.py
@@ -68,6 +68,7 @@ class Worker(object):
_output = sys.stdout.buffer
self._input = _input
self._output = _output
+ raise Exception('alan!')
self.kwargs = kwargs
self.job_kwargs = None This results in a bad user experience: But! As per some speculative data related to #13469, we know that receptor may not be fully functional. So we might need the As I've stewed on this problem, I don't see a way out other than a special-case for the JSON parse error from the processing code. So I think I have an idea of how to move forward - which involves combining all of the data we can possibly manage, and a special string comparison for that specific message. |
Error handling is still very broken without #13494 fixed, so I'm not sure what I'm going to do now. |
But I did adjust my ansible-runner PR to gracefully handle the Exception case I mentioned before. I think both of these are good changes, but it's still hard to get quality output from error cases. |
@@ -437,9 +437,9 @@ def _run_internal(self, receptor_ctl): | |||
lines = resultfile.readlines() | |||
receptor_output = b"".join(lines).decode() | |||
if receptor_output: | |||
self.task.runner_callback.delay_update(result_traceback=receptor_output) | |||
self.task.runner_callback.delay_update(result_traceback=f'Worker output:\n{receptor_output}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My plan here, now, is that ansible-runner will write the line to job_explanation
. By doing that, we might echo the output twice in two places. Because of that, we need some better labeling of what data we're showing, so I added this stuff.
@@ -437,9 +437,9 @@ def _run_internal(self, receptor_ctl): | |||
lines = resultfile.readlines() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recent discussion is that I might delete resultsock.setblocking(False)
5f27f77
to
10567b3
Compare
I added tests to my related ansible-runner PR. This is one of the highest priority items I have to make sure we get in, but I don't want to merge it until the runner change lands (because don't know what the UX of that would be). |
a001a96
to
45c3244
Compare
I have test cases up here: https://github.com/AlanCoding/bad-execution-environments Let me review those... The "traceback" test is not tested right now due to linked receptor issue ansible/receptor#736 ending_linestarting_lineartifacts |
45c3244
to
6780bfe
Compare
job_explanation
from runner, fix error reporting error
6780bfe
to
1569d44
Compare
Add extra formatting to error messages for clarity Fix nonblocking related tracebacks, add logging
1569d44
to
15a42ee
Compare
SUMMARY
An update of #12089, as this failure in error handling has been reflected in recent cases, and in user reports. This seems to account for the "Job terminated due to error" reports.
This modifies some of what was introduced in #12961, which is the source of the bug.
ISSUE TYPE
COMPONENT NAME