Consume `job_explanation` from runner, fix error reporting error #13482

AlanCoding · 2023-01-27T22:22:14Z

SUMMARY

An update of #12089, since this failure in error handling has shown up in recent cases and in user reports. It seems to account for the "Job terminated due to error" reports.

This modifies some of what was introduced in #12961, which is the source of the bug.

ISSUE TYPE

Bug, Docs Fix or other nominal change

COMPONENT NAME

API

AlanCoding · 2023-01-27T22:24:09Z

This should go in alongside ansible/ansible-runner#1186

AlanCoding · 2023-01-31T18:51:13Z

What I originally tested this against was the runner change:

@@ -68,6 +73,7 @@ class Worker(object):
             _output = sys.stdout.buffer
         self._input = _input
         self._output = _output
+        self._output.write(b'surprise from alan!\n')
 
         self.kwargs = kwargs
         self.job_kwargs = None

But I've realized there is a shortcoming for a different class of errors. Consider:

diff --git a/ansible_runner/streaming.py b/ansible_runner/streaming.py
index 608b972..df14bfc 100644
--- a/ansible_runner/streaming.py
+++ b/ansible_runner/streaming.py
@@ -68,6 +68,7 @@ class Worker(object):
             _output = sys.stdout.buffer
         self._input = _input
         self._output = _output
+        raise Exception('alan!')
 
         self.kwargs = kwargs
         self.job_kwargs = None

This results in a bad user experience:

But! As per some speculative data related to #13469, we know that receptor may not be fully functional. So we might need the readline data in some cases, and the full (or truncated) results in other cases.

As I've stewed on this problem, I don't see a way out other than a special case for the JSON parse error from the processing code. So I think I have an idea of how to move forward, which involves combining all of the data we can possibly manage, plus a special string comparison for that specific message.
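A minimal sketch of that idea, not the code in this PR: collect whatever error data is available, and use a plain string comparison to decide when the full worker output should be attached. The marker text, the function names, and the truncation limit are all assumptions made for illustration.

```python
# Assumed wording of the parse-error message; the exact text is not
# quoted in this thread, so treat this constant as a placeholder.
JSON_PARSE_MARKER = "Failed to JSON parse a line"


def is_json_parse_error(job_explanation: str) -> bool:
    """Special string comparison for the one message we treat differently."""
    return JSON_PARSE_MARKER in (job_explanation or "")


def combine_error_data(job_explanation: str, first_line: str,
                       full_output: str, max_chars: int = 1000) -> str:
    """Combine all of the error data we can manage into one string.

    For the JSON parse error, the raw worker output is likely the real
    story, so attach it (truncated). Otherwise fall back to the single
    line that was read from the stream, if there is one.
    """
    parts = []
    if job_explanation:
        parts.append(job_explanation)
    if is_json_parse_error(job_explanation) and full_output:
        parts.append(full_output[:max_chars])
    elif first_line:
        parts.append(first_line)
    return "\n".join(parts)
```

The string comparison is brittle by design: if the message wording changes upstream, the special case silently stops matching, which is the trade-off accepted here.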

AlanCoding · 2023-01-31T21:27:48Z

Error handling is still very broken without #13494 fixed, so I'm not sure what I'm going to do now.

AlanCoding · 2023-01-31T21:29:03Z

But I did adjust my ansible-runner PR to gracefully handle the Exception case I mentioned before. I think both of these are good changes, but it's still hard to get quality output from error cases.

AlanCoding · 2023-01-31T21:30:30Z

awx/main/tasks/receptor.py

@@ -437,9 +437,9 @@ def _run_internal(self, receptor_ctl):
                        lines = resultfile.readlines()
                        receptor_output = b"".join(lines).decode()
                    if receptor_output:
-                        self.task.runner_callback.delay_update(result_traceback=receptor_output)
+                        self.task.runner_callback.delay_update(result_traceback=f'Worker output:\n{receptor_output}')


My plan now is for ansible-runner to write the line to job_explanation. That means we might echo the same output in two places, so we need better labeling of what data we're showing. That is why I added the "Worker output:" prefix here.
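As a rough illustration of the labeling concern, the sketch below builds the two fields side by side. Only the "Worker output:" prefix and the `job_explanation` and `result_traceback` field names come from this thread; the helper name and the dict return value are hypothetical.

```python
def build_error_fields(job_explanation: str, receptor_output: str) -> dict:
    """Return the fields to save, labeling the raw worker output.

    The same text can legitimately appear in both fields, so the label
    tells the reader which copy came straight from the worker.
    """
    fields = {}
    if job_explanation:
        fields["job_explanation"] = job_explanation
    if receptor_output:
        # Same prefix as the diff above.
        fields["result_traceback"] = f"Worker output:\n{receptor_output}"
    return fields
```

With the prefix in place, a duplicated line reads as "the explanation" in one field and "what the worker printed" in the other, rather than as an unexplained repeat.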

AlanCoding · 2023-02-02T18:32:49Z

awx/main/tasks/receptor.py

@@ -437,9 +437,9 @@ def _run_internal(self, receptor_ctl):
                        lines = resultfile.readlines()


Recent discussion is that I might delete resultsock.setblocking(False).
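For context on why a non-blocking socket can produce tracebacks, here is a standalone sketch of general Python socket behavior. It does not reproduce the receptor code path, and the thread does not say which traceback was actually seen; the function name and payload are made up.

```python
import socket


def demo_nonblocking_read():
    """Show a non-blocking recv failing, then a blocking read to EOF."""
    reader, writer = socket.socketpair()
    try:
        # Non-blocking mode: reading before any data has arrived raises
        # BlockingIOError instead of waiting for the writer.
        reader.setblocking(False)
        try:
            reader.recv(1024)
            raised = False
        except BlockingIOError:
            raised = True

        # Blocking mode: recv waits for data and returns b"" at EOF.
        reader.setblocking(True)
        writer.sendall(b"worker output\n")
        writer.close()
        chunks = []
        while True:
            chunk = reader.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
        return raised, b"".join(chunks)
    finally:
        reader.close()
        writer.close()  # closing an already-closed socket is harmless
```

Calling `demo_nonblocking_read()` shows the early read raising while the blocking read collects everything the writer sent before closing.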

AlanCoding · 2023-03-02T19:55:35Z

I added tests to my related ansible-runner PR. This is one of the highest-priority items I have to make sure we get in, but I don't want to merge it until the runner change lands (because I don't know what the UX would be without it).

AlanCoding · 2023-04-03T16:31:22Z

I have test cases up here:

https://github.com/AlanCoding/bad-execution-environments

Let me review those...

The "traceback" case is not tested right now because of the linked receptor issue, ansible/receptor#736.

ending_line

starting_line

artifacts


github-actions bot added the component:api label Jan 27, 2023

fosterseth approved these changes Jan 27, 2023


AlanCoding commented Jan 31, 2023


AlanCoding commented Feb 2, 2023


AlanCoding force-pushed the non_json_error branch from 5f27f77 to 10567b3 Compare February 9, 2023 18:42

AlanCoding force-pushed the non_json_error branch 2 times, most recently from a001a96 to 45c3244 Compare April 3, 2023 15:22

AlanCoding requested review from TheRealHaoLiu and gamuniz April 3, 2023 16:31

gamuniz reviewed Apr 3, 2023


gamuniz approved these changes Apr 3, 2023


AlanCoding self-assigned this Apr 13, 2023

AlanCoding force-pushed the non_json_error branch from 45c3244 to 6780bfe Compare June 5, 2023 17:12

AlanCoding changed the title ~~Consume job_explanation when given by ansible-runner~~ Consume job_explanation from runner, fix error reporting error Jun 6, 2023

AlanCoding changed the title ~~Consume job_explanation from runner, fix error reporting error~~ Consume job_explanation from runner, fix error reporting error Jun 6, 2023

AlanCoding force-pushed the non_json_error branch from 6780bfe to 1569d44 Compare August 1, 2023 14:12

Consume job_explanation when given by ansible-runner

15a42ee

Add extra formatting to error messages for clarity Fix nonblocking related tracebacks, add logging

AlanCoding force-pushed the non_json_error branch from 1569d44 to 15a42ee Compare August 23, 2023 19:01

AlanCoding merged commit cdb4f0b into ansible:devel Aug 30, 2023

djyasin pushed a commit to djyasin/awx that referenced this pull request Sep 16, 2024

Consume job_explanation from runner, fix error reporting error (ansible#13482)

77a5826

djyasin pushed a commit to djyasin/awx that referenced this pull request Nov 11, 2024

Consume job_explanation from runner, fix error reporting error (ansible#13482)

3d3d6b0
