-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add logging to TimeoutHandler #1335
Conversation
Hi @fraenkel. Thanks for your PR. I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ok-to-test
The following is the coverage report on pkg/.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fraenkel SGTM but needs a rebase
The reconcile tests were causing panics or tripping race detection because the timeout handler was using the *testing.T methods after the test was already marked done. The solution is to use loggers that are not tied to the testing framework. The downside is that you may see logs from prior tests intermixed but that is usually just the one associated with stopping the timeout.
The following is the coverage report on pkg/.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vdemeester The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hey @fraenkel , I'm super late to the party here but if you happen to remember:
Are you sure that the problem is specific to the testing logger? I am under the impression that this race condition would happen in normal controller execution as well, since the same logger would be used across goroutines as well. |
The original code used the zaptest Logger which wrote to t.Logf and may call t.Fail. |
Hey @fraenkel thanks for the update! I think that explains why the tests would catch a race condition; however I'm wondering if working around the race condition in the test logging functions means we are now hiding an existing race condition. When the timeout handler is used in the controller, it is passed in a logger object such that I believe every go routine invoked by the timeout handler will be using the same logger: This is where the logger is passed in:
From the context via: So while updating the tests means there is no race condition in the tests, im concerned that in the actual controller, we're still using a non-thread safe logger object and writing to it across mulitple go routines - and further by removing the possibility of having a race in the tests, i think we might be covering that up? |
@bobcatfish I wouldn't say we are hiding a race. Most loggers are thread safe. t.Log however is a special purpose logger that is tied to a different lifecycle. If you really want to use a testing.T underneath the logger, you need to make a stronger gesture to stop all users of the logger before the test completes. This is a pretty tall task but could be valuable. |
Hey @fraenkel, unfortunately as described in #1307 the logger we are using isn't threadsafe :( uber-go/zap#99 as you can see in the example there, this actually results in interleaved logs when the same logger is used across goroutines; and afaik, when the controller actually runs, it will be using the same logger across goroutines. |
The zap logger is safe for concurrent use, https://godoc.org/go.uber.org/zap#Logger. |
Looking into the issue I linked above, it looks like the zap logger used to not be threadsafe, but that was fixed ages ago (2017?) and we've been using versions that included this fix all along. Thanks for the explanation @fraenkel |
Fixes #1307
Changes
The reconcile tests were causing panics or tripping race detection
because the timeout handler was using the *testing.T methods after the
test was already marked done.
The solution is to use loggers that are not tied to the testing
framework. The downside is that you may see logs from prior tests
intermixed but that is usually just the one associated with stopping the
timeout.
Submitter Checklist
These are the criteria that every PR should meet, please check them off as you
review them:
See the contribution guide for more details.
Double check this list of stuff that's easy to miss:
cmd
dir, please updatethe release Task and TaskRun to build and release this image
Reviewer Notes
If API changes
are included, additive changes
must be approved by at least two OWNERS
and backwards incompatible changes
must be approved by more than 50% of the OWNERS,
and they must first be added
in a backwards compatible way.