[Auditbeat] Cherry-pick #10942 to 7.x: User dataset: Numerous fixes to error handling #11191
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Cherry-pick of PR #10942 to 7.x branch. Original message:
When testing the
user
dataset on Fedora 29, I noticed it's not working. Digging into it, there were a few issues, all of which this PR fixes:errno
is going to be zero on success (see here for some detail). Instead, we should check the return value first, and if that indicates that there might have been an error (usually when it'sNULL/nil
) we should check the error return.If one wants to check errno after the call, it should be set to zero before the call.
errno
cannot be accessed directly in Go code, so we introduce a helper functionsetErrno
for that.getspent(3)
does not have the same warning, but given that at least glibc uses the same code path for both, it's reasonable to assume the same applies. So we seterrno
to0
before calling both functions.errno
is thread-local, and the function familiessetpwent/endpwent/getpwent
andsetspent/endspent/getspent
are not thread-safe, so we introduceruntime.LockOSThread()/UnlockOSThread()
to make sure all C functions are run on the same OS thread.getpwent()
should returnNULL
anderrno
should be zero when all entries have been returned. However, there is a bug in systemd that causes it to seterrno
toENOENT
even when there is no error (nss-systemd's getpwent sets errno to ENOENT systemd/systemd#9585). This bug affects at least Fedora 29. Following this changeENOENT
is treated as if there is no error. It's not supposed to be a valid error value forgetpwent()
anyway, so should not happen anytime outside this bug.user
dataset to return no users, even though all users had been read successfully. This PR changes to gathering errors in multierrors and returning them alongside whatever data could be read. This is in line with wanting to make the System module more resilient to non-fatal failures during data collection (better to return some things alongside an error than no things at all).WARN/ERROR
messages in the log. This change can probably also be made for the other datasets.Writing up this PR description I'm realizing this combines quite a few issues. I could split it into multiple PRs if anybody wants, though most would have only a few lines of changed code. Because some indentation changed it's best to check Github's
No Whitespace
button.