-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCP: Add'l memory sanitization fixes (was: SCP: Initialize de-dup'ed debug line buffer.) #327
Conversation
This change exists to accommodate a weakness in your analyzer tool and not a bug in the code. The contents of these buffers are only referenced based on other variables which manage the state of the buffer. |
Mark -- I cordially invite you to tell Google and Apple that their memory sanitizer tester sucks eggs. Must be something in the |
I didn't say it sucks eggs, you did. I'll leave any reporting up to you. By an equivalent logic, those buffers are growing as needed, and using precisely the same "precise analysis", why isn't it complaining about the region of those buffers that have grown, which is equally uninitialized. |
The sanitizer trips on the first |
cf27dda
to
59ba2d8
Compare
Squashed the two commits and re-pushed forcefully. |
So do the "sanitizer" complain about realloc with null input pointer, but not when the input pointer is non-null? In either case, any newly allocated space is uninitialized. |
@pkoning2: The sanitizers (address, memory, thread) all produce a traceback that tells you where the issue occurred. The sanitizer builds use My basic point: Clang's sanitizers have a good provenance, why not address the traceback outputs? I strongly doubt that Google's sanitizers produce many false positives, given the number of developers who use them. (Note: I didn't say "can't", "won't", or "don't".) The sanitizers have been part and parcel of Clang for a while (ca. 2012 for address) and there's a kernel address sanitizer that's regularly used on the Linux kernel. Wikipedia page here. Here's the culprit exercized (exorcised?) in the
The code, where I've intentionally set
|
No, it's not the "+ 1". |
The comparison that is happening there "might in theory" compare against uninitialized data if a second line being considered was longer than the first, BUT such a comparison would fail when the EOL character in the earlier line was compared against the character at the same offset in the new buffer and since the new line is longer than the prior line, the comparison would fail on that EOL compare against a non EOL character and nothing beyond the original line's EOL in the buffer would be referenced. The +1 in the comparison and in the buffer copy assure that the EOL gets compared and moved. Here the failure "theory" that could trigger the fault being raised is that all of the data in the memcmp() may not be initialized, but the logic in the code knows that the compare will stop without referencing uninitialized memory. |
I think the error in the memcmp is that it is comparing what is actually two strings, but as memory blocks using the length of the second string as the length to compare. One wonders why strcmp isn't used there, that should avoid the problem entirely. |
The Clang memory sanitizer detected the uninitialized space in Four solutions to the problem:
My preference is (3). It avoids the weird case of traipsing past |
Actually, it DOES NOT fail, it merely is flagged by the sanitizer you are trying to accommodate. As I said previously in precisely this case, the much smaller debug_line_buf_last has a newline at the end of that smaller range, while the current debug_line_buf doesn't have a newline until some 108 bytes out. The comparison will correctly return a non-zero value after touching at most the size of the previous data before (and including) the newline. Feel free to make any change you want so that the sanitizer is happy. The unmodified code is correct without any of your various changes that might make the sanitizer happy. |
What about my suggestion -- which is that strcmp is the obvious correct answer since we're comparing null-terminated strings. |
The current logic is precisely equivalent to strcmp() with a newline as the string terminator. The data in question is in a buffer that may be presented in a larger piece with multiple lines of data to be processed. Trying to use strcmp(), you'd have to modify the contents of the buffer to NUL terminate the strings and restore it after the test. As I said, since you're deciding on what's good to commit, feel free to make any changes needed to make the tool which triggers on a false positive here happy. Has this tool revealed any other actual issues in any simulator? |
But there is a null terminator in the buffer, too, right?
Still, I can see the point about this being a false positive, since memcmp should not read beyond the first difference. Perhaps the argument is that this isn't actually stated explicitly? If not, then approach (2) seems like a good one. |
The whole buffer may indeed be null terminated, but the relevant comparison in the memcmp that is relevant is the from the point specified up to and including the newline. Point 2's info is precisely determined by the newline in the buffer. Keeping track of that fact separately is at best a work around to accommodate the false positive in the tool, and that alone may not satisfy the tool's analysis since the buffer in question has may have uninitialized data beyond the useful data. I ask again: Has this tool reported any other actual problems? |
I can see why this code is failing, however I don't like the solution. The buffer will be initialized the first time it is used. Since there will be nothing in the original buffer. I would recommend an if (buffer == NULL) { buffer = (char *)malloc(size); *buffer = '\0'; } else { buffer = realloc(buffer, newsize); } |
Still, I can see the point about this being a false positive, since memcmp should not read beyond the first difference.
That's what I expected too, but I looked it up in the actual C standard
text (ok ok, in a draft version of C17 that ought to be substantially
identical to the paid-for version;
https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf
found via
https://stackoverflow.com/questions/81656/where-do-i-find-the-current-c-or-c-standard-documents
) and it says this:
7.24.4 Comparison functions
1 The sign of a nonzero value returned by the comparison functions memcmp,
strcmp, and strncmp is determined by the sign of the difference between the
values of the first pair of characters (both interpreted as unsigned char) that
differ in the objects being compared.
7.24.4.1 The memcmp function
Synopsis
1 #include <string.h>
int memcmp(const void *s1, const void *s2, size_t n);
Description
2 The memcmp function compares the first n characters of the object pointed to
by s1 to the first n characters of the object pointed to by s2. footnote 315)
Returns
3 The memcmp function returns an integer greater than, equal to, or less than
zero, accordingly as the object pointed to by s1 is greater than, equal to, or
less than the object pointed to by s2.
315) The contents of "holes" used as padding for purposes of alignment
within structure objects are indeterminate. Strings shorter than their
allocated space and unions may also cause problems in comparison.
Note how it mentions "the first n characters of", and doesn't say
anything about short-circuiting when a difference is found. The footnote
315 seems to reinforce this.
So my conclustion is now that the report from the tool is NOT a false
positive in the strict sense of the C standard, although many
implementations PROBABLY do a short-circuit. Although I can pretty
much guarantee that memcmp() would actually compare words at a time (or
even larger units) out of efficiency considerations, and therefore may
read past the initialized part of the buffer. NetBSD/vax certainly does:
cmpl (%r1)+,(%r2)+ # no "cmpq" alas
see
https://github.com/NetBSD/src/blob/trunk/lib/libc/arch/vax/string/memcmp.S
|
59ba2d8
to
3079168
Compare
Updated commit:
@Rhialto, @rcornwell: I took a quick look at the |
For reference, the inline |
Initialize de-dup'ed debug line buffer: realloc(NULL, size) == malloc(size), which is uninitialized space. This causes the Clang memory sanitizer to detect an attempt to read uninitialized memory when debug_line_buf and debug_line_buf_last are different lengths. While the uninitialized space may never actually be compared, the memory sanitizer emits a strong hint to not do stupid. The sanitizer trips in the i650 simulator on the first memcmp(), debug_line_buf has 108 characters, debug_line_buf_last has 56 characters (uninitialized space follows the 56 characters, tripping the sanitizer.) - memset() debug_line_buf and debug_line_buf_last to zero so that memcmp() will always gracefully return non-zero if somehow memcmp() ends up going past the end of either buffer. Should never happen in practice, but theory always gets mugged by reality. - Keep track of debug_line_buf_last's comparison length (i.e., up to the '\r') and only execute memcmp() when this length equals the current debug_line_buf comparison length (end - endprefix + 1). - Added a log deduplication test to "testlib" command to ensure that nothing broke as a result of this fix. Network ACL check in sim_addr_acl_check: The memory sanitizer found an off-by-one bug in sim_addr_acl_check while executing "testlib". This makes CIDR network ACLs functional, e.g., "127.0.0.1/32" is interpreted properly and the associated "testlib" test passes.
3079168
to
feb6f85
Compare
realloc(NULL, size) == malloc(size), which results in an uninitialized debug line buffer and will fail memory sanitizer checks. Ensure that the buffers are memset() to zero when initially allocated.