-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[analyzer] First batch of patches for the Juliet benchmark for taint improvements #66074
Conversation
@llvm/pr-subscribers-clang-static-analyzer-1 ChangesSee the motivation here: https://discourse.llvm.org/t/patches-inspired-by-the-juliet-benchmark/73106 I've checked all these 4 commits on a large set of projects, and they - surprisingly - don't show any report differences besides noise.
|
@llvm/pr-subscribers-clang ChangesSee the motivation here: https://discourse.llvm.org/t/patches-inspired-by-the-juliet-benchmark/73106 I've checked all these 4 commits on a large set of projects, and they - surprisingly - don't show any report differences besides noise.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are small and straightforward improvements, they mostly LGTM.
Note that in the commit message of "[analyzer] Fix stdin declaration in C++ tests" the first line contains a typo ("shoulkd").
On the commit "[analyzer] Fix taint sink rules for exec-like functions" I think that the old behavior might be intentional: it reports situations when malicious actors may influence which executable will be ran, but does not report situations when the arguments passed to a (presumably trusted) executable contain user input.
Of course, there are situations where tainted arguments are also very problematic (e.g. calling sudo
, sh
or similar "execute something" programs), and I can accept this as a policy change, but I wouldn't call it a "Fix" of outright buggy behavior.
On the meta level, I'd also remark that this "four commits in one batch" review is cumbersome (at least for me -- I'm not accustomed to github gui), and I'd prefer to see separate pull requests for unrelated commits if that's not a serious problem for you.
In summary, I'd say that:
- [analyzer] Make socket accept() propagate taint and [analyzer] Propagate taint for wchar variants of some APIs definitely LGTM; feel free to push them separately.
- On [analyzer] Fix stdin declaration in C++ tests I have an inline question but the change is acceptable as it's now.
- On [analyzer] Fix taint sink rules for exec-like functions I'd wait a bit to see the opinion of others about handling the
exec...
functions in a stricter manner.
I'll check for typos. Thanks. I should really set up grammarly for vim - where I mostly write my commit messages.
I'd lean towards that this was just a mistake in the past.
I feel you. However, given that the commits touch the same file(s) (checker & test) it would pose challenges to apply the diffs out-of-order, because they likely would conflict in the git diff contexts.
I'll split off the |
The `stdin` declaration should be within `extern "C" {...}`, in C++ mode. In addition, it should be also marked `extern` in both C and C++ modes. I tightened the check to ensure we only accept `stdin` if both of these match. However, from the Juliet test suite's perspective, this commit should not matter.
This allows to track taint on real code from `socket()` to reading into a buffer using `recv()`.
Functions like `fgets`, `strlen`, `strcat` propagate taint. However, their `wchar_t` variants don't. This patch fixes that. Notice, that there could be many more APIs missing. This patch intends to fix those that so far surfaced, instead of exhaustively fixing this issue.
63b6a37
to
8b80d80
Compare
Sorry about the force push, but it was necessary to split off the
I'll re-run the measurement to ensure everything is all right. |
Yup, no report changes. Confirmed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
The `stdin` declaration should be within `extern "C" {...}`, in C++ mode. In addition, it should be also marked `extern` in both C and C++ modes. I tightened the check to ensure we only accept `stdin` if both of these match. However, from the Juliet test suite's perspective, this commit should not matter. #66074
This allows to track taint on real code from `socket()` to reading into a buffer using `recv()`. #66074
Functions like `fgets`, `strlen`, `strcat` propagate taint. However, their `wchar_t` variants don't. This patch fixes that. Notice, that there could be many more APIs missing. This patch intends to fix those that so far surfaced, instead of exhaustively fixing this issue. #66074
Merged manually, to allow landing these commits separately. |
Fix build bot: https://lab.llvm.org/buildbot/#/builders/139/builds/49699 clang/test/Analysis/taint-generic.c: ``` Line 100: redefinition of typedef 'size_t' is a C11 feature Line 59: previous definition is here ``` This commit fixups 61924da Committed in this PR: #66074
The `stdin` declaration should be within `extern "C" {...}`, in C++ mode. In addition, it should be also marked `extern` in both C and C++ modes. I tightened the check to ensure we only accept `stdin` if both of these match. However, from the Juliet test suite's perspective, this commit should not matter. llvm#66074
This allows to track taint on real code from `socket()` to reading into a buffer using `recv()`. llvm#66074
Functions like `fgets`, `strlen`, `strcat` propagate taint. However, their `wchar_t` variants don't. This patch fixes that. Notice, that there could be many more APIs missing. This patch intends to fix those that so far surfaced, instead of exhaustively fixing this issue. llvm#66074
Fix build bot: https://lab.llvm.org/buildbot/#/builders/139/builds/49699 clang/test/Analysis/taint-generic.c: ``` Line 100: redefinition of typedef 'size_t' is a C11 feature Line 59: previous definition is here ``` This commit fixups 61924da Committed in this PR: llvm#66074
The `stdin` declaration should be within `extern "C" {...}`, in C++ mode. In addition, it should be also marked `extern` in both C and C++ modes. I tightened the check to ensure we only accept `stdin` if both of these match. However, from the Juliet test suite's perspective, this commit should not matter. llvm#66074
This allows to track taint on real code from `socket()` to reading into a buffer using `recv()`. llvm#66074
Functions like `fgets`, `strlen`, `strcat` propagate taint. However, their `wchar_t` variants don't. This patch fixes that. Notice, that there could be many more APIs missing. This patch intends to fix those that so far surfaced, instead of exhaustively fixing this issue. llvm#66074
Fix build bot: https://lab.llvm.org/buildbot/#/builders/139/builds/49699 clang/test/Analysis/taint-generic.c: ``` Line 100: redefinition of typedef 'size_t' is a C11 feature Line 59: previous definition is here ``` This commit fixups 61924da Committed in this PR: llvm#66074
See the motivation here: https://discourse.llvm.org/t/patches-inspired-by-the-juliet-benchmark/73106
I've checked all these 3 commits on a large set of projects, and they - surprisingly - don't show any report differences besides noise. (EDIT: I'll recheck the measurement, to be sure.)
I plan to land these 3 commits individually (aka. I don't plan to squash them).