bpf, verifier: improve signed ranges inference for BPF_AND #8090
Closed
shunghsiyu wants to merge 3 commits into kernel-patches:bpf-next_base from shunghsiyu:bpf-next/bitwise-and-signed-ranges-v2
Conversation
Improve BPF verifier's inference of signed ranges. Deduce signed ranges directly from the signed ranges of the operands by doing

    dst_reg->smin_value = negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value))
    dst_reg->smax_value = max(dst_reg->smax_value, src_reg->smax_value)

See below for the complete explanation. The improvement is needed to prevent verifier rejection of BPF programs like the one presented by Xu Kuohai:

    SEC("lsm/bpf_map")
    int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode)
    {
        if (map != (struct bpf_map *)&data_input)
            return 0;

        if (fmode & FMODE_WRITE)
            return -EACCES;

        return 0;
    }

where the relevant verifier log upon rejection is:

    ...
    5: (79) r0 = *(u64 *)(r1 +8)          ; R0_w=scalar() R1=ctx()
    ; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32
    6: (67) r0 <<= 62                     ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000))
    7: (c7) r0 s>>= 63                    ; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
    ; @ test_libbpf_get_fd_by_id_opts.c:0
    8: (57) r0 &= -13                     ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))
    9: (95) exit

This sequence of instructions comes from Clang's transformation in the DAGCombiner::SimplifySelectCC() method, which combined the "fmode & FMODE_WRITE" check with the return statement without needing BPF_JMP at all. See Eduard's comment for more detail on this transformation[0].

While the verifier can correctly infer that the value of r0 is in a tight [-1, 0] range after instruction "r0 s>>= 63", it was not able to come up with a tight range for "r0 &= -13" (which would be [-13, 0]), and instead inferred a very loose [S64_MIN, 0x7ffffffffffffff3]:

    r0 s>>= 63; R0_w=scalar(smin=smin32=-1,smax=smax32=0)
    r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3))

The reason is that scalar*_min_max_and() mainly relies on tnum for inferring the bounds of the register after BPF_AND; however, [-1, 0] cannot be tracked precisely with tnum and effectively turns into [0, -1] (i.e. tnum_unknown). So upon BPF_AND the resulting tnum is equivalent to

    dst_reg->var_off = tnum_and(tnum_unknown, tnum_const(-13))

and from there the BPF verifier was only able to infer smin=S64_MIN and smax=0x7ffffffffffffff3, which is outside of the allowed [-4095, 0] range for return values, and thus the program was rejected.

To allow verification of such an instruction pattern, update scalar*_min_max_and() to infer signed ranges directly from the signed ranges of the operands. For BPF_AND, the resulting value always gains more unset '0' bits, thus it only moves towards 0x0000000000000000. The difficulty lies in how to deal with signs. While a non-negative (positive or zero) value simply grows smaller, a negative number can grow smaller, but may also "underflow" and become a larger value.

To better address this situation we split the signed ranges into the negative-range and non-negative-range cases, ignoring the mixed-sign cases for now, and first consider only how to calculate smax_value.

Since negative range & negative range preserves the sign bit, we know the result is still a negative value; thus it only moves towards S64_MIN but never underflows, so a safe bet for smax is the value in the two ranges that is closest to 0, i.e. max(dst_reg->smax_value, src_reg->smax_value).
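As a quick sanity check of that rule with made-up ranges (the values here are illustrative only, not taken from the patch), suppose dst is in [-8, -3] and src is in [-6, -2], so max(dst->smax, src->smax) == -2:

    -3 & -2 == 0xfffffffffffffffd & 0xfffffffffffffffe == 0xfffffffffffffffc == -4   /* <= -2 */
    -8 & -6 == 0xfffffffffffffff8 & 0xfffffffffffffffa == 0xfffffffffffffff8 == -8   /* <= -2 */

Every such pair stays negative and never exceeds -2: the AND can only clear bits, so it only moves the value further towards S64_MIN, never past the larger of the two smax values.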
For negative range & non-negative range the sign bit is always cleared, thus we know the resulting value is non-negative and only moves towards 0, so a safe bet for smax is the smax_value of the non-negative range.

Last but not least, non-negative range & non-negative range is still a non-negative value that only moves towards 0; as in the unsigned-range case, the maximum is actually capped by the lesser of the two, and thus min(dst_reg->smax_value, src_reg->smax_value).

Listing out the above reasoning as a table (dst_reg abbreviated as dst, src_reg abbreviated as src, smax_value abbreviated as smax) we get:

                            |                      src_reg
              smax = ?      +---------------------------+---------------------------
                            |         negative          |        non-negative
    ---------+--------------+---------------------------+---------------------------
             | negative     | max(dst->smax, src->smax) | src->smax
     dst_reg +--------------+---------------------------+---------------------------
             | non-negative | dst->smax                 | min(dst->smax, src->smax)

However this is quite complicated, and could use some simplification given the following observations:

    max(dst_reg->smax_value, src_reg->smax_value) >= src_reg->smax_value
    max(dst_reg->smax_value, src_reg->smax_value) >= dst_reg->smax_value
    max(dst_reg->smax_value, src_reg->smax_value) >= min(dst_reg->smax_value, src_reg->smax_value)

So we can do a safe substitution, replacing all cells of the table above with max(...), and arrive at:

                            |                      src_reg
             smax' = ?      +---------------------------+---------------------------
        smax'(r) >= smax(r) |         negative          |        non-negative
    ---------+--------------+---------------------------+---------------------------
             | negative     | max(dst->smax, src->smax) | max(dst->smax, src->smax)
     dst_reg +--------------+---------------------------+---------------------------
             | non-negative | max(dst->smax, src->smax) | max(dst->smax, src->smax)

Meaning that simply using max(dst_reg->smax_value, src_reg->smax_value) to calculate the resulting smax_value works across all sign combinations (while being a bit imprecise in certain cases).

For smin_value, we know that both non-negative range & non-negative range and negative range & non-negative range result in a non-negative value, so an easy guess is to use the minimum value of the non-negative range, i.e. 0:

                            |                      src_reg
              smin = ?      +----------------------------+---------------------------
                            |         negative           |        non-negative
    ---------+--------------+----------------------------+---------------------------
             | negative     | ?                          | 0
     dst_reg +--------------+----------------------------+---------------------------
             | non-negative | 0                          | 0

That leaves the negative range & negative range case to be considered. We know that negative range & negative range always yields a negative value, so a preliminary guess would be S64_MIN. However, that guess is too imprecise to help with the "r0 <<= 62; r0 s>>= 63; r0 &= -13" pattern we're trying to deal with here.

Further improvement comes with the observation that for negative range & negative range, the smallest possible value must be one that keeps the longest _common_ run of most-significant set '1' bits, so we can use min(dst_reg->smin_value, src_reg->smin_value) as the starting point, as the smaller value is the one with the shorter run of most-significant set '1' bits.

But that alone is not enough: we do not know whether the rest of the bits will be set, so the safest guess is one that clears all bits after the most-significant run of set '1' bits, something akin to bit_floor(), but rounding towards a negative power of 2 instead.
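One way to picture such a helper is the minimal sketch below; it is an illustration of the idea rather than the exact helper added by this patch, and it assumes the kernel's fls64() from <linux/bitops.h> (1-based position of the most-significant set bit, 0 when no bit is set):

    static s64 negative_bit_floor(s64 v)
    {
        /* number of bit positions below v's leading run of set '1'
         * bits; for v >= 0 the top bit of ~v is set, so this is 64
         */
        u8 bits = fls64(~v);

        /* non-negative input: every bit must be cleared; return 0
         * directly to avoid the undefined 1ULL << 64 below
         */
        if (bits > 63)
            return 0;

        /* keep the leading run of '1' bits, clear everything below */
        return ~((1ULL << bits) - 1);
    }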
A few example values:

    negative_bit_floor(0xffff000000000003) == 0xffff000000000000
    negative_bit_floor(0xfffffb0000000000) == 0xfffff80000000000
    negative_bit_floor(0xffffffffffffffff) == 0xffffffffffffffff /* -1 remains unchanged */
    negative_bit_floor(0x0000fb0000000000) == 0x0000000000000000 /* non-negative values become 0 */

With negative range & negative range solved, we now have:

                            |                      src_reg
              smin = ?      +----------------------------+----------------------------
                            |         negative           |        non-negative
    ---------+--------------+----------------------------+----------------------------
             | negative     | negative_bit_floor(        | 0
             |              |  min(dst->smin, src->smin))|
     dst_reg +--------------+----------------------------+----------------------------
             | non-negative | 0                          | 0

This can also be simplified with some observations (quadrants refer to the cells in the table above, numbered starting from the top-right cell -- I -- and going counter-clockwise):

    A. min(dst_reg->smin_value, src_reg->smin_value) <  0  /* dst negative & src non-negative, quadrant I */
    B. min(dst_reg->smin_value, src_reg->smin_value) <  0  /* dst non-negative & src negative, quadrant III */
    C. min(dst_reg->smin_value, src_reg->smin_value) >= 0  /* dst non-negative & src non-negative, quadrant IV */
    D. negative_bit_floor(x) s<= x  /* for any x, negative_bit_floor(x) is always smaller than or equal to the original value */
    E. negative_bit_floor(y) == 0   /* when y is non-negative, i.e. y >= 0, the most-significant bit is unset, so every bit of the result is unset */

Thus we can derive

    negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value)) <  0  /* combine A and D, where dst negative & src non-negative */
    negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value)) <  0  /* combine B and D, where dst non-negative & src negative */
    negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value)) == 0  /* combine C and E, where dst non-negative & src non-negative */

Substituting the quadrant I, III, and IV cells in the table above with negative_bit_floor(min(...)), we arrive at:

                            |                      src_reg
             smin' = ?      +----------------------------+----------------------------
        smin'(r) <= smin(r) |         negative           |        non-negative
    ---------+--------------+----------------------------+----------------------------
             | negative     | negative_bit_floor(        | negative_bit_floor(
             |              |  min(dst->smin, src->smin))|  min(dst->smin, src->smin))
     dst_reg +--------------+----------------------------+----------------------------
             | non-negative | negative_bit_floor(        | negative_bit_floor(
             |              |  min(dst->smin, src->smin))|  min(dst->smin, src->smin))

Meaning that simply using negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value)) to calculate the resulting smin_value works across all sign combinations.

Together these allow us to infer the signed range of the result of a BPF_AND operation using the signed ranges of its operands. While this also seemingly removes the propagation of unsigned ranges into signed ranges, such propagation still happens later within __reg_deduce_bounds(), so there should be no loss of bound information.

Also, revert the changes introduced in commit 229d6db ("selftests/bpf: Workaround strict bpf_lsm return value check.") previously added to work around such verifier rejection.

[0] https://lore.kernel.org/bpf/[email protected]/

Link: https://lore.kernel.org/bpf/phcqmyzeqrsfzy7sb4rwpluc37hxyz7rcajk2bqw6cjk2x7rt5@m2hl6enudv7d/
Acked-by: Eduard Zingerman <[email protected]>
Reviewed-by: Edward Cree <[email protected]>
Tested-by: Harishankar Vishwanathan <[email protected]>
Signed-off-by: Shung-Hsi Yu <[email protected]>
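To tie the two halves together, here is a compact illustration of the deduction derived above (the function name and standalone form are made up for this write-up; negative_bit_floor() is the helper sketched earlier, and the actual patch applies the same formulas to the verifier's struct bpf_reg_state rather than loose parameters):

    /* Illustration only: derive the signed 64-bit bounds of (dst & src)
     * from the operands' signed bounds, per the tables above.
     */
    static void and_signed_bounds(s64 dst_smin, s64 dst_smax,
                                  s64 src_smin, s64 src_smax,
                                  s64 *res_smin, s64 *res_smax)
    {
        s64 smin = dst_smin < src_smin ? dst_smin : src_smin;   /* min() */

        *res_smin = negative_bit_floor(smin);
        *res_smax = dst_smax > src_smax ? dst_smax : src_smax;  /* max() */
    }

Fed the ranges from the rejected program above -- r0 in [-1, 0] after "r0 s>>= 63" and the constant -13 -- this yields negative_bit_floor(-13) == -16 for smin and max(0, -13) == 0 for smax, i.e. [-16, 0], comfortably inside the allowed [-4095, 0] range for return values mentioned earlier.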
Add back 'test 3' from Xu Kuohai, which was previously removed because the BPF verifier's signed range deduction was not precise enough. With the last patch in this series applied, this test no longer fails.

Link: https://lore.kernel.org/bpf/CAADnVQJ2bE0cAp8DNh1m6VqphNvWLkq8p=gwyPbbcdopaKcCCA@mail.gmail.com/
Signed-off-by: Xu Kuohai <[email protected]>
Signed-off-by: Shung-Hsi Yu <[email protected]>
shunghsiyu force-pushed the bpf-next/bitwise-and-signed-ranges-v2 branch from 7c8075d to bb5b470 on November 18, 2024 12:19
kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 3113082 to 6c2b6b1 on November 19, 2024 02:21
shunghsiyu force-pushed the bpf-next/bitwise-and-signed-ranges-v2 branch 10 times, most recently from b5921d8 to 39c384f on November 19, 2024 08:06
…BPF_AND WIP
Signed-off-by: Shung-Hsi Yu <[email protected]>
shunghsiyu force-pushed the bpf-next/bitwise-and-signed-ranges-v2 branch from 39c384f to c0def4a on November 19, 2024 08:31
kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 11 times, most recently from ec2c148 to abdc8e6 on November 25, 2024 22:37
Automatically cleaning up stale PR; feel free to reopen if needed