-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[3.6] Reduce performance regression in RSA public operations #9281
Conversation
Attempt to partially solve the performance regression in 3.6.0 without adding too much code size. Signed-off-by: Manuel Pégourié-Gonnard <[email protected]>
Initial performance measurements on my laptop (64-bit Intel): 3.6.0
This PR
3.5.1
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach looks good to me, this should eliminate all the overhead. One opinion about the API: maybe it would be worth to keep the _optionally_safe()
function static and have an _unsafe()
version in the header instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok (with a few minor remarks) if we really have to have mixed-leakiness functions, but do we really have to do that? I find it harder to follow the code and reason about the function, and that's with a single function, I fear it won't scale to more complex algorithms.
include/mbedtls/bignum.h
Outdated
* } | ||
* not the other way round, in order to prevent misuse. (This is, if a value | ||
* other than the two below is passed, default to the safe path.) */ | ||
#define MBEDTLS_MPI_IS_PUBLIC 0x2a2a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some constants can be loaded with fewer instructions than others. We should pick a constant that's short to load on Thumb. I think 0x2a2a can be improved on. This page has some guidance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0x2a2a is a 16 bit value and can be loaded with a movw
. That supposed to be a single cycle, how can we improve on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanesca That's new in v7. And on second thoughts that blog post only covers 32-bit Arm instructions, but when thinking of code size, we care about Thumb instructions. So this is more relevant:
In Thumb instructions, constant can be:
- any constant that can be produced by shifting an 8-bit value left by any number of bits within a 32-bit word
- any constant of the form 0x00XY00XY
- any constant of the form 0xXY00XY00
- any constant of the form 0xXYXYXYXY.
That covers 0x002a002a and 0x2a2a2a2a and 0x0000002a and 0x00002a00 and more but not 0x00002a2a.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm that for Thumb2 the comparisons are encoded as single instructions, e.g.
16: f1b1 3f2a cmp.w r1, #707406378 ; 0x2a2a2a2a
include/mbedtls/bignum.h
Outdated
* \return Another negative error code on different kinds of failures. | ||
* | ||
*/ | ||
int mbedtls_mpi_exp_mod_optionally_safe(mbedtls_mpi *X, const mbedtls_mpi *A, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to have a single function for both cases, I would prefer a more specific naming convention than optionally_safe
. I don't like the use of “safe”, which could refer to many different things. I'd prefer leaky or “public”.
library/bignum_core.h
Outdated
* It is up to the caller to zeroize \p T when it is no | ||
* longer needed, and before freeing it if it was dynamically | ||
* allocated. | ||
* \param[in] E_public Set to MBEDTLS_MPI_IS_PUBLIC to gain some performance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the complexity of having functions whose security properties depend on a runtime argument. Neither P256-m nor BearSSL do that. I would prefer to have separate code paths for leaky and non-leaky, converging only on functions that have a single implementation either way. The only argument I can think of to merge the two is code size. Is the gain really worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not happy about it either, but I think now that we are turning on USE_PSA_CRYPTO by default, we will be pretty desperate for code size savings for a while. I think the gain can worth it and we should try.
No, I don't think we absolutely have to do that, and that's one of the main points about which I was hoping to get feedback. I just has to make a temporary choice for prototyping.
Yes, I think that would already be better.
Agreed. Do you think it's acceptable for
That, and also there can be a maintenance cost if we end up duplicating source code. I can give it a try to do some size measurements and check how easy or awkward it is to avoid duplication. |
I'd prefer to avoid it, but for smaller functions with a limited scope, it's not so bad. |
With this PR (and the necessary changes to make use of the
|
Just checking: @gilles-peskine-arm @yanesca will you be volunteering to review this once I've updated it, or should I look for another reviewer? (I'd appreciate if at least one of you could give at least a design review.) |
I intend to review if I'm around (I'll take about a week off around 15 August). |
I am happy to review it as well, but I will be taking some days off too (this Friday and next Wednesday). |
Signed-off-by: Janos Follath <[email protected]>
The complexity of having functions whose security properties depend on a runtime argument can be dangerous. Limit misuse by making any such functions local. Signed-off-by: Janos Follath <[email protected]>
The complexity of having functions whose security properties depend on a runtime argument can be dangerous. Limit risk by isolating such code in small functions with limited scope. Signed-off-by: Janos Follath <[email protected]>
These macros are not part of any public or internal API, ideally they would be defined in the source files. The reason to put them in bignum_core.h to avoid duplication as macros for this purpose are needed in both bignum.c and bignum_core.c. Signed-off-by: Janos Follath <[email protected]>
In Thumb instructions, constant can be: - any constant that can be produced by shifting an 8-bit value left by any number of bits within a 32-bit word - any constant of the form 0x00XY00XY - any constant of the form 0xXY00XY00 - any constant of the form 0xXYXYXYXY. Signed-off-by: Janos Follath <[email protected]>
There is still work needed, just pushing what I have so far to get an early CI overnight. |
It is easier to read if the parameter controlling constant timeness with respect to a parameter is next to that parameter. Signed-off-by: Janos Follath <[email protected]>
The allocated size can be significantly larger than the actual size. In the unsafe case we can use the actual size and gain some performance. Signed-off-by: Janos Follath <[email protected]>
The new test hooks allow to check whether there was an unsafe call of an optionally safe function in the codepath. For the sake of simplicity the MBEDTLS_MPI_IS_* macros are reused for signalling safe/unsafe codepaths here too. Signed-off-by: Janos Follath <[email protected]>
Signed-off-by: Janos Follath <[email protected]>
Signed-off-by: Janos Follath <[email protected]>
Reverted to 878af12. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve the library changes for the release freeze today, after fixing the CI (plus the changelog entry while you're at it). I haven't reviewed the tests, but I'm satisfied that they're non-harmful.
This is conditional on doing the following as soon as possible, in any case before the end of the quarter:
- Have the tests reviewed and approved (by me or someone else).
- Improving the test hooks
- Make initialization robust
Co-authored-by: Gilles Peskine <[email protected]> Signed-off-by: Janos Follath <[email protected]>
To silence no previous prototype warnings. And this is the proper way to do it anyway. Signed-off-by: Janos Follath <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve the library changes at 5f31697 for the release freeze today. I haven't reviewed the tests, but I'm satisfied that they're non-harmful.
This is conditional on doing the following as soon as possible, in any case before the end of the quarter:
- Have the tests reviewed and approved (by me or someone else).
- Improving the test hooks
- Make initialization robust
Signed-off-by: Janos Follath <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with the reservations in #9281 (review)
I've pressed the merge button so that we can get on with the release. We need a forward-port to |
df0ef8a
Description
Fix #9232 - partially, the low-hanging fruits (also those that don't make the code size go up by too much).
Status: work in progress.
TODO:
PR checklist