-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SECLIB-667: Accelerate SHA-512 with A64 crypto extensions #5632
SECLIB-667: Accelerate SHA-512 with A64 crypto extensions #5632
Conversation
59bd283
to
f284765
Compare
uint32x4_t sched1 = vld1q_u32( (const uint32_t *)( msg + 16 * 1 ) ); | ||
uint32x4_t sched2 = vld1q_u32( (const uint32_t *)( msg + 16 * 2 ) ); | ||
uint32x4_t sched3 = vld1q_u32( (const uint32_t *)( msg + 16 * 3 ) ); | ||
uint32x4_t sched0 = (uint32x4_t) vld1q_u8( msg + 16 * 0 ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing it this way avoids increases required alignment
when compiled with -Wcast-align
For the record, yes, I confirm that I did give permission for that. Glad it's useful! |
6132fa0
to
2785928
Compare
For those wondering: my latest push removed a commit that added a |
Provide an additional pair of #defines, MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT and MBEDTLS_SHA512_USE_A64_CRYPTO_ONLY. At most one of them may be specified. If used, it is necessary to compile with -march=armv8.2-a+sha3. The MBEDTLS_SHA512_PROCESS_ALT and MBEDTLS_SHA512_ALT mechanisms continue to work, and are mutually exclusive with SHA512_USE_A64_CRYPTO. There should be minimal code size impact if no A64_CRYPTO option is set. The SHA-512 implementation was originally written by Simon Tatham for PuTTY, under the MIT licence; dual-licensed as Apache 2 with his kind permission. Signed-off-by: Tom Cosgrove <[email protected]>
Signed-off-by: Tom Cosgrove <[email protected]>
Was getting 'dd: unknown operand status' Signed-off-by: Tom Cosgrove <[email protected]>
2785928
to
226aca1
Compare
Minor update to the detection of which (Also rebased onto latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed this in the same manner as the SHA-256 PR (#5547).
My review has focused on the following areas:
Style and naming conventions --> all good
Self test --> passes
Unit tests --> passes
Compilation with different combinations of new config options --> Works as expected
Performance gains --> I compiled and ran on an Aarch64 machine and can report that the performance gain is around a 4.4x increase (for both new config options) when compared to not using the CE, which is worthwhile.
LGTM.
P.S. There were a couple of (IMO) missing full stops on some of the multi-line comments and I have highlighted this in my comments. This is so minor that I am not going to let it halt approval and I leave it to @tom-cosgrove-arm to fix if deemed worth it.
* an illegal instruction fault will occur). | ||
* | ||
* \note This allows builds with a smaller code size than with | ||
* MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing full stop. (To keep it consistent with how you have written the rest of the comments.)
#if defined(MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT) | ||
/* | ||
* Capability detection code comes early, so we can disable | ||
* MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT if no detection mechanism found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing full stop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Fixes #5029 |
Description
Provide an additional pair of #defines,
MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT
andMBEDTLS_SHA512_USE_A64_CRYPTO_ONLY
. At most one of them may be specified. If used, it is necessary to compile with-march=armv8.2-a+sha3
.The
MBEDTLS_SHA512_PROCESS_ALT
andMBEDTLS_SHA512_ALT
mechanisms continue to work, and are mutually exclusive withA64_CRYPTO
.There should be minimal code size impact if no A64_CRYPTO option is set.
Note that the Neon-accelerated implementation was originally written by Simon Tatham for PuTTY, under the MIT licence; it is dual-licensed as Apache 2 with his kind permission. I have restructured it to match the Mbed TLS implementation; any mistakes are mine, not his.
Status
READY
Requires Backporting
NO
Additional comments
The coding standards say "a preferred maximum line length of 80 characters" rather than "a maximum line length of 80 characters" - in a very few places I had exceeded 80 characters by a small amount to improve readability (especially where a sequence of consecutive lines are similar).
Steps to test or reproduce
Define either
MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT
orMBEDTLS_SHA512_USE_A64_CRYPTO_ONLY
in the configuration, and compile with-march=armv8.2-a+sha3
. *IF_PRESENT will work on systems (such as Raspberry Pi) that don't have the crypto extensions; *ONLY will work on most (other) Aarch64 systems.