Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chacha20: AVX2 detection #200

Merged
merged 1 commit into from
Dec 10, 2020
Merged

chacha20: AVX2 detection #200

merged 1 commit into from
Dec 10, 2020

Conversation

tarcieri
Copy link
Member

Automatically detects the availability of AVX2 on x86(_64) architectures based on CPUID, and falls back to the SSE2" backend if unavailable.

@tarcieri tarcieri requested a review from newpavlov December 10, 2020 17:33
@tarcieri tarcieri force-pushed the chacha20/avx2-detection branch from c8a52b7 to 94a31e2 Compare December 10, 2020 17:42
Copy link
Member

@newpavlov newpavlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this code is correct. You can not assume that SSE2 is available without checking it either via cfgs or via runtime detection. It should be quite easy to add a second token which would check SSE2 feature and will use the software fallback if it's not available for some reason. For most builds it will be replaced by true, so compiler should remove the fallback code for them. Ideally we would use "CPUID match", but it's not implemented yet, so we have no choice for now.

@tarcieri tarcieri force-pushed the chacha20/avx2-detection branch from 94a31e2 to 1767c76 Compare December 10, 2020 17:57
@tarcieri
Copy link
Member Author

tarcieri commented Dec 10, 2020

@newpavlov I'm trying to find a reference for it, but I believe the sse2 target feature is enabled-by-default on x86(_64) targets.

Otherwise I suppose I can gate on it.

Edit: added explicit target_feature gating in 220e346

@tarcieri tarcieri force-pushed the chacha20/avx2-detection branch from 1767c76 to 220e346 Compare December 10, 2020 18:00
@newpavlov
Copy link
Member

newpavlov commented Dec 10, 2020

Yes, it is, thus my comment about second token being replaced by true at compile time for most builds. But nothing prevents users from using target-feature=-sse2.

UPD: Yeah, adding the sse2 gating is another approach, though I prefer the two-token one. But it does not have to be added in this PR. I may add it later.

@tarcieri
Copy link
Member Author

tarcieri commented Dec 10, 2020

It should be quite easy to add a second token which would check SSE2 feature and will use the software fallback if it's not available for some reason.

Under what conditions would that target feature be absent, short of someone explicitly going out of their way to make it happen?

I added explicit target feature gating which handles this case the same way force-soft would. That seems like a much simpler solution.

@tarcieri tarcieri force-pushed the chacha20/avx2-detection branch 3 times, most recently from bb8544d to 9566953 Compare December 10, 2020 18:13
Automatically detects the availability of AVX2 on x86(_64) architectures
based on CPUID, and falls back to the SSE2" backend if unavailable.
@tarcieri tarcieri force-pushed the chacha20/avx2-detection branch from 9566953 to 2b1ab7a Compare December 10, 2020 18:16
@tarcieri tarcieri merged commit 3ce69d1 into master Dec 10, 2020
@tarcieri tarcieri deleted the chacha20/avx2-detection branch December 10, 2020 18:20
@tarcieri tarcieri mentioned this pull request Apr 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants