Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further vectorisation of BitArray? #29661

Closed
Gnbrkm41 opened this issue May 25, 2019 · 7 comments
Closed

Further vectorisation of BitArray? #29661

Gnbrkm41 opened this issue May 25, 2019 · 7 comments
Labels
Milestone

Comments

@Gnbrkm41
Copy link
Contributor

I've noticed a couple things while strolling through the source of BitArray:

  • And, Or, Xor have Sse2 path however it does not have Avx2 path
  • Not is not vectorised at all

I tested using a relatively simple implementation (....I pretty much copy & pasted the Sse2 path and amended it for Avx2/Vector256 :^) ), and it looks like vectorising might have some positive effect on the time taken.

(See https://github.com/Gnbrkm41/BitArrayIntrinsicsBenchmark)
N is the size of the array (in bits), methods with _Opt suffix are the vectorised methods and _Unopt are the original methods.

Method N Mean Error StdDev Rank
And_Opt 8388608 33,938.916 ns 376.3072 ns 314.2336 ns 36
Or_Opt 8388608 34,029.502 ns 297.8839 ns 278.6408 ns 36
Xor_Opt 8388608 33,466.440 ns 541.9196 ns 506.9119 ns 36
Not_Opt 8388608 22,278.547 ns 54.4056 ns 48.2292 ns 34
And_Unopt 8388608 42,449.080 ns 414.0216 ns 387.2761 ns 38
Or_Unopt 8388608 42,595.040 ns 574.9606 ns 537.8185 ns 38
Xor_Unopt 8388608 42,963.108 ns 508.2865 ns 475.4516 ns 38
Not_Unopt 8388608 88,049.039 ns 686.3357 ns 573.1214 ns 39
And_Opt 8388613 32,711.390 ns 196.6901 ns 174.3607 ns 35
Or_Opt 8388613 32,282.328 ns 501.3276 ns 468.9421 ns 35
Xor_Opt 8388613 32,521.586 ns 634.6536 ns 593.6554 ns 35
Not_Opt 8388613 22,272.668 ns 37.5989 ns 33.3304 ns 34
And_Unopt 8388613 38,650.046 ns 640.1163 ns 534.5261 ns 37
Or_Unopt 8388613 38,996.726 ns 763.5589 ns 965.6576 ns 37
Xor_Unopt 8388613 38,742.539 ns 182.6211 ns 152.4969 ns 37
Not_Unopt 8388613 89,282.173 ns 1,320.3705 ns 1,170.4741 ns 39

Overall, There has been about 16% decrease in execution time for And/Or/Xor and 70% decrease in execution time for Not.

Would it be worth adding the paths & vectorising the Not method? I don't have extensive knowledge on raw instructions so I can't tell much about the Avx2 path however the Not case definitely looks convincing to me.

@danmoseley
Copy link
Member

@tannergooding is the right person here

@danmoseley
Copy link
Member

The numbers look very promising though!

@danmoseley
Copy link
Member

Could you share a link to the branch?

@Gnbrkm41
Copy link
Contributor Author

https://github.com/Gnbrkm41/corefx/tree/bitarrayintrinsics. Looks like the tests pass w/o problem, but I'm not sure if mine's the most optimal. Feedbacks... and more tricks appreciated.

@EgorBo
Copy link
Member

EgorBo commented May 26, 2019

@Gnbrkm41 It's important to keep performance the same for all values, not only for some large arrays (where AVX is obviously good). I see you extended the switch from 4 to 8 cases, I am not sure it will be converted to a jump table and also I bet there will be a performance regression between 4 and 8. So could you please benchmark those against the current implementation?

@Gnbrkm41
Copy link
Contributor Author

Gnbrkm41 commented May 27, 2019

I don't expect mine to be the absolute best (in fact I'm not really good at optimisation), it was just a simple proof-of-concept so it can get some attention. Would love to learn from it hopefully :^)

It looked like there was some regression for any case that isn't vectorised (i.e. <7 elements in the array), about 3 nanoseconds. though once it gets over 8 elements it did get faster than the previous ones.

@Gnbrkm41
Copy link
Contributor Author

Gnbrkm41 commented Nov 6, 2019

Closing in favour of dotnet/corefx#41762

@Gnbrkm41 Gnbrkm41 closed this as completed Nov 6, 2019
@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 5.0 milestone Feb 1, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants