Many developers avoid using SIMD intrinsics, so I'm going to run some fun tests here.
SIMD is faster because a single instruction performs the same operation on multiple pieces of data at once:
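The repo's actual kernels aren't reproduced here, but the idea looks roughly like this (a minimal sketch; the function names are mine and the AVX2 path is assumed):

```cpp
#include <immintrin.h>  // AVX2 intrinsics
#include <cstdint>

// Scalar version: four separate 64-bit XORs to cover a 256-bit buffer.
void xor_scalar(uint64_t* dst, const uint64_t* a, const uint64_t* b) {
    for (int i = 0; i < 4; ++i)  // 4 x 64 bits = 256 bits
        dst[i] = a[i] ^ b[i];
}

// AVX2 version: a single instruction XORs all 256 bits at once.
void xor_avx2(uint64_t* dst, const uint64_t* a, const uint64_t* b) {
    __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a));
    __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b));
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(dst),
                        _mm256_xor_si256(va, vb));
}
```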
Note: this test is only valid for comparing a single type of operation against its SIMD and non-SIMD versions. So you can say that the AVX2 version of the float matrix test is faster than the SSE2 one, but you cannot compare it to the big-int group, because they invoke completely different operations.
TL;DR: AVX2 is ~30% faster than SSE2 for math, and ~3x faster than non-SIMD runs.
The benchmark consists of 100 runs of 100,000 XOR ops on 256-bit buffers.
The test was conducted on a single core of an AMD Ryzen 5 3600 at stock clock speed.
Without any intrinsics, the 100,000 XOR ops take ~400 ms to complete.
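For reference, the control (non-SIMD) run has roughly this shape. This is a hedged sketch, not the repo's exact harness; the buffer contents and the anti-optimizer barrier are placeholders:

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>

// 256-bit buffers laid out as 4 x 64-bit lanes (hypothetical layout).
static uint64_t buf_a[4] = {1, 2, 3, 4};
static uint64_t buf_b[4] = {5, 6, 7, 8};
static uint64_t buf_out[4];

int main() {
    using clk = std::chrono::steady_clock;
    for (int run = 0; run < 100; ++run) {        // 100 runs...
        auto t0 = clk::now();
        for (int op = 0; op < 100000; ++op) {    // ...of 100,000 XOR ops
            for (int i = 0; i < 4; ++i)
                buf_out[i] = buf_a[i] ^ buf_b[i];
            // Compiler barrier (GCC/Clang) so the loop isn't optimized
            // away; a real harness might use benchmark::DoNotOptimize.
            asm volatile("" ::: "memory");
        }
        auto ms = std::chrono::duration<double, std::milli>(
                      clk::now() - t0).count();
        std::printf("run %3d: %.3f ms\n", run, ms);
    }
}
```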
AVX hovers around 100 ms for the same amount of work but gets beaten by the SSE implementation. That is interesting, considering that for SSE I had to split the 256-bit test buffer into two 128-bit vectors. It is probably caused by the fact that I'm using an AMD CPU.
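The split looks like this; a sketch assuming the same 4 x 64-bit buffer layout as above:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// SSE2 registers are 128-bit, so the 256-bit buffer is processed as a
// low half and a high half: two loads, two XORs, two stores.
void xor_sse2(uint64_t* dst, const uint64_t* a, const uint64_t* b) {
    __m128i lo = _mm_xor_si128(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(a)),
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(b)));
    __m128i hi = _mm_xor_si128(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + 2)),
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + 2)));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), lo);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + 2), hi);
}
```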
A somewhat more diverse test reveals that the same result holds for other bitwise operations. Interestingly, the control runs fluctuate less than the SIMD-accelerated ones.
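The other bitwise ops follow the same pattern; swapping the one intrinsic is the whole difference (SSE2 shown, the `_mm256_*` AVX2 names map one-to-one):

```cpp
#include <emmintrin.h>

// Same load/store scaffolding as the XOR kernel; only the middle
// instruction changes per operation.
__m128i op_xor (__m128i a, __m128i b) { return _mm_xor_si128(a, b);    } // a ^ b
__m128i op_and (__m128i a, __m128i b) { return _mm_and_si128(a, b);    } // a & b
__m128i op_or  (__m128i a, __m128i b) { return _mm_or_si128(a, b);     } // a | b
__m128i op_andn(__m128i a, __m128i b) { return _mm_andnot_si128(a, b); } // (~a) & b
```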
But anyway, SIMD makes everything run several times faster, so that's cool.
https://chryswoods.com/vector_c++/emmintrin.html
https://chryswoods.com/vector_c++/immintrin.html
https://chryswoods.com/vector_c++/portable.html
https://www.codeproject.com/Articles/874396/Crunching-Numbers-with-AVX-and-AVX
https://acl.inf.ethz.ch/teaching/fastcode/2021/slides/07-simd-avx.pdf