As a convenience, common multi-level optimization setups could be defined #7

Open
pedrocr opened this issue Jun 13, 2017 · 2 comments


pedrocr commented Jun 13, 2017

As a follow-up to #1, it would probably be nice to have a few convenience macros that encode common performance ladders used in several places. For example, for SIMD there is a progression of features on x86 and probably a similar one on ARM. It would be nice to have a convenience `#[runtime_target_simd_features]` or similar that encodes a reasonable performance ladder for both Intel and ARM, so that this doesn't need to be replicated inconsistently in a bunch of projects.

Owner

parched commented Jun 14, 2017

I agree. Do you have a suggestion for what the x86 ones might be? I imagine we don't want the full ladder, as that would mean a lot of code duplication.

Author

pedrocr commented Jun 14, 2017

Considering this:

https://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures

the progression seems to be the following:

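  • Just SSE up to SSSE3 (introduced in Core; note that the x86_64 baseline only guarantees SSE2, so the oldest 64-bit CPUs lack SSSE3)
  • SSE4.1 (introduced in Penryn, the 45 nm Core 2 generation)
  • SSE4.2 (introduced in Nehalem and Silvermont for Atom)
  • AVX (introduced in Sandy Bridge)
  • AVX2 (introduced in Haswell)
  • AVX-512 (introduced in Xeon Phi Knights Landing and the server variants of Skylake)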

Having 6 levels in total doesn't sound like too many if this is only used in hot paths. A cool optimization would be to somehow disable levels that generate the same code as a lower level, but that's probably only possible with deeper hooks into the LLVM IR.
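
To make the ladder above concrete, here is a rough sketch of how it could be encoded by hand with the standard library's `is_x86_feature_detected!` macro. This is not this crate's API: `SimdLevel`, `detect_simd_level`, `sum_generic`, `sum_avx2` and `sum` are hypothetical names made up for illustration, and the `Baseline` rung is an added assumption to cover CPUs and architectures below the first level in the list. Only one extra rung (AVX2) is instantiated here; a convenience macro would presumably generate one wrapper per rung.

```rust
/// Rungs of the x86 ladder, lowest first, so that `>=` compares levels.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SimdLevel {
    Baseline, // assumption: fallback rung, not part of the list in the issue
    Ssse3,
    Sse41,
    Sse42,
    Avx,
    Avx2,
    Avx512f,
}

/// Climbs the ladder from the bottom and stops at the first missing feature,
/// so the returned level implies every rung below it was detected too.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn detect_simd_level() -> SimdLevel {
    if !std::is_x86_feature_detected!("ssse3") {
        return SimdLevel::Baseline;
    }
    if !std::is_x86_feature_detected!("sse4.1") {
        return SimdLevel::Ssse3;
    }
    if !std::is_x86_feature_detected!("sse4.2") {
        return SimdLevel::Sse41;
    }
    if !std::is_x86_feature_detected!("avx") {
        return SimdLevel::Sse42;
    }
    if !std::is_x86_feature_detected!("avx2") {
        return SimdLevel::Avx;
    }
    if !std::is_x86_feature_detected!("avx512f") {
        return SimdLevel::Avx2;
    }
    SimdLevel::Avx512f
}

/// On other architectures only the fallback rung exists in this sketch.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn detect_simd_level() -> SimdLevel {
    SimdLevel::Baseline
}

/// The hot path is written once; `inline(always)` lets each wrapper
/// recompile it with that wrapper's target features enabled.
#[inline(always)]
fn sum_generic(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// Same body, compiled with AVX2 enabled.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[f32]) -> f32 {
    sum_generic(xs)
}

/// Picks the highest instantiated rung the running CPU supports.
pub fn sum(xs: &[f32]) -> f32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if detect_simd_level() >= SimdLevel::Avx2 {
            // SAFETY: detect_simd_level() confirmed AVX2 at runtime.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_generic(xs)
}
```

Each extra rung costs one more compiled copy of the hot path, which is the code duplication concern raised earlier in the thread.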
