Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Fix Vectorization Flags (long-term / playground) #2426

Closed
wants to merge 3 commits into from
Closed

[WIP] Fix Vectorization Flags (long-term / playground) #2426

wants to merge 3 commits into from

Conversation

svenevs
Copy link
Contributor

@svenevs svenevs commented Sep 10, 2018

Needs rebase, ref: #2432

It looks more complicated than it is. Intel has some really obtuse rules about what can and cannot be used. Particularly, you can only use something like -xAVX2 or /QxAVX2 on an Intel CPU. Similarly, their equivalent of -march=native is -xHOST, with the caveat that when you supply both host and -x or -m (windows /Qx or /arch:), they conflict, so the compiler will warn and use the last one.

Example: GCC / Clang are happy with -march=native -mavx2, but Intel will be mad with -xHOST -xAVX2. For simplicity (since the -xHOST can also only be used with an Intel CPU), I just don't include it in favor of using the manually checked values.

So basically, this function at the top came around because of this preposterous Intel flag scheme, and the absence of -march=native on Intel // AMD CPU, as well as MSVC.

Since there's this (kind of strangely undocumented) feature that the SSE flags are only added when CMAKE_CXX_FLAGS is the empty string, AVX512 people can use that to their advantage to instead configure with -DCMAKE_CXX_FLAGS="-mavx512" or whatever?

TODO:

  • Do it OpenCV style instead most likely.

  • Verify what happens on MSVC is valid. Their docs x86 and x64 claim /arch:AVX2 is valid as well as /arch:AVX. Allegedly since MSVC 2013.

  • See TODO at bottom with MSVC definitions. What exactly is the reason for defining them yourself rather than just relying on the compiler to define them like they promise?

  • What exactly is going on with -mfpmath=sse. The comment makes it sound like concerns about 32bit stuff, but it's added unconditionally for 64 bit builds (and as I understand does not need to be since the abi requires it use xmm registers for calls). Is it still relevant? Do we actually care about 32 bit builds on Linux?

  • Shouldn't this whole thing be nested in a giant if (NOT CMAKE_CROSSCOMPILING) (including the SSE / AVX stuff)? These are being detected for the host architecture not the target architecture (when cross compiling).

  • Travis // OSX:

    CMake Error at cmake/pcl_find_sse.cmake:4 (cmake_host_system_information):
    cmake_host_system_information does not recognize <key> PROCESSOR_DESCRIPTION
    

    Go check the release notes and figure out when this was introduced, add cmake --version to the build scripts for convenience.

@@ -201,6 +201,12 @@ if(CMAKE_COMPILER_IS_CLANG)
SET(CLANG_LIBRARIES "stdc++")
endif()

if(CMAKE_CXX_COMPILER_ID MATCHES "Intel")
if("${CMAKE_CXX_FLAGS}" STREQUAL "")
SET(CMAKE_CXX_FLAGS "${SSE_FLAGS_STR}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be populating the compiler options instead. I'm just talking semantics here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, long-term these should be public compile options assuming you want consuming projects to inherit them (which we should). I just mirrored what the other compilers do here, but since this one is going to be a longer-term effort definitely yes.

I think a more explicit option PCL_SSE_FLAGS or PCL_VECTORIZATION_FLAGS or PCL_PLATFORM_FLAGS or something should be added (cached empty string variable, not an option call). Usage would be something like cmake .. -DPCL_SSE_FLAGS="-march=native" or cmake .. -DPCL_SSE_FLAGS="-xavx512" or whatever, probably allow it to be a list. Then instead of guarding against "${CMAKE_CXX_FLAGS}" STREQUAL "", we would do something like

if (PCL_SSE_FLAGS)
  target_compile_options(pcl PUBLIC ${PCL_SSE_FLAGS})
else()
  include(pcl_sse_flags) # I'll fix up the CMAKE_MODULE_PATH to allow this include
  pcl_check_for_sse()
  if (PCL_SSE_FLAGS)
    target_compile_options(pcl PUBLIC ${PCL_SSE_FLAGS})
  endif()
endif()

Realistically, creating a pcl-interface target is the right thing to do here -- all libs link against that, so we actually would do target_compile_options(pcl-interface INTERFACE ${PCL_SSE_FLAGS}). No need to special-case on compilers. Working with lists is preferred, no need to convert to strings. When you do target_link_libraries(pcl PUBLIC pcl-interface), the INTERFACE_*_ properties propagate from pcl-interface to pcl :)

@svenevs svenevs changed the title Vectorization and intel fixes [WIP] Fix Vectorization Flags (long-term / playground) Sep 11, 2018
@SergioRAgostinho SergioRAgostinho added the needs: more work Specify why not closed/merged yet label Sep 11, 2018
@stale
Copy link

stale bot commented Nov 10, 2018

This pull request has been automatically marked as stale because it hasn't had
any activity in the past 60 days. Commenting or adding a new commit to the
pull request will revert this.

Come back whenever you have time. We look forward to your contribution.

@stale stale bot added the status: stale label Nov 10, 2018
@svenevs
Copy link
Contributor Author

svenevs commented Aug 31, 2019

I won't be able to complete this in its current form. I may release what I've done in a private repo in the future after that goes public and see if it makes sense to add back to PCL. Vectorization flags across compilers is rather tricky with CMake, I've concluded that try_run is the best option (OpenCV does this too) because you can't rely on the OS properly reporting features (lots of issues with noxsave, virtualization, others). So what you have to do is try and actually run vector instructions and see if you get a termination signal, usually SIGILL, but sometimes SIGBUS or SIGTERM depending on OS.

I may actually try and get it into CMake itself, since this is something many projects desire. Time will tell...

@svenevs svenevs closed this Aug 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs: more work Specify why not closed/merged yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants