Detect ARM CPU features for host target and in runtime #8298

Merged
alexreinking merged 20 commits into main from issues/7979-arm-cpu-features
Jul 15, 2024

Conversation

alexreinking
Member

@alexreinking alexreinking commented Jun 16, 2024

Adds feature detection to the runtime library and to the host target feature computation.

Not sure what the best way is to share code here. Not sure how best to test on Android or Windows/ARM, either.

Fixes #4727
Fixes #6106
Fixes #7901
Fixes #7979
Fixes #8340

@alexreinking
Member Author

Not sure what the best way is to detect the ARMv8.1-A feature. It seems that certain other features (e.g. sve, dotprod) imply it, while others (e.g. armv7s) do not.
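
As a rough illustration of that idea, here is a minimal sketch that infers ARMv8.1-A from other detected features on Linux. It is not the detection code from this PR. It assumes the flag names that the Linux arm64 kernel prints on the "Features" line of /proc/cpuinfo (asimddp, fphp, asimdhp, sve), and the feature labels, the `IMPLIES_V81A` set, and the `detect_features` helper are hypothetical names chosen for the example.

```python
# Illustrative sketch only; this is not the detection code from the PR.
# Flag names follow the Linux arm64 "Features" line in /proc/cpuinfo.
FLAG_TO_FEATURE = {
    "asimddp": "arm_dot_prod",  # Advanced SIMD dot product
    "fphp": "arm_fp16",         # scalar half-precision arithmetic
    "asimdhp": "arm_fp16",      # Advanced SIMD half-precision arithmetic
    "sve": "sve",
}

# Assumption taken from the comment above: these features imply ARMv8.1-A.
IMPLIES_V81A = {"arm_dot_prod", "sve"}


def detect_features(cpuinfo_text):
    """Return the set of feature labels inferred from cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            flags.update(value.split())
    features = {FLAG_TO_FEATURE[f] for f in flags if f in FLAG_TO_FEATURE}
    # Infer the baseline architecture level from the features that imply it.
    if features & IMPLIES_V81A:
        features.add("armv81a")
    return features


sample = "processor\t: 0\nFeatures\t: fp asimd aes crc32 atomics fphp asimdhp asimddp\n"
detected = detect_features(sample)
print(sorted(detected))
```

On a real Linux machine the text would come from reading /proc/cpuinfo. Other operating systems expose this information through different interfaces, so this sketch covers only the Linux case.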

@alexreinking
Member Author

alexreinking commented Jun 17, 2024

Regarding the tutorial failures, lesson 15 uses the target string host-x86-64 to infer the OS.

But with this PR, host will be something like arm-64-osx-arm_dot_prod-arm_fp16. This then becomes x86-64-osx-arm_dot_prod-arm_fp16-sse4, which makes no sense because it combines an x86 architecture with ARM-only features.

The fundamental issue is that "the host but on a different architecture" isn't a well-defined thing.

Brainstorming a few possible resolutions:

  1. Define changing the arch of a target so that it clears all arch-specific features.
  2. Interpret host in the OS position to mean the host OS and nothing more. The target string in the lesson would become x86-64-host.
    1. Bike-shed: use os or hostos in place of host?
  3. Change the lesson to use x86-64-linux instead of host-x86-64.
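
To make the problem and the first option concrete, here is a minimal sketch of a toy target-string model. It is not Halide's actual target parser. The `ARCH_FEATURES` table and the `parse`, `with_arch`, and `to_string` helpers are hypothetical, and the feature names are simply the ones quoted in this comment.

```python
# Illustrative sketch only; this is not Halide's target parser.
ARCHES = {"x86", "arm"}
OSES = {"linux", "osx", "windows", "android", "ios"}

# Hypothetical table: which architecture each arch-specific feature belongs to.
ARCH_FEATURES = {
    "arm_dot_prod": "arm",
    "arm_fp16": "arm",
    "sse4": "x86",
}


def parse(target):
    """Split a target string into arch, bits, os, and an ordered feature list."""
    t = {"arch": None, "bits": None, "os": None, "features": []}
    for tok in target.split("-"):
        if tok in ARCHES:
            t["arch"] = tok
        elif tok in ("32", "64"):
            t["bits"] = tok
        elif tok in OSES:
            t["os"] = tok
        else:
            t["features"].append(tok)
    return t


def with_arch(t, arch, clear_arch_features):
    """Return a copy of t with a new arch, optionally dropping foreign features."""
    new = dict(t, arch=arch, features=list(t["features"]))
    if clear_arch_features:
        # Features missing from the table are treated as arch-neutral and kept.
        new["features"] = [
            f for f in t["features"] if ARCH_FEATURES.get(f, arch) == arch
        ]
    return new


def to_string(t):
    return "-".join([t["arch"], t["bits"], t["os"], *t["features"]])


host = parse("arm-64-osx-arm_dot_prod-arm_fp16")

# Current behavior: overriding the arch keeps the ARM-only features.
naive = with_arch(host, "x86", clear_arch_features=False)
naive["features"].append("sse4")
print(to_string(naive))    # x86-64-osx-arm_dot_prod-arm_fp16-sse4

# Option 1: changing the arch clears features that belong to another arch.
cleared = with_arch(host, "x86", clear_arch_features=True)
cleared["features"].append("sse4")
print(to_string(cleared))  # x86-64-osx-sse4
```

Under option 1 the OS survives the architecture change, which is all the lesson needs from host, while the ARM-only features are dropped.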

@alexreinking alexreinking added the dev_meeting Topic to be discussed at the next dev meeting label Jun 17, 2024
@alexreinking alexreinking marked this pull request as ready for review June 17, 2024 17:47
@alexreinking
Member Author

Pending further discussion, I'm using this option to continue making progress:

Change the lesson to use x86-64-linux instead of host-x86-64.

src/Target.cpp (outdated review thread, resolved)
@abadams abadams removed the dev_meeting Topic to be discussed at the next dev meeting label Jun 23, 2024
@abadams abadams added this to the v18.0.0 milestone Jun 23, 2024
@steven-johnson steven-johnson added the release_notes For changes that may warrant a note in README for official releases. label Jun 24, 2024
@steven-johnson
Contributor

Ready to land?

@abadams
Member

abadams commented Jun 24, 2024

No, the Windows ARM code is still just a guess. We're trying to figure out how to test it.

@alexreinking
Member Author

I'm trying to test it inside a Windows 11 ARM VM via UTM

@alexreinking
Member Author

alexreinking commented Jul 8, 2024

With the latest commit, I have confirmed that get_host_target works in the Windows 11 UTM VM. The changes to the LLVM runtime linker were necessary for building Halide against an LLVM that has the AArch64 backend enabled but not the ARM backend (I built it that way to cut down on LLVM build times; building LLVM on ARM64 Windows is a nightmare).

Related to #2282

@alexreinking alexreinking force-pushed the issues/7979-arm-cpu-features branch from 4bf13d6 to 90f5f6b Compare July 8, 2024 20:16
@alexreinking
Member Author

Review ping

@alexreinking alexreinking merged commit 0f34e2f into main Jul 15, 2024
19 checks passed
@alexreinking alexreinking deleted the issues/7979-arm-cpu-features branch July 15, 2024 16:17
steven-johnson pushed a commit that referenced this pull request Jul 15, 2024
Adds feature detection for ARM CPUs to the runtime library and to
the host target feature computation. Supports Windows, macOS,
Linux, iOS, and Android.

Also fix bug in Type::max() and Type::min() for float16.

Fixes #4727
Fixes #6106
Fixes #7901
Fixes #7979
Fixes #8340
steven-johnson added a commit that referenced this pull request Jul 17, 2024
…elease/18.x) (#8343)

Detect ARM CPU features for host target and in runtime (#8298)

Adds feature detection for ARM CPUs to the runtime library and to
the host target feature computation. Supports Windows, macOS,
Linux, iOS, and Android.

Also fix bug in Type::max() and Type::min() for float16.

Fixes #4727
Fixes #6106
Fixes #7901
Fixes #7979
Fixes #8340

Co-authored-by: Alex Reinking <[email protected]>