Simd sort #6326

Open: wants to merge 19 commits into base: master
@Johan511 (Contributor) commented Aug 19, 2023

Enables vectorized sorting using a third-party library, x86-simd-sort (https://github.com/intel/x86-simd-sort).

(Work in Progress)

@StellarBot

Performance test report

HPX Performance

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
| --- | --- | --- | --- |
| For Each | (=) | ?? | - |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T13:19:41+00:00 |
| HPX Commit | dcb5415 | cd65a80 |
| Clustername | rostam | rostam |
| Envfile | | |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-19T08:25:36.834748-05:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| BENCHMARK | NO-EXECUTOR |
| --- | --- |
| Future Overhead - Create Thread Hierarchical - Latch | = |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T13:19:41+00:00 |
| HPX Commit | dcb5415 | cd65a80 |
| Clustername | rostam | rostam |
| Envfile | | |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-19T08:27:50.442351-05:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
| --- | --- | --- | --- |
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | = |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | (=) | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T13:19:41+00:00 |
| HPX Commit | dcb5415 | cd65a80 |
| Clustername | rostam | rostam |
| Envfile | | |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-19T08:28:07.336369-05:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Explanation of Symbols

| Symbol | Meaning |
| --- | --- |
| = | No performance change (confidence interval within ±1%) |
| (=) | Probably no performance change (confidence interval within ±2%) |
| (+)/(-) | Very small performance improvement/degradation (≤1%) |
| +/- | Small performance improvement/degradation (≤5%) |
| ++/-- | Large performance improvement/degradation (≤10%) |
| +++/--- | Very large performance improvement/degradation (>10%) |
| ? | Probably no change, but quite large uncertainty (confidence interval within ±5%) |
| ?? | Unclear result, very large uncertainty (±10%) |
| ??? | Something unexpected… |

Review thread on CMakeLists.txt (outdated, resolved)
Review thread on CMakeLists.txt (outdated, resolved)
Review thread on cmake/HPX_SetupSimdSort.cmake (outdated, resolved)
Review thread on cmake/HPX_SetupSimdSort.cmake (outdated, resolved)

@StellarBot

Performance test report

HPX Performance

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
| --- | --- | --- | --- |
| For Each | (=) | ?? | - |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T18:38:27+00:00 |
| HPX Commit | dcb5415 | 24dfcd5 |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-19T13:45:18.850938-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| BENCHMARK | NO-EXECUTOR |
| --- | --- |
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T18:38:27+00:00 |
| HPX Commit | dcb5415 | 24dfcd5 |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-19T13:47:31.102556-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
| --- | --- | --- | --- |
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | (=) |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | (=) | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T18:38:27+00:00 |
| HPX Commit | dcb5415 | 24dfcd5 |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-19T13:47:48.033631-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Review thread on cmake/FindSimdSort.cmake (outdated, resolved)
Review thread on cmake/HPX_SetupSimdSort.cmake (outdated, resolved)

@StellarBot

Performance test report

HPX Performance

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
| --- | --- | --- | --- |
| For Each | (=) | ?? | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | b60c175 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T06:23:47+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-29T01:31:05.004560-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| BENCHMARK | NO-EXECUTOR |
| --- | --- |
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | b60c175 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T06:23:47+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-29T01:33:17.791388-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
| --- | --- | --- | --- |
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | = | (=) | = |
| Stream Benchmark - Triad | (=) | = | (=) |
| Stream Benchmark - Copy | (=) | - | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | b60c175 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T06:23:47+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-29T01:33:34.692225-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

@Johan511 (Contributor Author) commented Aug 29, 2023

[Image]

Speedup observed for simd-sort

@Johan511 (Contributor Author) commented Aug 29, 2023

@hkaiser There are multiple issues with integrating this feature into HPX:

  1. We need to add feature tests for AVX512.
  2. Some vector intrinsics are not supported on certain architectures and cause compile errors. On medusa, a vector load instruction for 16-bit integers was not supported, which makes compilation of the target application fail even if no i16 data is being sorted.
  3. Similarly to point 2, some architectures might not support certain data types (e.g. _Float16), which can cause compile-time failures in HPX.
  4. If HPX is compiled with simd sort, the target application must be compiled with -march=native (or with all the flags required to support vectorization); otherwise compilation of the target application will fail.

The library has separate files for sorting 16-bit, 32-bit, and 64-bit numbers. I was considering adding a feature test for each of these files and including each one behind #ifdef guards. This would still be an issue if the user compiles HPX and the HPX application (target) on different architectures. It might also lead to false negatives (features that are supported might not be made available to the user).

I would like your opinion on whether I should proceed with the above design.
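
A minimal sketch of what the guarded design could look like, assuming C++17. Everything named `EXAMPLE_*`, `simd_sort_available`, and `example_sort` is hypothetical and not part of HPX or x86-simd-sort. The macros are derived here from compiler-predefined macros only so that the sketch builds on its own; in the proposed design they would come from per-file CMake feature tests. The mapping of element width to AVX-512 extensions is also an assumption, and the vectorized call itself is left as a placeholder.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical configuration macros. In a real build these would be set by
// per-file feature tests, not inferred from predefined macros alone.
#if defined(__AVX512F__) && defined(__AVX512DQ__)
#define EXAMPLE_HAVE_SIMD_SORT_32_64BIT 1
#else
#define EXAMPLE_HAVE_SIMD_SORT_32_64BIT 0
#endif

// Assumption: the 16-bit kernels need additional AVX-512 extensions.
#if defined(__AVX512BW__) && defined(__AVX512VBMI2__)
#define EXAMPLE_HAVE_SIMD_SORT_16BIT 1
#else
#define EXAMPLE_HAVE_SIMD_SORT_16BIT 0
#endif

#if EXAMPLE_HAVE_SIMD_SORT_16BIT
// The library's 16-bit header would be included only here, so the compiler
// never sees it on targets that cannot build it.
#endif

// True if a vectorized kernel could be selected for T in this translation unit.
template <typename T>
constexpr bool simd_sort_available()
{
    if constexpr (!std::is_arithmetic_v<T>)
        return false;
    else if constexpr (sizeof(T) == 2)
        return EXAMPLE_HAVE_SIMD_SORT_16BIT != 0;
    else if constexpr (sizeof(T) == 4 || sizeof(T) == 8)
        return EXAMPLE_HAVE_SIMD_SORT_32_64BIT != 0;
    else
        return false;
}

template <typename T>
void example_sort(T* data, std::size_t n)
{
    if constexpr (simd_sort_available<T>())
    {
        // Placeholder: the vectorized library call would replace this line.
        std::sort(data, data + n);
    }
    else
    {
        // Portable fallback when the kernel for this width is unavailable.
        std::sort(data, data + n);
    }
}
```

Because the macros are evaluated in whichever translation unit includes the header, an application built with different flags than HPX would silently take a different branch, which is the mismatch described in point 4.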

@StellarBot

Performance test report

HPX Performance

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
| --- | --- | --- | --- |
| For Each | (=) | ?? | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | 0c93a3b |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:45:09+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-29T16:57:31.036372-05:00 |
| Envfile | | |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| BENCHMARK | NO-EXECUTOR |
| --- | --- |
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | 0c93a3b |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:45:09+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-29T16:59:43.227039-05:00 |
| Envfile | | |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
| --- | --- | --- | --- |
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | = |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | - | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | 0c93a3b |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:45:09+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-29T17:00:00.111855-05:00 |
| Envfile | | |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

@StellarBot

Performance test report

HPX Performance

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
| --- | --- | --- | --- |
| For Each | (=) | ?? | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | 25b4e45 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:58:26+00:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-29T17:07:47.004681-05:00 |

Comparison

| BENCHMARK | NO-EXECUTOR |
| --- | --- |
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | 25b4e45 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:58:26+00:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-29T17:09:59.091186-05:00 |

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
| --- | --- | --- | --- |
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | (=) |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | - | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | dcb5415 | 25b4e45 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:58:26+00:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-29T17:10:15.975599-05:00 |

@codacy-production (bot) commented Sep 14, 2023

Coverage summary from Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| -69.13% | |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (103a7b8) | 190583 | 162311 | 85.17% |
| Head commit (1c18b4a) | 188842 (-1741) | 30284 (-132027) | 16.04% (-69.13%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: `<coverage of head commit> - <coverage of common ancestor commit>`

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#6326) | 0 | 0 | ∅ (not applicable) |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: `<covered lines added or modified>/<coverable lines added or modified> * 100%`

Signed-off-by: Hari Hara Naveen S <[email protected]>