Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gasnet based parcelport #6230

Merged
merged 338 commits into from
Oct 14, 2023
Merged

gasnet based parcelport #6230

merged 338 commits into from
Oct 14, 2023

Conversation

ct-clmsn
Copy link
Contributor

@ct-clmsn ct-clmsn commented Apr 30, 2023

Any background context you want to provide?

Initial work to provide a gasnet-based parcelport implementation. This PR adds cmake-based (via pkg-config) gasnet configuration for gasnet-mpi, gasnet-udp, and gasnet-smp support (gasnet-libfabric, gasnet-ucx, can also be made available as a future effort). This implementation is based on a review of the chapel-comm-gasnet source link.

Gasnet creates a distributed shared segment on each locality. The segment is the size of a maximum gasnet page size; the segment is symmetrically sized. The Chapel developers use the maximum gasnet page size with no modifications. Each locality splits it's local copy of the shared segment into a number of segments equal to the number of localities. Information is sent by copying a parcelport buffer into the sender's shared segment (using std::memcpy) and gasnet::put copies the data from the sender's shared segment into the receiver's shared segment. The receiver polls until the data is received. When the sender's gasnet::put completes, a gasnet active message (on the receiver) copies the received data into the receiver's shared segment. The receiver then copies data from the shared segment into a local parcelport buffer (using std::memcpy).

Gasnet specific compilation flags were added to this PR. Adding the following flags will help test the compilation process:

-DHPX_WITH_PARCELPORT_GASNET=ON
-DHPX_WITH_PARCELPORT_GASNET_CONDUIT={mpi,udp,smp,ofi,ucx,ibv}

GASnet compiles to a static object. In order to get Gasnet working with HPX the following options are available.

Configure GASnet for udp:

CFLAGS=-fPIC CCFLAGS=-fPIC CXXFLAGS=-FPIC ./configure --with-cflags=-fPIC --with-cxxflags=-fPIC --prefix=<INSTALL_PATH>

Configuring GASnet for ofi:

CFLAGS=-fPIC CCFLAGS=-fPIC CXXFLAGS=-FPIC ./configure --enable-ofi --with-ofi-home=<OFI_INSTALL_PATH> --prefix=<INSTALL_PATH> --with-cflags=-fPIC --with-cxxflags=-fPIC

Configuring GASnet for ucx:

CFLAGS=-fPIC CCFLAGS=-fPIC CXXFLAGS=-FPIC ./configure --enable-ucx --with-ucx-home=<UCX_INSTALL_PATH> --prefix=<INSTALL_PATH> --with-cflags=-fPIC --with-cxxflags=-fPIC

A single GASnet build to support smp, udp, ofi, and ucx would look like this:

CFLAGS=-fPIC CCFLAGS=-fPIC CXXFLAGS=-FPIC ./configure --enable-ofi --with-ofi-home=<OFI_INSTALL_PATH> --enable-ucx --with-ucx-home=<UCX_INSTALL_PATH> --prefix=<INSTALL_PATH> --with-cflags=-fPIC --with-cxxflags=-fPIC

Checklist

Not all points below apply to all pull requests.

  • cmake issues resolved; gasnet needs to be compiled with -fPIC (C_FLAGS, CXX_FLAGS)
  • This PR has been tested.

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)+(=)

Info

PropertyBeforeAfter
HPX Commit6b6e1e78e96105
HPX Datetime2023-03-06T15:59:25+00:002023-04-30T23:39:42+00:00
Datetime2023-03-10T03:27:49.135034-06:002023-04-30T18:44:38.432343-05:00
Envfile
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch-

Info

PropertyBeforeAfter
HPX Commit6b6e1e78e96105
HPX Datetime2023-03-06T15:59:25+00:002023-04-30T23:39:42+00:00
Datetime2023-03-10T03:28:21.991297-06:002023-04-30T18:45:11.603107-05:00
Envfile
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)(=)
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad(=)=(=)
Stream Benchmark - Copy(=)(=)(=)

Info

PropertyBeforeAfter
HPX Commit6b6e1e78e96105
HPX Datetime2023-03-06T15:59:25+00:002023-04-30T23:39:42+00:00
Datetime2023-03-10T03:28:29.145749-06:002023-04-30T18:45:18.709807-05:00
Envfile
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented May 1, 2023

Currently getting a compile time bug and could use assistance.

libs/core/gasnet_base/CMakeLists.txt has the following code..

set(gasnet_base_compat_headers
    hpx/plugins/parcelport/gasnet/gasnet.hpp => hpx/modules/gasnet_base.hpp
    hpx/plugins/parcelport/gasnet/gasnet_environment.hpp => hpx/modules/gasnet_base.hpp
)

gasnet.hpp isn't mapping to gasnet_base.hpp and is causing a compilation bug in ./libs/full/parcelport_gasnet/include/hpx/parcelport_gasnet/locality.hpp

any assistance would be appreciated!

@hkaiser
Copy link
Member

hkaiser commented May 1, 2023

Currently getting a compile time bug and could use assistance.

libs/core/gasnet_base/CMakeLists.txt has the following code..

set(gasnet_base_compat_headers
    hpx/plugins/parcelport/gasnet/gasnet.hpp => hpx/modules/gasnet_base.hpp
    hpx/plugins/parcelport/gasnet/gasnet_environment.hpp => hpx/modules/gasnet_base.hpp
)```

`gasnet.hpp` isn't mapping to `gasnet_base.hpp` and is causing a compilation bug in `./libs/full/parcelport_gasnet/include/hpx/parcelport_gasnet/locality.hpp`

You don't need to add any compatibility headers at all. We added those whenever we moved headers around so that exiting code didn't break (right away), but rather generated a warning instead.

@hkaiser
Copy link
Member

hkaiser commented May 1, 2023

@ct-clmsn: You could enable testing of the new parcel-port by adding the necessary configuration switches to the env-*.sh files here: https://github.com/STEllAR-GROUP/hpx/blob/master/.jenkins/lsu. If you need additional pre-requisites installed on Rostam for this, please let us know.

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented May 1, 2023

@hkaiser ah, right; found a bunch of small things and made some fixes. Currently blocked on a linking error with parcelport_gasnet_factory_init - this is created with the HPX_STATIC_PARCELPORT_PLUGINS script in ../libs/full/parcelport_gasnet/CMakeLists.txt; not quite sure what a next step looks like...

@hkaiser
Copy link
Member

hkaiser commented May 1, 2023

@hkaiser ah, right; found a bunch of small things and made some fixes. Currently blocked on a linking error with parcelport_gasnet_factory_init - this is created with the HPX_STATIC_PARCELPORT_PLUGINS script in ../libs/full/parcelport_gasnet/CMakeLists.txt; not quite sure what a next step looks like...

You will need something like

namespace hpx::traits {
// Inject additional configuration data into the factory registry for this
// type. This information ends up in the system wide configuration database
// under the plugin specific section:
//
// [hpx.parcel.tcp]
// ...
// priority = 1
//
template <>
struct plugin_config_data<hpx::parcelset::policies::tcp::connection_handler>
{
static constexpr char const* priority() noexcept
{
return "1";
}
static constexpr void init(int* /* argc */, char*** /* argv */,
util::command_line_handling& /* cfg */) noexcept
{
}
// by default no additional initialization using the resource
// partitioner is required
static constexpr void init(hpx::resource::partitioner&) noexcept {}
static constexpr void destroy() noexcept {}
static constexpr char const* call() noexcept
{
return "";
}
};
} // namespace hpx::traits
HPX_REGISTER_PARCELPORT(hpx::parcelset::policies::tcp::connection_handler, tcp)
to generate the needed boilerplate.

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented May 2, 2023

Having issues getting multiple include directories added to the target_include_directories command in FindGasnet.cmake (bottom of file). Gasnet requires 2 include directories in order for the build to work.

Providing cmake a list of the 2 required include directories to the Gasnet::gasnet target results in cmake breaking - none of the include directories in the list are added to the Gasnet::gasnet target.

Adding a single directory results in cmake adding a single include directory correctly but this means the second required include directory is missing and the build breaks.

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented May 8, 2023

The gasnet header file is found in this path: $GASNET_INSTALL/include/gasnet.h The gasnet header file requires header files installed into a gasnet conduit specific directory; but the conduit directories are not included into gasnet.h using relative paths. Example of conduit directories:

$GASNET_INSTALL/include/mpi-conduit
$GASNET_INSTALL/include/udp-conduit
$GASNET_INSTALL/include/smp-conduit

Because the required header files in each conduit are not included by gasnet.h using relative paths (gasnet-dependency.h instead of mpi-conduit/gasnet-dependency.h), cmake needs to have target_include_directories include both the full path the gasnet.h and the path to the right gasnet conduit directory. As an example, building gasnet using the mpi-conduit needs cmake to generate the following include paths:

-I$GASNET_INSTALL/include -I$GASNET_INSTALL/include/mpi-conduit

This PR is blocked on the include issue and could use some additional help in the FindGasnet.cmake script. The issue seems relatively straightforward to resolve but I've hit several blocks.

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)+(=)

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-05-08T02:27:11+00:00
HPX Commit6b6e1e7a57e9f1
Envfile
Clusternamerostamrostam
Datetime2023-03-10T03:27:49.135034-06:002023-05-07T21:34:43.471466-05:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch-

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-05-08T02:27:11+00:00
HPX Commit6b6e1e7a57e9f1
Envfile
Clusternamerostamrostam
Datetime2023-03-10T03:28:21.991297-06:002023-05-07T21:35:16.595180-05:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)=(=)
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad(=)(=)(=)
Stream Benchmark - Copy(=)(=)=

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-05-08T02:27:11+00:00
HPX Commit6b6e1e7a57e9f1
Envfile
Clusternamerostamrostam
Datetime2023-03-10T03:28:29.145749-06:002023-05-07T21:35:23.791864-05:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@ct-clmsn
Copy link
Contributor Author

@hkaiser was able to navigate the cmake issues and those are resolved. the implementation needs some debugging work.

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)+(=)

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-06-25T02:39:57+00:00
HPX Commit6b6e1e75fee342
Envfile
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Datetime2023-03-10T03:27:49.135034-06:002023-06-24T21:50:03.227097-05:00
Clusternamerostamrostam

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch-

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-06-25T02:39:57+00:00
HPX Commit6b6e1e75fee342
Envfile
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Datetime2023-03-10T03:28:21.991297-06:002023-06-24T21:50:36.493067-05:00
Clusternamerostamrostam

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)=
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad(=)(=)(=)
Stream Benchmark - Copy(=)(=)(=)

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-06-25T02:39:57+00:00
HPX Commit6b6e1e75fee342
Envfile
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Datetime2023-03-10T03:28:29.145749-06:002023-06-24T21:50:43.626088-05:00
Clusternamerostamrostam

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)---(=)

Info

PropertyBeforeAfter
HPX Commit6b6e1e78c9f1fe
HPX Datetime2023-03-06T15:59:25+00:002023-06-25T03:00:53+00:00
Envfile
Clusternamerostamrostam
Datetime2023-03-10T03:27:49.135034-06:002023-06-24T22:10:05.951691-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch=

Info

PropertyBeforeAfter
HPX Commit6b6e1e78c9f1fe
HPX Datetime2023-03-06T15:59:25+00:002023-06-25T03:00:53+00:00
Envfile
Clusternamerostamrostam
Datetime2023-03-10T03:28:21.991297-06:002023-06-24T22:10:37.897409-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)(=)
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad(=)(=)(=)
Stream Benchmark - Copy(=)(=)=

Info

PropertyBeforeAfter
HPX Commit6b6e1e78c9f1fe
HPX Datetime2023-03-06T15:59:25+00:002023-06-25T03:00:53+00:00
Envfile
Clusternamerostamrostam
Datetime2023-03-10T03:28:29.145749-06:002023-06-24T22:10:45.020082-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@ct-clmsn
Copy link
Contributor Author

Gasnet uses the '--param' configuration variable to tune the compiler optimizer. The pkgconfig file stores the use of '--param'. It's causing some issues with cmake. I've contacted the cmake community for assistance.

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 7, 2023

@hkaiser @diehlpk the gasnet parcelport's cmake works correctly now. The gasnet parcelport is working over shared memory on 1 node!

Gasnet does not compile into a shared library which causes linking issues. The gasnet development team recommends compiling gasnet with CFLAGS=-fPIC CXX_FLAGS=-fPIC ./configure ....

There are plans to provide an --enable-shared flag but it's being discussed at the moment and might be available in the fall of 2023.

A possible way to streamline the gasnet support for hpx is to do a cmake content fetch of gasnet and do a build for users with the CFLAGS and CXX_FLAGS pre-set for them. gasnet has a lot of compilation options so this may not be the best path forward. I'd be willing to work through those issues in cmake if that's something ya'll don't mind offloading.

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)+(=)

Info

PropertyBeforeAfter
HPX Commit6b6e1e799a1c3b
HPX Datetime2023-03-06T15:59:25+00:002023-07-07T12:19:39+00:00
Datetime2023-03-10T03:27:49.135034-06:002023-07-07T07:30:06.459869-05:00
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch-

Info

PropertyBeforeAfter
HPX Commit6b6e1e799a1c3b
HPX Datetime2023-03-06T15:59:25+00:002023-07-07T12:19:39+00:00
Datetime2023-03-10T03:28:21.991297-06:002023-07-07T07:30:39.473892-05:00
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)(=)
Stream Benchmark - Scale(=)(=)(=)
Stream Benchmark - Triad(=)(=)(=)
Stream Benchmark - Copy(=)(=)(=)

Info

PropertyBeforeAfter
HPX Commit6b6e1e799a1c3b
HPX Datetime2023-03-06T15:59:25+00:002023-07-07T12:19:39+00:00
Datetime2023-03-10T03:28:29.145749-06:002023-07-07T07:30:46.666264-05:00
Clusternamerostamrostam
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@hkaiser
Copy link
Member

hkaiser commented Jul 7, 2023

Thanks @ct-clmsn. I'll tallk to @akheir, he should be able to recompile gasnet on rostam with those flags set.

Copy link
Contributor

@bonachea bonachea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi - GASNet developer here.

Thanks @ct-clmsn for all the work on this, and we're excited about the possibility of a GASNet-based HPX backend.

I'll defer detailed code review for now. However I'd like to express some high-level concerns about the current state of this PR, providing context that the HPX developers may be lacking:

  1. This PR currently only adds support for the "portable" GASNet conduits (smp, udp, mpi), and not the high-performance "native" GASNet conduits (ibv, aries, ofi).

    • smp-conduit is a good choice for single-node operation (all it supports).
    • udp-conduit is acceptable on Ethernet hardware, but very suboptimal on HPC networks. mpi-conduit is very slow and should basically only be used as a last resort. Neither is recommended for use on HPC-relevant network hardware.
    • If you omit the high-performance GASNet conduits, you're sacrificing most of the potential performance benefits of using GASNet at all.
    • I'm concerned that merging tihs work in a state lacking the high-performance GASNet backends could lead less-informed users to misrepresent the performance potential of a GASNet backend.
  2. This PR is currently using the deprecated GASNet-1 APIs, instead of the modern GASNet-EX API introduced in 2017.

    • As a result, code in this PR cannot take advantage of modern features of GASNet added over the past 6 years.
    • Note that Chapel uses the old APIs for historical reasons, but is currently in the process of being updated to GASNet-EX.
    • Other major GASNet clients (e.g. Legion, UPC++, Berkeley UPC) are all using the GASNet-EX APIs, to their benefit.

CC: @PHHargrove

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 9, 2023

@bonachea thanks for sharing the excitement and concerns. The current implementation is for testing and feasibility - this should explain the use of the older API and current emphasis on portable conduits.

I'll migrate to the gasnet-ex api over time and will provide a couple more gasnet-conduit options in the current cmake scripts.

Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mind merging this as is at this point. However, I'd like to have some testing in place for this first. We will set up gasnet compiled using the -fPIC flag on rostam at which point we can add the gasnet parcelport to one or more of the CIs running there.

@akheir
Copy link
Member

akheir commented Jul 10, 2023

I recompiled the gasnet library on Rostam with the CFLAGS=-fPIC CXX_FLAGS=-fPIC

@hkaiser
Copy link
Member

hkaiser commented Jul 10, 2023

Thanks @ct-clmsn. I'll tallk to @akheir, he should be able to recompile gasnet on rostam with those flags set.

@ct-clmsn gasnet has been recompiled on rostam now. We should be able to set up the CIs to run all distributed tests (those run on the same node, always) using gasnet as well.

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 11, 2023

@hkaiser was able to verify gasnet-udp on an x86 slurm cluster with hpx example/demo programs.

@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)+(=)

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-07-11T02:39:09+00:00
HPX Commit6b6e1e73e997fb
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Datetime2023-03-10T03:27:49.135034-06:002023-07-10T21:44:46.839666-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Clusternamerostamrostam
Envfile

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch-

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-07-11T02:39:09+00:00
HPX Commit6b6e1e73e997fb
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Datetime2023-03-10T03:28:21.991297-06:002023-07-10T21:45:19.949797-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Clusternamerostamrostam
Envfile

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add=(=)(=)
Stream Benchmark - Scale(=)(=)=
Stream Benchmark - Triad(=)(=)(=)
Stream Benchmark - Copy(=)(=)(=)

Info

PropertyBeforeAfter
HPX Datetime2023-03-06T15:59:25+00:002023-07-11T02:39:09+00:00
HPX Commit6b6e1e73e997fb
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Datetime2023-03-10T03:28:29.145749-06:002023-07-10T21:45:27.098713-05:00
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Clusternamerostamrostam
Envfile

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@hkaiser
Copy link
Member

hkaiser commented Oct 8, 2023

@Pansysk75 alright, think I've a solution.
The /usr/local copy issue is from regions where ctest is invoked. HPX isn't installed when ctest is exercised. That's a situation the gasnet script can be changed to accommodate.

Oh you are right, looks like in those tests HPX isn't ever installed, so CMAKE_INSTALL_PREFIX is never set to something that makes sense. I'm sorry I missed that detail, glad you figured it out.

Some tests intentionally don't install HPX to verify it can be run from the build directory as well.

@bonachea
Copy link
Contributor

bonachea commented Oct 8, 2023

Some tests intentionally don't install HPX to verify it can be run from the build directory as well.

FWIW GASNet also supports use from the build directory without an install. The pkgconfig *.pc and Makefile fragment *.mak files are located in $(gasnet_build_dir)/*-conduit/ and provide settings that reference the build directory.

@codacy-production
Copy link

codacy-production bot commented Oct 10, 2023

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
+0.10% 100.00%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (1ec0c0f) 188709 160577 85.09%
Head commit (8e457dd) 189309 (+600) 161279 (+702) 85.19% (+0.10%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#6230) 1 1 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Oct 10, 2023

@hkaiser @Pansysk75 - gasnet now

  1. builds during cmake config
  2. gasnet temporarily installs into build during cmake config
  3. hpx compiles against the temporary gasnet in build
  4. cmake install copies gasnet into CMAKE_INSTALL_PREFIX
  5. the gasnet pkgconfig file is updated dynamically to point to CMAKE_INSTALL_PREFIX

I'm anticipating an LD_LIBRARY_PATH or DYLD_LIBRARY_PATH issue in the ctest portions of the builder scripts; because they don't install hpx into CMAKE_INSTALL_PREFIX.

Update

Found the issue - the builders aren't setting -DCMAKE_C_COMPILER

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Oct 11, 2023

@hkaiser @Pansysk75 quick heads upon the latest attempt.

  1. fails because the GASNet API requires function arguments that are not being used in the parcelport implementation - the solution compiles but registers as a failure b/c of the compile-time warnings.

Jenkins/lsu/clang-12-debug
Jenkins/lsu/clang-12-release
Jenkins/lsu/clang-13-debug
Jenkins/lsu/clang-13-release
Jenkins/lsu/clang-14-debug
Jenkins/lsu/clang-14-release
Jenkins/lsu/clang-15-debug
Jenkins/lsu/clang-15-release

  1. fails due to an uninstalled HPX (hpx doesn't get installed into CMAKE_INSTALL_PREFIX) - think it is a CTest issue - not sure how to fix it.

gcc-10-release
gcc-12-release
gcc-13-debug
gcc-13-release

@Pansysk75
Copy link
Member

@hkaiser @Pansysk75 quick heads upon the latest attempt.

  1. fails because the GASNet API requires function arguments that are not being used in the parcelport implementation - the solution compiles but registers as a failure b/c of the compile-time warnings.

Can this be solved by using the [[maybe_unused]] C++ attribute?

  1. fails due to an uninstalled HPX (hpx doesn't get installed into CMAKE_INSTALL_PREFIX) - think it is a CTest issue - not sure how to fix it.

I think those warnings about uninstalled HPX were there even before your PR. I suspect the failures are due to SLURM issues on Rostam (caused by some recent system changes regarding how jobs are launched). I'll run some tests manually through ssh and I'll get back to you.

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Oct 14, 2023

@hkaiser @Pansysk75 the is the best I've been able to get things so far. I only have the ability to see 3 of the failed builds. The 3 fail for these reasons:

  • clang-apex-debug - the apex requirement for rapidjson failed on the git clone operation
  • clang-13-debug - tests.examples.cancelable_action.cancelable_action_client
  • gcc-10-cuda-11-release - tests.unit.modules.concurrency.non_contiguous_index_queue

@Pansysk75
Copy link
Member

@ct-clmsn I can confidently say that all test failures are pre-existing issues unrelated to the parcelport. Nice!

@hkaiser
Copy link
Member

hkaiser commented Oct 14, 2023

@hkaiser @Pansysk75 the is the best I've been able to get things so far. I only have the ability to see 3 of the failed builds. The 3 fail for these reasons:

  • clang-apex-debug - the apex requirement for rapidjson failed on the git clone operation
  • clang-13-debug - tests.examples.cancelable_action.cancelable_action_client
  • gcc-10-cuda-11-release - tests.unit.modules.concurrency.non_contiguous_index_queue

Nice Chris! All the errors you mentioned are unrelated. This now looks like to be ready to be merged! Thanks!

Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@STEllAR-GROUP STEllAR-GROUP deleted a comment from bors bot Oct 14, 2023
@hkaiser
Copy link
Member

hkaiser commented Oct 14, 2023

bors merge

@bors
Copy link

bors bot commented Oct 14, 2023

👎 Rejected by code reviews

@hkaiser hkaiser merged commit 94b23c5 into STEllAR-GROUP:master Oct 14, 2023
18 checks passed
@ct-clmsn
Copy link
Contributor Author

@hkaiser @Pansysk75 it's been a journey! Learned a lot. Thank you both for being patient with the process!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.