5.0.x FeatureList
Aiming for summer 2019.
-
Remove some or all of the deleted MPI-1 and MPI-2 functionality (e.g., MPI_ATTR_DELETE, MPI_UB/MPI_LB, etc.).
- Buckle up: this is a complicated topic. 🙁
- tl;dr: In v5.0.x, remove the C++ bindings, but leave everything else as not-prototyped-in-mpi.h-etc-by-default. Re-evaluate deleting MPI_ATTR_DELETE (etc.) in the v6.0.x timeframe.
- Prior to Oct 2018:
- All these functions/etc. were marked as "deprecated" (possibly in 2.0.x series? definitely by the 3.0.x series).
- In the v4.0.x series, the C++ bindings are not built by default, and mpi.h, mpif.h, and the mpi and mpi_f08 modules do not have prototypes/declarations for all the MPI-1 deleted functions and globals (although all the symbols are still present in libmpi for ABI compatibility reasons, at the request of our packagers).
- v4.0.x does allow using --enable-mpi1-compatibility to restore the declarations in mpi.h (and friends).
- At the October 2018 face-to-face meeting:
- We talked specifically about this issue, especially in the context of v5.0.x.
- Even before v4.0.0 was released, we started getting complaints about legacy MPI applications failing to build by default with v4.0.0 prereleases (because the apps used the deleted MPI-1/MPI-2 functionality, and the user didn't build Open MPI with --enable-mpi1-compatibility).
- Due to this, it feels like we need to spend time educating the MPI community about moving away from the deleted MPI-1/MPI-2 functionality. This may take a while.
- Remember that distros and auto-packagers (e.g., Spack, EasyBuild, etc.) will almost certainly use --enable-mpi1-compatibility, so some/many users may not even feel the pain of the functionality being disabled by default yet.
- As such, we probably shouldn't actually ditch the deleted MPI-1/MPI-2 functionality in v5.0.x.
- Instead, let's focus on educating the MPI community for now, and re-evaluate whether we can actually ditch the deleted MPI-1/MPI-2 functionality in v6.0.x.
- That being said, we all seem to agree that removing the C++ bindings in v5.0.x is not a problem.
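As one possible aid for the "educate the community" step, here is a minimal sketch of a scanner that flags legacy symbols in C source and suggests the newer call. The replacement table follows the "Removed Interfaces" chapter of the MPI-3.0 standard, plus the deprecated MPI_Attr_delete named above. The script itself, including the names `REPLACEMENTS` and `find_legacy_symbols`, is hypothetical and not part of Open MPI. It is a plain text match, so hits inside comments and string literals are reported too.

```python
import re

# Legacy C symbol -> suggested replacement.
# Entries other than MPI_Attr_delete are the MPI-1 interfaces removed in MPI-3.0.
REPLACEMENTS = {
    "MPI_Address": "MPI_Get_address",
    "MPI_Errhandler_create": "MPI_Comm_create_errhandler",
    "MPI_Errhandler_get": "MPI_Comm_get_errhandler",
    "MPI_Errhandler_set": "MPI_Comm_set_errhandler",
    "MPI_Type_extent": "MPI_Type_get_extent",
    "MPI_Type_hindexed": "MPI_Type_create_hindexed",
    "MPI_Type_hvector": "MPI_Type_create_hvector",
    "MPI_Type_lb": "MPI_Type_get_extent",
    "MPI_Type_struct": "MPI_Type_create_struct",
    "MPI_Type_ub": "MPI_Type_get_extent",
    "MPI_LB": "MPI_Type_create_resized",
    "MPI_UB": "MPI_Type_create_resized",
    # Deprecated since MPI-2.0 (the example named in the text above).
    "MPI_Attr_delete": "MPI_Comm_delete_attr",
}

# Whole-identifier match only, so MPI_Type_create_struct is not flagged.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b")


def find_legacy_symbols(source):
    """Return (line_number, symbol, replacement) for each hit in C source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            symbol = match.group(1)
            hits.append((lineno, symbol, REPLACEMENTS[symbol]))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_mpi1.py file1.c file2.c ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for lineno, symbol, replacement in find_legacy_symbols(handle.read()):
                print(f"{path}:{lineno}: {symbol} -> consider {replacement}")
```

Note that the suggestions are not always drop-in: for example, MPI_UB/MPI_LB markers inside a struct type need to be reworked into an MPI_Type_create_resized call, and MPI_Type_get_extent returns both the lower bound and the extent.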
-
Delete the openib BTL
- It's effectively unmaintained
- All networks supported by the openib BTL (IB, RoCE, iWARP) are now supported by Libfabric and/or UCX.
- There was talk of doing this for v4.0.0:
- Hence, it seems that the future of RoCE and iWARP is either or both of Libfabric and UCX.
- ...but neither of those will be 100% ready for Open MPI v4.0.0.
- It didn't seem to make sense to make iWARP users move from openib to iwarp in v4.0.0 (and potentially something similar for non-Mellanox RoCE users), and then move them again to something else in v5.0.0 ("How to annoy your users, 101").
- The lowest cost solution for v4.0.0 was to disable IB support by default in openib (i.e., only iWARP and RoCE will use it by default), and punt the ultimate decision about potentially deleting the openib BTL to v5.0.0. Note that v4.0.0 also has a "back-door" MCA parameter to enable IB devices, for the "just in case" scenarios (where users, for whatever reason, don't want to upgrade to UCX).
- With all that, we need to investigate and see what the Right course of action is for v5.0.0 (i.e., re-evaluate where Libfabric and/or UCX are w.r.t. RoCE support for non-Mellanox devices and iWARP support), and how to plumb that support into Open MPI / expose it to the user.
-
(UTK) Better multithreading. - George
- In the OB1 PML, with normal OMP parallel sections: improved injection and extraction rates.
- Implications for other PMLs: very OB1-specific. Maybe a little bit in progress.
-
(UTK) ULFM support via new MPIX functions. Most is in MPIX, but some in PML.
- Depends on PMIx v3.x
-
Want Nathan's fix for Vader and other BTLs to allow us to have SOMETHING for OSC_RDMA for one-sided + MT runs.
- Something similar is coming into BTL-TCP.
- If osc/rdma supports all possible scenarios (e.g., all BTLs support the RDMA methods osc/rdma needs), this should allow us to remove osc/pt2pt (i.e., 100% migrated to osc/rdma). It would be good to have an osc/pt2pt alias in case anyone is scripting their mpirun commands to select --mca osc pt2pt.
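To make the alias idea concrete, here is a minimal sketch of how a retired component name could be rewritten in an MCA selection string. It assumes only the usual selection syntax (a comma-separated list, optionally prefixed with ^ to exclude). The alias table and the `resolve_selection` helper are hypothetical and do not describe how Open MPI implements or would implement this.

```python
# Hypothetical alias table: retired component name -> its replacement.
ALIASES = {"pt2pt": "rdma"}


def resolve_selection(value, aliases=ALIASES):
    """Rewrite a comma-separated component list, mapping retired names.

    In an include list ("pt2pt,sm"), a retired name is replaced by its alias.
    In an exclude list ("^pt2pt,sm"), a retired name is dropped instead:
    excluding a component that no longer exists should not exclude its
    replacement.
    """
    negate = value.startswith("^")
    names = value[1:] if negate else value
    resolved = []
    for name in names.split(","):
        name = name.strip()
        if not name:
            continue
        if name in aliases:
            if negate:
                continue  # nothing left to exclude
            name = aliases[name]
        if name not in resolved:  # avoid duplicates such as "rdma,rdma"
            resolved.append(name)
    if negate:
        return "^" + ",".join(resolved) if resolved else ""
    return ",".join(resolved)
```

With this sketch, a script that still passes --mca osc pt2pt would end up selecting rdma, while ^pt2pt would become an empty (unconstrained) selection. Whether an exclude list should be handled this way is a design choice, not something decided in these notes.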
-
Change defaults for embedding libevent / hwloc (see this issue) - HELP NEEDED see PR 5395
-
Simplified network selection (--net) CLI option
- Initial proposal: see point 20 in https://github.com/open-mpi/ompi/wiki/Meeting-2016-02
- Discussion: search for -net in the Feb meeting minutes: https://github.com/open-mpi/ompi/wiki/Meeting-2016-02-Minutes
- Further discussion: search for -net in the Aug meeting minutes: https://github.com/open-mpi/ompi/wiki/Meeting-Minutes-2016-08
-
Displaying what networks were/will be actually used
- See "MPI_Init Connectivity Map (IBM)" in https://github.com/open-mpi/ompi/wiki/Meeting-Minutes-2018-03
-
OMPIO stuff (the last 2 missing things -- woo hoo!)
- External32 support
- Support for file atomicity
-
Delete the MPIR interface (MAYBE)
- This assumes PMIx debugger interfaces are done / stable.
- We issue a deprecation warning in v4.0.0 for any tool that uses the MPIR interface.
- This topic is under review. We have gotten lots of pushback from some important Open MPI users.
-
Use PMIx directly - replace current PMIx component with something more like the hwloc component.
-
Per https://github.com/open-mpi/ompi/pull/6315, we should finish the "reachable" MCA framework
- Then we don't need to worry about default values for if_include/if_exclude values for OOB/TCP (which may be replaced with PRRTE anyway) and BTL/TCP.
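For context on what those defaults control, here is a rough sketch of include/exclude filtering over a list of interfaces. It assumes the documented behavior of the TCP if_include/if_exclude parameters (comma-separated interface names or CIDR blocks, with the two lists being mutually exclusive). The `select_interfaces` helper and the sample interface data are hypothetical, and this is not the logic of the "reachable" framework.

```python
import ipaddress


def _matches(entry, name, address):
    """True if a list entry (interface name or CIDR block) matches an interface."""
    if "/" in entry:
        network = ipaddress.ip_network(entry, strict=False)
        return ipaddress.ip_address(address) in network
    return entry == name


def select_interfaces(interfaces, if_include="", if_exclude=""):
    """Filter (name, address) pairs by an include list or an exclude list."""
    if if_include and if_exclude:
        raise ValueError("if_include and if_exclude are mutually exclusive")
    include = [e.strip() for e in if_include.split(",") if e.strip()]
    exclude = [e.strip() for e in if_exclude.split(",") if e.strip()]
    selected = []
    for name, address in interfaces:
        # With an include list, keep only interfaces that match some entry.
        if include and not any(_matches(e, name, address) for e in include):
            continue
        # With an exclude list, drop interfaces that match any entry.
        if any(_matches(e, name, address) for e in exclude):
            continue
        selected.append((name, address))
    return selected


# Made-up interfaces, for illustration only.
SAMPLE = [("lo", "127.0.0.1"), ("eth0", "10.0.0.5"), ("ib0", "192.168.1.7")]
```

For example, `select_interfaces(SAMPLE, if_exclude="lo")` keeps eth0 and ib0, and `select_interfaces(SAMPLE, if_include="10.0.0.0/8")` keeps only eth0.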