From 70004f95700557e39b5fc0a2ca98f6a98700a70a Mon Sep 17 00:00:00 2001
From: Xin An <34663977+xinan1911@users.noreply.github.com>
Date: Tue, 4 Jun 2024 09:58:45 +0200
Subject: [PATCH 1/4] Update known issue about lmod hook in host-injection
---
docs/known_issues/eessi-2023.06.md | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/docs/known_issues/eessi-2023.06.md b/docs/known_issues/eessi-2023.06.md
index a4908b2d1..b3c07382b 100644
--- a/docs/known_issues/eessi-2023.06.md
+++ b/docs/known_issues/eessi-2023.06.md
@@ -7,7 +7,7 @@
This is an error that occurs with OpenMPI after updating to OFED 23.10.
-Their is an upstream issue on this problem opened with EasyBuild.
+There is an upstream issue on this problem opened with EasyBuild.
See: https://github.com/easybuilders/easybuild-easyconfigs/issues/20233
Workarounds
@@ -26,3 +26,17 @@ export OMPI_MCA_pml='ucx'
export OMPI_MCA_mtl='^ofi'
```
+
+### `Bug in EESSI initialization and priority mechanisms: site OpenMPI or UCX not loaded`
+
+
+This error may occur when bugs resolving or site-specific tuning is needed for OpenMPI or UCX.
+
+There is an issue on this problem opened with EESSI software layer repository.
+See: https://github.com/EESSI/software-layer/issues/456
+
+Workarounds
+
+The workaround is to specify site properties and allow defining lmod hooks in host injections (see https://github.com/EESSI/software-layer/pull/525).
+
+
From afb7c245b9e380a61874c1cbf26b0b046d92627a Mon Sep 17 00:00:00 2001
From: Xin An <34663977+xinan1911@users.noreply.github.com>
Date: Tue, 18 Jun 2024 13:59:33 +0200
Subject: [PATCH 2/4] Update mkdocs.yml for structure and remove version pages of known issues
---
mkdocs.yml | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/mkdocs.yml b/mkdocs.yml
index 086edb9fa..cb2b3f2c3 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -22,19 +22,24 @@ nav:
- Compatibility layer: compatibility_layer.md
- Software layer: software_layer.md
- Supported CPU targets: software_layer/cpu_targets.md
+ - Available software and repositories:
+ - Software: available_software/overview.md
+ - Repositories:
+ - Production: repositories/software.eessi.io.md
+ - RISC-V: repositories/riscv.eessi.io.md
+ - Pilot: repositories/pilot.md
- Installation:
- Is EESSI already installed?: getting_access/is_eessi_accessible.md
- Native: getting_access/native_installation.md
- Container: getting_access/eessi_container.md
+ - Windows and macOS:
+ - Windows with WSL: getting_access/eessi_wsl.md
+ - macOS with Lima: getting_access/eessi_limactl.md
- Basic usage:
- Set up environment: using_eessi/setting_up_environment.md
- Basic commands: using_eessi/basic_commands.md
- Demos: using_eessi/eessi_demos.md
- Advanced usage:
- - Repositories:
- - Production: repositories/software.eessi.io.md
- - RISC-V: repositories/riscv.eessi.io.md
- - Pilot: repositories/pilot.md
- Setting up your Stratum: filesystem_layer/stratum1.md
- Building software with EESSI: using_eessi/building_on_eessi.md
- Test suite:
@@ -46,6 +51,8 @@ nav:
- Release notes: test-suite/release-notes.md
- Accelerators support:
- GPUs: gpu.md
+ - Known issues and workarounds:
+ - v2023.06: known_issues/eessi-2023.06.md
- Adding software to EESSI:
- Overview: adding_software/overview.md
- For contributors:
@@ -57,10 +64,6 @@ nav:
- Building software: adding_software/building_software.md
- Deploying software: adding_software/deploying_software.md
- Build nodes: software_layer/build_nodes.md
- - Known issues:
- - v2023.06: known_issues/eessi-2023.06.md
- - v2022.02: []
- - pilot: []
- Community and support:
- Getting support: support.md
- Meetings: meetings.md
From eb02210c8bd63a1384374636a8987b32fab9394a Mon Sep 17 00:00:00 2001
From: Xin An <34663977+xinan1911@users.noreply.github.com>
Date: Tue, 18 Jun 2024 14:16:36 +0200
Subject: [PATCH 3/4] Adding reference to Lmod hooks
---
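
Reviewer note (illustrative, not part of the patch): the documented workaround exports three OMPI_MCA_* variables for the whole shell session. As a minimal sketch, assuming a POSIX-compatible shell, the same values can be scoped to a single command with `env`. The function name `run_with_ompi_workaround` is hypothetical and is not provided by EESSI, Lmod, or OpenMPI.

```shell
# Hypothetical helper: run one command with the workaround variables set,
# leaving the calling shell's environment untouched.
run_with_ompi_workaround() {
    env OMPI_MCA_btl='^uct,ofi' \
        OMPI_MCA_pml='ucx' \
        OMPI_MCA_mtl='^ofi' \
        "$@"
}
```

A call might look like `run_with_ompi_workaround mpirun -np 4 ./my_app`, where `./my_app` is a placeholder. This scopes the settings per command, whereas the Lmod hook approach applies them whenever the module is loaded.
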
docs/known_issues/eessi-2023.06.md | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/docs/known_issues/eessi-2023.06.md b/docs/known_issues/eessi-2023.06.md
index b3c07382b..fdc595331 100644
--- a/docs/known_issues/eessi-2023.06.md
+++ b/docs/known_issues/eessi-2023.06.md
@@ -25,18 +25,7 @@ export OMPI_MCA_btl='^uct,ofi'
export OMPI_MCA_pml='ucx'
export OMPI_MCA_mtl='^ofi'
```
-
-
-### `Bug in EESSI initialization and priority mechanisms: site OpenMPI or UCX not loaded`
-
-
-This error may occur when bugs resolving or site-specific tuning is needed for OpenMPI or UCX.
-
-There is an issue on this problem opened with EESSI software layer repository.
-See: https://github.com/EESSI/software-layer/issues/456
-
-Workarounds
-The workaround is to specify site properties and allow defining lmod hooks in host injections (see https://github.com/EESSI/software-layer/pull/525).
+You may also set these additional environment variables via site-specific Lmod hooks. For more information about how to write and implement site-specific Lmod hooks, please check [EESSI Site Specific Configuration LMOD Hooks](site_specific_config/lmod_hooks.md)
From 0b6c2e8fa441c2ba3d3f2cb273870663e3c75841 Mon Sep 17 00:00:00 2001
From: Xin An <34663977+xinan1911@users.noreply.github.com>
Date: Wed, 19 Jun 2024 13:58:49 +0200
Subject: [PATCH 4/4] Update docs/known_issues/eessi-2023.06.md
Co-authored-by: Caspar van Leeuwen <33718780+casparvl@users.noreply.github.com>
---
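
Reviewer note (illustrative, not part of the patch): the Lmod hook added in this patch calls `setenv` for three OMPI_MCA_* variables when an OpenMPI module is loaded. As a rough sketch, assuming a POSIX-compatible shell with `printenv` available, a site could check after loading the module that the hook took effect. The function name `check_ompi_mca_env` is hypothetical. The expected values are copied from the hook in this patch.

```shell
# Hypothetical helper: compare the three OpenMPI MCA variables against the
# values set by the hook. Returns 0 if all match, 1 otherwise.
check_ompi_mca_env() {
    local status=0
    local name expected actual
    # Each here-document line holds: <variable name> <expected value>
    while read -r name expected; do
        # printenv prints nothing and fails when the variable is unset
        actual="$(printenv "$name" || true)"
        if [ "$actual" = "$expected" ]; then
            echo "ok: $name=$actual"
        else
            echo "MISMATCH: $name='$actual' (expected '$expected')"
            status=1
        fi
    done <<'EOF'
OMPI_MCA_btl ^uct,ofi
OMPI_MCA_pml ucx
OMPI_MCA_mtl ^ofi
EOF
    return "$status"
}
```

Running `check_ompi_mca_env` right after `module load` of an OpenMPI module should report a line per variable. A mismatch suggests the hook file was not picked up or the module name did not match `OpenMPI`.
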
docs/known_issues/eessi-2023.06.md | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/docs/known_issues/eessi-2023.06.md b/docs/known_issues/eessi-2023.06.md
index fdc595331..41204425d 100644
--- a/docs/known_issues/eessi-2023.06.md
+++ b/docs/known_issues/eessi-2023.06.md
@@ -26,6 +26,30 @@ export OMPI_MCA_pml='ucx'
export OMPI_MCA_mtl='^ofi'
```
-You may also set these additional environment variables via site-specific Lmod hooks. For more information about how to write and implement site-specific Lmod hooks, please check [EESSI Site Specific Configuration LMOD Hooks](site_specific_config/lmod_hooks.md)
+You may also set these additional environment variables via site-specific Lmod hooks:
+```
+require("strict")
+local hook=require("Hook")
+
+-- Fix Failed to modify UD QP to INIT on mlx5_0: Operation not permitted
+function fix_ud_qp_init_openmpi(t)
+ local simpleName = string.match(t.modFullName, "(.-)/")
+ if simpleName == 'OpenMPI' then
+ setenv('OMPI_MCA_btl', '^uct,ofi')
+ setenv('OMPI_MCA_pml', 'ucx')
+ setenv('OMPI_MCA_mtl', '^ofi')
+ end
+end
+
+local function combined_load_hook(t)
+ if eessi_load_hook ~= nil then
+ eessi_load_hook(t)
+ end
+ fix_ud_qp_init_openmpi(t)
+end
+
+hook.register("load", combined_load_hook)
+```
+ For more information about how to write and implement site-specific Lmod hooks, please check [EESSI Site Specific Configuration LMOD Hooks](site_specific_config/lmod_hooks.md)