Update the AWS_OFI_NCCL version and add in the MPI HWLOC install #2651

Merged
willgleich merged 6 commits into dev from will/update_efa_ofi_nccl
Oct 19, 2023

willgleich (Member)

What does this PR do?

This PR updates the AWS OFI NCCL version and adds the HWLOC package that is required for it to build properly.
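
For readers unfamiliar with this kind of change, here is a minimal sketch of what installing HWLOC through apt-get and then building aws-ofi-nccl could look like. It is not the diff from this PR. The helper names (`run`, `install_hwloc`, `build_aws_ofi_nccl`), the install prefix, and the EFA, CUDA, and Open MPI paths are all assumptions chosen for illustration, and the release tag is left as an argument because this page does not state the version. The helpers only print commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch only: paths, prefix, and helper names are assumptions, not taken from the PR diff.

# Print each command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Install the hwloc development package via apt-get
# (libhwloc-dev is the Debian/Ubuntu package name).
install_hwloc() {
  run apt-get update
  run apt-get install -y --no-install-recommends libhwloc-dev
}

# Build aws-ofi-nccl from the release tag given as $1.
# The tag is a caller-supplied value; this page does not name the version.
build_aws_ofi_nccl() {
  local tag="$1"
  run git clone --depth 1 --branch "$tag" https://github.com/aws/aws-ofi-nccl.git
  run cd aws-ofi-nccl
  run ./autogen.sh
  # Assumed install locations for libfabric (EFA), CUDA, and Open MPI.
  run ./configure --prefix=/opt/aws-ofi-nccl \
    --with-libfabric=/opt/amazon/efa \
    --with-cuda=/usr/local/cuda \
    --with-mpi=/opt/amazon/openmpi
  run make
  run make install
}
```

Calling `install_hwloc` before `build_aws_ofi_nccl <tag>` mirrors the ordering implied by the commits in this PR, where HWLOC is added to the apt-get install step so that it is present before the plugin is built.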

What issue(s) does this change relate to?

GRT-2485

@willgleich requested a review from a team as a code owner October 17, 2023 18:45
@willgleich merged commit 174e238 into dev Oct 19, 2023
@willgleich deleted the will/update_efa_ofi_nccl branch October 19, 2023 16:02
b-chu added a commit that referenced this pull request Oct 27, 2023
* Remove apex test and clean up fsdp warnings  (#2616)

* patch default (#2628)

* Add logging for generate callbacks (#2630)

* Update generate.py

* add missing imports

* Expose input_names and output_names when exporting to ONNX (#2601)

* Expose input_names and output_names when exporting to ONNX

* assert sample_input type for pyright

* fix mocks

---------

Co-authored-by: Mihir Patel <[email protected]>

* Bump version to 0.16.4 (#2627)

* bump version

* filter warning

* remove slack failure

* composer

* ckdn

* commit change

* commit change

* commit change

* commit change

* rename

* revert

* cleanup

* move around tests

* log

* fix slack

* clean test

* composer

* rearrange

* remove logs

* skip

* remove log

---------

Co-authored-by: Chuck Tang <[email protected]>

* many logs

* typos

* logs

* filter

* logs

* fix logs

* monkeypatch sharded tensor

* Add partial state dict functionality for FSDP (#2637)

* Use pytorch chunking

commit-id:e4c9b78f

* Add partial state dict functionality for FSDP

commit-id:2a2cae33

* Update monai requirement from <1.3,>=0.9.1 to >=0.9.1,<1.4 (#2643)

Updates the requirements on [monai](https://github.com/Project-MONAI/MONAI) to permit the latest version.
- [Release notes](https://github.com/Project-MONAI/MONAI/releases)
- [Changelog](https://github.com/Project-MONAI/MONAI/blob/dev/CHANGELOG.md)
- [Commits](Project-MONAI/MONAI@0.9.1...1.3.0)

---
updated-dependencies:
- dependency-name: monai
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump pytest-codeblocks from 0.16.1 to 0.17.0 (#2645)

Bumps [pytest-codeblocks](https://github.com/nschloe/pytest-codeblocks) from 0.16.1 to 0.17.0.
- [Release notes](https://github.com/nschloe/pytest-codeblocks/releases)
- [Commits](nschloe/pytest-codeblocks@v0.16.1...v0.17.0)

---
updated-dependencies:
- dependency-name: pytest-codeblocks
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* remove flush on close (#2646)

* update latest (#2650)

* HSDP Support (#2648)

* add hsdp

* add tuple support

* mod wide

* update

* set default

* disable error validation

* hsdp

* gate import

* Log profile averages (#2647)

Co-authored-by: Mihir Patel <[email protected]>

* bump

* daily key (#2655)

* Add automatic remote uploader downloader for composer profiler (#2653)

* Update the AWS_OFI_NCCL version and add in the MPI HWLOC install (#2651)

* Update the AWS_OFI_NCCL version and add in the MPI HWLOC install

* Move the HWLOC down to the appropriate stage

* Move the HWLOC to the apt-get install

* Remove extra debug arg

---------

Co-authored-by: Mihir Patel <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Charles Tang <[email protected]>
Co-authored-by: Mihir Patel <[email protected]>
Co-authored-by: Anna <[email protected]>
Co-authored-by: Antoine Broyelle <[email protected]>
Co-authored-by: Chuck Tang <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: willgleich <[email protected]>