Deepspeed Support #188

Merged: 21 commits, Jul 8, 2024

@tylertitsworth tylertitsworth commented Jun 28, 2024

Description

Adds DeepSpeed support with oneCCL integration to the IPEX MultiNode container, in response to a customer request.

Related Issue

MLOPS-2007

Changes Made

  • Display pip requirements correctly in the docs matrix
  • Remove unnecessary requirements files from the container image
  • Add apt requirements for DeepSpeed
  • Build DeepSpeed with ops after installing the multinode requirements
  • Add pip requirements for DeepSpeed
  • Update the ipex-resnet50 test to include a DeepSpeed mode
  • Add a DeepSpeed integration test
  • Add a DeepSpeed report test

  • The code follows the project's coding standards.
  • No Intel Internal IP is present within the changes.
  • The documentation has been updated to reflect any changes in functionality.
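
For readers who want to reproduce something like the DeepSpeed report check by hand, here is a minimal sketch. It assumes only that DeepSpeed's `ds_report` command is on the PATH inside the container. The helper name `deepspeed_report` and the timeout value are illustrative and are not taken from this PR, whose actual test may be written differently.

```python
import shutil
import subprocess
from typing import Optional


def deepspeed_report(timeout: int = 120) -> Optional[str]:
    """Return the combined output of `ds_report`, or None if it cannot be run.

    `ds_report` is the environment report command installed with DeepSpeed.
    The function name and timeout are hypothetical choices for this sketch.
    """
    exe = shutil.which("ds_report")
    if exe is None:
        # DeepSpeed (or its console script) is not installed here.
        return None
    try:
        result = subprocess.run(
            [exe],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,  # inspect the output instead of raising on non-zero exit
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout + result.stderr


if __name__ == "__main__":
    report = deepspeed_report()
    print(report if report is not None else "ds_report not found on PATH")
```

Run inside the built image, the printed report lists which DeepSpeed ops are installed or compatible, which is a quick way to confirm that the "build with ops" step took effect.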

Validation

All done locally.

  • I have tested any changes in container groups locally with test_runner.py with all existing tests passing, and I have added new tests where applicable.

@tylertitsworth tylertitsworth added the WIP Work in Progress label Jun 28, 2024
@tylertitsworth tylertitsworth self-assigned this Jun 28, 2024
tylertitsworth and others added 2 commits June 28, 2024 15:37
Signed-off-by: tylertitsworth <[email protected]>
@tylertitsworth tylertitsworth force-pushed the tylertitsworth/deepspeed branch from 466a2b1 to 5fcb25c Compare June 28, 2024 22:37

github-actions bot commented Jun 28, 2024

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 1 package(s) with unknown licenses.
See the Details below.

License Issues

pytorch/multinode/requirements.txt

| Package | Version | License | Issue Type |
|---|---|---|---|
| oneccl-devel | >= 2021.13.0 | Null | Unknown License |

OpenSSF Scorecard

| Package | Version | Score | Details |
|---|---|---|---|
| pip/mpi4py | >= 3.1.0 | 🟢 6.2 | Details |
| pip/oneccl-devel | >= 2021.13.0 | Unknown | Unknown |

Details for pip/mpi4py:

| Check | Score | Reason |
|---|---|---|
| Maintained | 🟢 10 | 30 commit(s) and 12 issue activity found in the last 90 days -- score normalized to 10 |
| Code-Review | ⚠️ 0 | Found 0/14 approved changesets -- score normalized to 0 |
| CII-Best-Practices | ⚠️ 0 | no effort to earn an OpenSSF best practices badge detected |
| License | 🟢 10 | license file detected |
| Signed-Releases | ⚠️ 0 | Project has not signed or included provenance with any releases. |
| Branch-Protection | ⚠️ -1 | internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration |
| Dangerous-Workflow | 🟢 10 | no dangerous workflow patterns detected |
| Binary-Artifacts | 🟢 10 | no binaries found in the repo |
| Token-Permissions | 🟢 10 | GitHub workflow tokens follow principle of least privilege |
| Pinned-Dependencies | ⚠️ 0 | dependency not pinned by hash detected -- score normalized to 0 |
| Fuzzing | ⚠️ 0 | project is not fuzzed |
| Security-Policy | ⚠️ 0 | security policy file not detected |
| Vulnerabilities | 🟢 10 | 0 existing vulnerabilities detected |
| Packaging | 🟢 10 | packaging workflow detected |
| SAST | 🟢 10 | SAST tool is run on all commits |

Scanned Manifest Files

pytorch/multinode/requirements.txt
  • mpi4py@>= 3.1.0
  • oneccl-devel@>= 2021.13.0
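
The two minimum versions in the scanned manifest can be checked against an installed environment with a short script. This is a rough sketch using only the standard library. The helper names (`version_tuple`, `at_least`, `check_minimums`) are hypothetical, and the comparison looks only at leading numeric components, so it is a simplification of full version-specifier semantics.

```python
import re
from importlib import metadata

# Minimums copied from pytorch/multinode/requirements.txt as listed above.
REQUIREMENTS = {"mpi4py": "3.1.0", "oneccl-devel": "2021.13.0"}


def version_tuple(version: str) -> tuple:
    """Keep the leading numeric part of each dot-separated component."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(installed: str, minimum: str) -> bool:
    """True if `installed` is >= `minimum`, padding shorter versions with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_minimums(requirements: dict) -> dict:
    """Map each distribution name to 'ok', 'missing', or a 'too old' message."""
    results = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            results[name] = "missing"
            continue
        if at_least(installed, minimum):
            results[name] = "ok"
        else:
            results[name] = f"too old ({installed} < {minimum})"
    return results


if __name__ == "__main__":
    for package, status in check_minimums(REQUIREMENTS).items():
        print(f"{package}: {status}")
```

For anything beyond a quick sanity check, a proper specifier parser is the safer choice, since pre-release and local version tags are ignored here.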

@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

tylertitsworth added 3 commits July 1, 2024 08:50
tylertitsworth added 3 commits July 3, 2024 14:15
@tylertitsworth tylertitsworth added Review and removed WIP Work in Progress labels Jul 3, 2024
@tylertitsworth tylertitsworth marked this pull request as ready for review July 3, 2024 21:28
Tyler Titsworth added 4 commits July 3, 2024 14:31
Signed-off-by: Tyler Titsworth <[email protected]>
Signed-off-by: tylertitsworth <[email protected]>
Signed-off-by: tylertitsworth <[email protected]>
@tylertitsworth tylertitsworth force-pushed the tylertitsworth/deepspeed branch from 502943f to 80e4f38 Compare July 4, 2024 04:37
tylertitsworth added 2 commits July 8, 2024 12:10
Signed-off-by: tylertitsworth <[email protected]>
Signed-off-by: tylertitsworth <[email protected]>
@tylertitsworth tylertitsworth force-pushed the tylertitsworth/deepspeed branch from be3a901 to acfeaae Compare July 8, 2024 19:14

@sramakintel sramakintel left a comment

looks good.

@tylertitsworth tylertitsworth merged commit 431abb6 into main Jul 8, 2024
38 of 40 checks passed
@tylertitsworth tylertitsworth deleted the tylertitsworth/deepspeed branch July 8, 2024 22:58
dmsuehir pushed a commit that referenced this pull request Jul 12, 2024
Signed-off-by: tylertitsworth <[email protected]>
Signed-off-by: Tyler Titsworth <[email protected]>
Co-authored-by: Sharvil Shah <[email protected]>
Signed-off-by: Dina Suehiro Jones <[email protected]>
3 participants