Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sagemaker): add support uncompressed model #30949

Conversation

WinterYukky
Copy link
Contributor

@WinterYukky WinterYukky commented Jul 25, 2024

Issue # (if applicable)

None.

Reason for this change

Currently, sagemaker.Model only support ModelDataUrl. ModelDataUrl must be gzip the model artifact.
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sagemaker-model-containerdefinition.html#cfn-sagemaker-model-containerdefinition-modeldataurl

LLM, which is popular these days, is large and takes longer to compress. For example, the 70B parameter model could take 2 hours or more to compress. SageMaker has a way to use the uncompressed model, and this PR is needed to support the uncompressed model.

The uncompressed model is also recommended in the official documentation.

When deploying ML models, one option is to archive and compress the model artifacts into a tar.gz format. Although this method works well for small models, compressing a large model artifact with hundreds of billions of parameters and then decompressing it on an endpoint can take a significant amount of time. For large model inference, we recommend that you deploy uncompressed ML model.
https://docs.aws.amazon.com/sagemaker/latest/dg/large-model-inference-uncompressed.html

Description of changes

Users can now specify the use of the uncompressed model by taking the model's compression type and S3 data type as options. When using an uncompressed model, use ModelDataSource instead of ModelDataUrl. Since ModelDataSource cannot be used in some workloads, the compression model uses ModelDataUrl as before.

Note

Currently you cannot use ModelDataSource in conjunction with SageMaker batch transform, SageMaker serverless endpoints, SageMaker multi-model endpoints, and SageMaker Marketplace.
https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ContainerDefinition.html#sagemaker-Type-ContainerDefinition-ModelDataSource

Also, asset bundling can now be used as a by-product of being able to use something other than tar.gz due to support for uncompressed models. Until now, no model artifacts with extensions other than tar.gz were permitted, but bundled assets are now allowed if they are not directories.

Since it is not possible to upload a directory or extract a zip with s3_asset.asset alone, the directory cannot continue to be used as an asset. (If there is a way to support directories, please let me know)

Description of how you validated changes

I've added an integ test with assertions that invoke the model endpoint and verify the response.

Checklist

Other

sagemaker-alpha module document is not aligned contributing guide recommendation, so also I fix the it.
https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md#recommendations


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added the admired-contributor [Pilot] contributed between 13-24 PRs to the CDK label Jul 25, 2024
@aws-cdk-automation aws-cdk-automation requested a review from a team July 25, 2024 15:43
@github-actions github-actions bot added the p2 label Jul 25, 2024
Comment on lines -73 to -75
if (!path.toLowerCase().endsWith(ARTIFACT_EXTENSION)) {
throw new Error(`Asset must be a gzipped tar file with extension ${ARTIFACT_EXTENSION} (${this.path})`);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The asset's availability determination was moved after binding it as an asset.

Comment on lines +172 to +174
if (!this.asset.isFile) {
throw new Error(`Asset must be a file, if you want to use directory you can use 'ModelData.fromBucket()' with the 's3DataType' option to 'S3DataType.S3_PREFIX' and 'compressionType' option to 'CompressionType.NONE' (${this.path})`);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bundled assets are now allowed if they are not directories.

Comment on lines +179 to +182
compressionType: this.asset.assetPath.toLowerCase().endsWith(COMPRESSED_ARTIFACT_EXTENSION)
? CompressionType.GZIP
: CompressionType.NONE,
s3DataType: S3DataType.S3_OBJECT,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uncompressed single files are also supported

@aws-cdk-automation aws-cdk-automation added the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Jul 25, 2024
This was referenced Jul 29, 2024
@Leo10Gama Leo10Gama self-assigned this Oct 25, 2024
Copy link
Member

@Leo10Gama Leo10Gama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! This is a great start for this feature, but I have a few concerns with the level of abstraction we're providing, specifically with the two enums for CompressionType and S3DataType. We should generally avoid situations where we tell users "if you want to use this, you also have to set this variable that's somewhere else", since it leads to a lesser user experience. If there's some way to refactor this as one enum, I think that would be ideal.

Comment on lines +118 to +119
* When you choose `CompressionType.GZIP` and `S3DataType.S3_OBJECT` then use `ModelDataUrl` property.
* Otherwise, use `ModelDataSource` property.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this, I interpret it as telling the user to use the ModelDataUrl or ModelDataSource properties, but looking through the rest of the changes, it seems like this is something handled internally. Is there a reason the user needs to know about what's going on under the hood?

Comment on lines +22 to +26
* If you choose `CompressionType.NONE` and choose `S3DataType.S3_PREFIX` as the value of `s3DataType`,
* S3 URI identifies a key name prefix, under which all objects represents the uncompressed ML model to deploy.
*
* If you choose `CompressionType.NONE`, then SageMaker will follow rules below when creating model data files
* under `/opt/ml/model` directory for use by your inference code:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CompressionType and S3DataType enums seem to be used a lot in conjunction with each other, and when their behaviour changes this drastically between combinations, it makes me wonder if there is any way we can combine the two? Right now, for instance, there doesn't seem to be anything documented about what will happen if I combine CompressionType.GZIP and S3DataType.S3_PREFIX, if that's a valid combination.

Comment on lines +32 to +41
* - Do not use any of the following as file names or directory names:
* - An empty or blank string
* - A string which contains null bytes
* - A string longer than 255 bytes
* - A single dot (.)
* - A double dot (..)
* - Ambiguous file names will result in model deployment failure.
* For example, if your uncompressed ML model consists of two S3 objects `s3://mybucket/model/weights` and `s3://mybucket/model/weights/part1`
* and you specify `s3://mybucket/model/` as the value of S3 URI and `S3DataType.S3_PREFIX` as the value of `s3DataType`,
* then it will result in name clash between `/opt/ml/model/weights` (a regular file) and `/opt/ml/model/weights/` (a directory).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring feels unnecessarily lengthy, and the mention of what file/directory names doesn't seem relevant, at least for the enum. Is there anyway we can add a @see tag and link to the documentation instead for all this extra information?

Comment on lines +368 to +375
modelDataSource: useModelDataSource ? {
s3DataSource: {
s3Uri: modelDataConfig.uri,
s3DataType: modelDataConfig.s3DataType!,
compressionType: modelDataConfig.compressionType!,
},
} : undefined,
modelDataUrl: !useModelDataSource ? modelDataConfig?.uri : undefined,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible for both of these values to be set at the same time? And if so, what sort of behaviour would we expect?

@aws-cdk-automation aws-cdk-automation removed the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Oct 31, 2024
@aws-cdk-automation
Copy link
Collaborator

This PR has been in the CHANGES REQUESTED state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.

@mergify mergify bot dismissed Leo10Gama’s stale review November 22, 2024 16:06

Pull request has been modified.

Copy link

codecov bot commented Nov 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.77%. Comparing base (bf026bd) to head (351e729).
Report is 175 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #30949   +/-   ##
=======================================
  Coverage   78.77%   78.77%           
=======================================
  Files         108      108           
  Lines        7163     7163           
  Branches     1320     1320           
=======================================
  Hits         5643     5643           
  Misses       1335     1335           
  Partials      185      185           
Flag Coverage Δ
suite.unit 78.77% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
packages/aws-cdk 78.77% <ø> (ø)

@aws-cdk-automation
Copy link
Collaborator

This PR has been in the BUILD FAILING state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 351e729
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

This PR has been in the BUILD FAILING state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.

@aws-cdk-automation
Copy link
Collaborator

This PR has been deemed to be abandoned, and will be automatically closed. Please create a new PR for these changes if you think this decision has been made in error.

@aws-cdk-automation aws-cdk-automation added the closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. label Jan 12, 2025
Copy link

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 12, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
admired-contributor [Pilot] contributed between 13-24 PRs to the CDK closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. p2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants