Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sagemaker): add EndpointConfig L2 construct #22865

Merged
merged 8 commits into from
Nov 11, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 39 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -156,3 +156,42 @@ import * as sagemaker from '@aws-cdk/aws-sagemaker';
const bucket = new s3.Bucket(this, 'MyBucket');
const modelData = sagemaker.ModelData.fromBucket(bucket, 'path/to/artifact/file.tar.gz');
```

## Model Hosting

Amazon SageMaker provides model hosting services for model deployment. Amazon SageMaker provides an
HTTPS endpoint where your machine learning model is available to provide inferences.

### Endpoint Configuration

By using the `EndpointConfig` construct, you can define a set of endpoint configuration which can be
used to provision one or more endpoints. In this configuration, you identify one or more models to
deploy and the resources that you want Amazon SageMaker to provision. You define one or more
production variants, each of which identifies a model. Each production variant also describes the
resources that you want Amazon SageMaker to provision. If you are hosting multiple models, you also
assign a variant weight to specify how much traffic you want to allocate to each model. For example,
suppose that you want to host two models, A and B, and you assign traffic weight 2 for model A and 1
for model B. Amazon SageMaker distributes two-thirds of the traffic to Model A, and one-third to
model B:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const modelA: sagemaker.Model;
declare const modelB: sagemaker.Model;

const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
instanceProductionVariants: [
{
model: modelA,
variantName: 'modelA',
initialVariantWeight: 2.0,
},
{
model: modelB,
variantName: 'variantB',
initialVariantWeight: 1.0,
},
]
});
```
64 changes: 64 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/lib/accelerator-type.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
import * as cdk from '@aws-cdk/core';

/**
* Supported Elastic Inference (EI) instance types for SageMaker instance-based production variants.
* EI instances provide on-demand GPU computing for inference.
*/
export class AcceleratorType {
/**
* ml.eia1.large
*/
public static readonly EIA1_LARGE = AcceleratorType.of('ml.eia1.large');

/**
* ml.eia1.medium
*/
public static readonly EIA1_MEDIUM = AcceleratorType.of('ml.eia1.medium');

/**
* ml.eia1.xlarge
*/
public static readonly EIA1_XLARGE = AcceleratorType.of('ml.eia1.xlarge');

/**
* ml.eia2.large
*/
public static readonly EIA2_LARGE = AcceleratorType.of('ml.eia2.large');

/**
* ml.eia2.medium
*/
public static readonly EIA2_MEDIUM = AcceleratorType.of('ml.eia2.medium');

/**
* ml.eia2.xlarge
*/
public static readonly EIA2_XLARGE = AcceleratorType.of('ml.eia2.xlarge');

/**
* Builds an AcceleratorType from a given string or token (such as a CfnParameter).
* @param acceleratorType An accelerator type as string
* @returns A strongly typed AcceleratorType
*/
public static of(acceleratorType: string): AcceleratorType {
return new AcceleratorType(acceleratorType);
}

private readonly acceleratorTypeIdentifier: string;

constructor(acceleratorType: string) {
if (cdk.Token.isUnresolved(acceleratorType) || acceleratorType.startsWith('ml.')) {
this.acceleratorTypeIdentifier = acceleratorType;
} else {
throw new Error(`instance type must start with 'ml.'; (got ${acceleratorType})`);
kaizencc marked this conversation as resolved.
Show resolved Hide resolved
}
}

/**
* Return the accelerator type as a string
* @returns The accelerator type as a string
*/
public toString(): string {
return this.acceleratorTypeIdentifier;
}
}
Loading