🔱 SageMaker Jumpstart #6355

Open
4 tasks
julialawrence opened this issue Dec 17, 2024 · 2 comments
Labels
spike investigation, discovery into a thing

Comments

@julialawrence
Contributor

Context

This came out of the following request made to Cloud Platform: ministryofjustice/cloud-platform#6549

Proposal

We experiment with allowing users to create SageMaker JumpStart instances and endpoints using Terraform in our Compute production account.

Outcomes:

  1. Determine whether it's possible
  2. If so, work out the best way to offer this
  3. Test common scenarios: SageMaker in Compute with the bucket in Cloud Platform; SageMaker in Compute with the bucket in APDP
  4. Consider model promotion (dev, test, prod)
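
For the scenarios in outcome 3, one thing the PoC would likely need is a bucket policy that lets the SageMaker execution role reach a bucket owned elsewhere. The following is a minimal sketch, assuming the bucket and the role live in different AWS accounts; the helper name, bucket name, account ID and role name are all hypothetical and do not come from this issue.

```python
import json


def cross_account_bucket_policy(bucket, role_arn):
    """Build a bucket policy letting one IAM role list, read and write objects.

    Assumption: `role_arn` is the SageMaker execution role in another account.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSageMakerRoleObjectAccess",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetObject", "s3:PutObject"],
                # Object-level actions apply to keys inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Sid": "AllowSageMakerRoleList",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:ListBucket",
                # ListBucket applies to the bucket itself, not its keys.
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    policy = cross_account_bucket_policy(
        bucket="example-cloud-platform-bucket",  # hypothetical bucket name
        role_arn="arn:aws:iam::111122223333:role/example-sagemaker-execution",  # hypothetical role
    )
    print(json.dumps(policy, indent=2))
```

A bucket policy alone is not sufficient for cross-account access: the role would also need a matching identity policy in its own account, and the same document could be rendered from Terraform rather than Python.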

Spike requirements

Ops, 2 Days

Definition of Done

Example

  • Solution has been designed / detailed
  • New issue details the follow-up tasks
  • PoC completed
  • PoC findings captured / documented
@tom-webber
Contributor

The user is working on the solution, and we will support their efforts.

@tom-webber
Contributor

A JumpStart model has been accessed from OpenSearch in Cloud Platform (thread). A separate async model was deployed and is accessible using this script
(I'm counting the fact that the endpoint generates text as a success, even though I'm certain I'm invoking it wrong):

import json
from time import sleep
from urllib.parse import urlparse

import boto3

bucket = "mojap-compute-sagemaker-jumpstart-development"
input_key = "input/query.json"
endpoint = "test-ep-gqzgzrxs"

s3_client = boto3.client("s3", region_name="eu-west-2")
# Endpoints are invoked through the "sagemaker-runtime" client.
sagemaker_runtime_client = boto3.client("sagemaker-runtime", region_name="eu-west-2")

# Async endpoints read the request payload from S3, so upload it first.
s3_client.put_object(
    Bucket=bucket,
    Key=input_key,
    Body=json.dumps({"inputs": "What is the capital of France?"}),
)

response = sagemaker_runtime_client.invoke_endpoint_async(
    EndpointName=endpoint,
    InputLocation=f"s3://{bucket}/{input_key}",
    InvocationTimeoutSeconds=120,
    ContentType="application/json",
)

# The result is written to response["OutputLocation"] once inference completes.
output_location = urlparse(response["OutputLocation"])
output_key = output_location.path.lstrip("/")

print(output_key)
# Crude fixed wait: the output object may not exist yet if inference takes longer.
sleep(5)

output = s3_client.get_object(
    Bucket=output_location.netloc,
    Key=output_key,
)

print(output["Body"].read().decode("utf-8"))
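
The script above waits a fixed five seconds before reading the result, which fails if inference takes longer. A possible alternative is to poll the output location and also check the failure location returned by `invoke_endpoint_async`. This is only a sketch: the helper names (`split_s3_uri`, `wait_for_async_result`) and the default timings are assumptions, and it assumes the caller is allowed to list the bucket so that a missing object is reported as `NoSuchKey` rather than as an access error.

```python
import time
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def _read_if_present(s3_client, uri):
    """Return the object's text, or None if the key does not exist yet."""
    bucket, key = split_s3_uri(uri)
    try:
        obj = s3_client.get_object(Bucket=bucket, Key=key)
    except Exception as error:
        # boto3 raises botocore ClientError, which carries a `response` dict;
        # anything other than a missing key is re-raised unchanged.
        code = getattr(error, "response", {}).get("Error", {}).get("Code")
        if code == "NoSuchKey":
            return None
        raise
    return obj["Body"].read().decode("utf-8")


def wait_for_async_result(s3_client, response, timeout=120, interval=5, sleep=time.sleep):
    """Poll S3 until the async inference output or failure object appears."""
    attempts = max(1, timeout // interval)
    for _ in range(attempts):
        output = _read_if_present(s3_client, response["OutputLocation"])
        if output is not None:
            return output
        failure_uri = response.get("FailureLocation")
        if failure_uri:
            failure = _read_if_present(s3_client, failure_uri)
            if failure is not None:
                raise RuntimeError(f"async inference failed: {failure}")
        sleep(interval)
    raise TimeoutError(f"no result at {response['OutputLocation']} after {timeout}s")
```

With the script's own names this would be called as `print(wait_for_async_result(s3_client, response))` in place of the `sleep(5)` and `get_object` steps. The `sleep` parameter exists only so the wait can be stubbed out when testing.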

@tom-webber tom-webber moved this from 🚀 In Progress to 🛂 In Review in Analytical Platform Jan 15, 2025
Projects
Status: 🛂 In Review
Development

No branches or pull requests

3 participants