
Bedrock: Agent construct fails with Claude 3.5 v2 & Haiku 3.5 #796

Closed
1 task done
mccauleyp opened this issue Nov 12, 2024 · 7 comments
Labels
backlog bug Something isn't working

Comments

@mccauleyp

mccauleyp commented Nov 12, 2024

Describe the bug

Attempting to use Claude Sonnet 3.5 v2 or Haiku 3.5 with the Agent construct produces a successful deployment but a broken agent that returns "Internal server error" responses. These models must be invoked through an inference profile, but the construct provisions them in "on-demand" mode, which they don't support.

Expected Behavior

Should be able to deploy agents using these models.

Current Behavior

Agent deployment succeeds but produces "Internal server error" responses.

Reproduction Steps

Create an agent using Sonnet 3.5 v2 or Haiku 3.5, e.g.:

bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0

Possible Solution

I am working around the issue by using the CDK escape hatch to override the CloudFormation foundation model property, which might provide some hints as to how the construct could be modified:

from aws_cdk import Stack, aws_bedrock, aws_iam
from cdklabs.generative_ai_cdk_constructs import bedrock

AGENT_MODEL = bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0
AGENT_INSTRUCTION = "You are a dog, always respond with 'woof woof'."
AGENT_ALIAS_VERSION = "1"


class BedrockResources:
    def __init__(self, scope: Stack) -> None:
        stage_name = get_stage_name(scope)

        agent_name = "my-agent"
        self.agent = bedrock.Agent(
            scope,
            "Agent",
            name=agent_name,
            instruction=AGENT_INSTRUCTION,
            foundation_model=AGENT_MODEL,
        )
        self._enable_inference_profile(
            scope=scope, agent=self.agent, model=AGENT_MODEL
        )

        self.agent_alias = self.agent.add_alias(
            alias_name=f"{agent_name}-v{AGENT_ALIAS_VERSION}"
        )

    @staticmethod
    def _enable_inference_profile(
        scope: Stack, agent: bedrock.Agent, model: bedrock.BedrockFoundationModel
    ) -> None:
        """Enable models that require or support inference profiles.

        Inference profiles are used for cross-region inference, which improves
        performance by enabling load balancing of requests across regions. Certain
        models like Claude Sonnet 3.5 v2 and Haiku 3.5 must use inference profiles

        This is not yet supported by the Agent CDK construct, so we can override the
        configuration on underlying CloudFormation property.
        """
        model_str = model.to_string()
        inference_profile_arn = f"arn:aws:bedrock:{scope.region}:{scope.account}:inference-profile/us.{model_str}"  # noqa: E501
        foundation_model_arn = f"arn:aws:bedrock:*::foundation-model/{model_str}"

        invoke_inference_profile_policy = aws_iam.Policy(
            scope,
            f"InferenceProfilePolicy{agent.name}",
            statements=[
                aws_iam.PolicyStatement(
                    actions=["bedrock:InvokeModel*", "bedrock:GetInferenceProfile"],
                    resources=[foundation_model_arn, inference_profile_arn],
                )
            ],
            roles=[agent.role],
        )

        cfn_agent: aws_bedrock.CfnAgent = agent.node.find_child("Agent")  # type:ignore[assignment]
        cfn_agent.foundation_model = inference_profile_arn
        cfn_agent.node.add_dependency(invoke_inference_profile_policy)
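For reference, the inference-profile ARN built in the escape hatch above is just the on-demand model ID with a geographic prefix, scoped to the deploying account. Here's a standalone sketch of that derivation (the account ID is a placeholder, the `us.` prefix matches the us-east-1 deployment in this issue, and the model ID literal is what I believe `model.to_string()` returns for Sonnet 3.5 v2):

```python
def inference_profile_arn(region: str, account: str, model_id: str) -> str:
    # Cross-region inference profiles reuse the on-demand model ID,
    # prefixed with a geographic prefix ("us." for US-region profiles).
    return f"arn:aws:bedrock:{region}:{account}:inference-profile/us.{model_id}"


arn = inference_profile_arn(
    "us-east-1", "123456789012", "anthropic.claude-3-5-sonnet-20241022-v2:0"
)
print(arn)
# -> arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0
```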

Additional Information/Context

No response

CDK CLI Version

2.166.0

Framework Version

0.1.279

Node.js Version

v20.11.0

OS

OSX

Language

Python

Language Version

3.12

Region experiencing the issue

us-east-1

Code modification

No

Other information

No response

Service quota

  • I have reviewed the service quotas for this construct
@krokoko
Collaborator

krokoko commented Nov 12, 2024

Thanks for reporting this issue @mccauleyp, this should be fixed when #683 is implemented

@krokoko krokoko added backlog and removed needs-triage This issue or PR still needs to be triaged. labels Nov 12, 2024
@krokoko
Collaborator

krokoko commented Nov 19, 2024

@mccauleyp are you able to perform model invocations with the permissions you provided in your code snippet above?

@mccauleyp
Author

mccauleyp commented Nov 19, 2024

@mccauleyp are you able to perform model invocations with the permissions you provided in your code snippet above?

Yep! But note that it's not just permissions that complicate using Sonnet 3.5 v2 and Haiku 3.5. If you have an Action Group, the OpenAPI schema must declare the operationId field for each endpoint, the operationId must start with an HTTP verb prefix (e.g. get_ or post_), and it must be 18 characters or fewer.

I spent a few hours yesterday debugging to find the 18-character limit; I can't find it documented anywhere in the AWS or Anthropic docs. When I opened this ticket last week, I had it working for two agents, but the actions I was using happened to have operationIds of at most 17 characters. When I tried upgrading another agent yesterday, the deployment succeeded but the agent replied with "Internal Service Error" until I brought all the operationIds within 18 characters.
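To make the constraints concrete, here's a small standalone check (a hypothetical helper, not part of any AWS library; get_ and post_ are the prefixes I verified, the other verbs are my guess):

```python
# Constraints observed empirically in this thread, not officially documented.
VERB_PREFIXES = ("get_", "post_", "put_", "delete_", "patch_")
MAX_LEN = 18  # undocumented limit, found by trial and error


def operation_id_ok(op_id: str) -> bool:
    """Return True if op_id satisfies the observed Action Group constraints."""
    return op_id.startswith(VERB_PREFIXES) and len(op_id) <= MAX_LEN


print(operation_id_ok("get_dog_status"))       # 14 chars, verb prefix -> True
print(operation_id_ok("get_dog_status_full"))  # 19 chars -> False
```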

I assume the service team should be alerted to this, but I'm not sure where best to submit a ticket. Could you communicate the character-limit issue to them?

@krokoko
Collaborator

krokoko commented Nov 19, 2024

Thanks @mccauleyp! I was asking because there is currently a bug in the console: if you create an agent using a model with CRIS and select the option to generate a new role, the generated permissions are not sufficient to invoke the model through the agent (I was using the console as a reference for adding support in this lib). I reported this issue to the service team.

Working on adding support for CRIS in #800. With the current changes I am able to support CRIS for Agents and Prompts. Note: application inference profiles are not yet supported in CloudFormation, but the code will be there and ready on our end.

Thanks for the note on the operationId, I will contact the service team and post here as soon as I have an update.

@mccauleyp
Author

Ah, yeah, I noticed the console bug too and referred to another docs page for my snippet. One other note for the service or CloudFormation team: if the Action Group schema doesn't include HTTP-verb-prefixed operationIds, the deployment fails with no meaningful error message. I figured that out by creating an agent from the console, which does tell you that the operationId doesn't meet a validation schema. Unfortunately, the validation at that point doesn't catch the 18-character limit; I got there by blind trial and error.

@krokoko
Collaborator

krokoko commented Nov 20, 2024

Hi @mccauleyp, closing this ticket as v0.1.283 (https://github.com/awslabs/generative-ai-cdk-constructs/releases/tag/v0.1.283), just released, adds support for inference profiles. The documentation has an example of how to use CRIS with an agent. For the other points you mentioned, I opened an issue with the service team and will update you as soon as I have an answer. Thank you!

@krokoko
Collaborator

krokoko commented Dec 26, 2024

@mccauleyp the service team mentioned that the issue has been fixed (deploying the two models with CRIS for agents). If you face any issues please let us know! Thank you !
