Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: register_bedrock now supports arn #118

Merged
merged 5 commits into from
Feb 3, 2025
Merged

feat: register_bedrock now supports arn #118

merged 5 commits into from
Feb 3, 2025

Conversation

rsharath
Copy link
Contributor

@rsharath rsharath commented Feb 1, 2025

This branch has the following changes:

  1. If the bedrock request has an arn, then we attempt to use the arn to lookup the modelID
  2. If the model has a date embedded in it, then we attempt to remove it (makes route management a lot easier)

import httpx

import re
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP convention for import


if arn:
# Retrieve the inference profile using the ARN.
get_profile_response = self.bedrock_client.get_inference_profile(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could save these responses or mapping in javelin-client, ARN aren't going to change once they are created in AWS ecosystem. What do you think?


# Update the request URL to use the Javelin endpoint.
# Replace the scheme and netloc based on self.base_url while preserving path and query.
new_netloc = urlparse(self.base_url).netloc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect changing inference profile ARN with model id could break their inference profile output... like
Miss cost optimization
Lack monitoring and metrics or whatever application inference has to offer.

what do you say?

rsharath and others added 2 commits February 2, 2025 14:01
…d. Fail silently in case of error (this needs to be enhanced to add async tracing). 2. added logic to pass a default_route_name for bedrock models optionally when registering bedrock_client, 3. added logic to always etract the model and set the x-javlein-model header (needed for model_spec in case if are not rewriting the url and we get an arn instead of model id
@ankumar ankumar merged commit 0394bcf into main Feb 3, 2025
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants