This repository has been archived by the owner on Nov 5, 2024. It is now read-only.

Agents for Amazon Bedrock Runtime not returning source metadata in APIs #746

Closed
tim-finnigan opened this issue May 15, 2024 · 25 comments
Labels: bedrock, bug (Something isn't working), service-api (This issue pertains to the AWS API)

Comments

@tim-finnigan
Contributor

tim-finnigan commented May 15, 2024

Original issue: boto/boto3#4124 (ref: P131777621)

@tim-finnigan tim-finnigan added bug Something isn't working service-api This issue pertains to the AWS API bedrock labels May 15, 2024
@tim-finnigan tim-finnigan self-assigned this May 15, 2024
@sidatcd

sidatcd commented May 26, 2024

@tim-finnigan Any ETA on this?

@edu2105

edu2105 commented May 27, 2024

I was going crazy searching for a parameter, flag or configuration I had missed in order to show the source metadata.
The invoke_agent documentation shows that the metadata should come within the retrievedReferences object, but in the trace-events documentation, under OrchestrationTrace --> Observation, the 'metadata' object is not present: https://docs.aws.amazon.com/bedrock/latest/userguide/trace-events.html

Thank god I found this open issue.
Hope there is an update soon 🙏 and thanks for working on this.

@adoyon23

adoyon23 commented Jun 4, 2024

I was able to resolve this by upgrading to boto3 1.34.118

@tim-finnigan
Contributor Author

Thanks for confirming @adoyon23 - I'll go ahead and close this issue. For anyone encountering this issue please update to the latest version of the AWS SDK. Here is the Boto3 CHANGELOG for reference, showing 1.34.118 as the latest version.
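
For anyone who wants to confirm which version is actually installed before retrying, here is a minimal sketch. The 1.34.118 threshold is the version reported above; the helper names (`version_tuple`, `is_at_least`, `installed_boto3_version`) are made up for this example, and the parsing only handles plain dotted numeric versions.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '1.34.118' into (1, 34, 118); non-numeric parts are skipped."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_at_least(installed: str, minimum: str) -> bool:
    """Return True when the installed version is >= the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_boto3_version():
    """Return the installed boto3 version string, or None if boto3 is absent."""
    try:
        return metadata.version("boto3")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_boto3_version()
    if current is None:
        print("boto3 is not installed")
    else:
        # 1.34.118 is the version mentioned in this thread, not an official cutoff.
        print(current, "ok" if is_at_least(current, "1.34.118") else "too old")
```

Running this inside the same environment that makes the API call (for example, the Lambda function itself) shows the version that is really in use there.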


github-actions bot commented Jun 4, 2024

This issue is now closed.

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@edu2105

edu2105 commented Jun 4, 2024

Hi @adoyon23
I've tried both 1.34.118 and the latest version, 1.34.119, but I still cannot find the metadata object within retrievedReferences when calling the client.invoke_agent method.
I can confirm that the metadata is showing when using the Knowledge Base playground but not when using the invoke_agent.

Did you have to enable or configure something else?

@sriram-aws

Hi @tim-finnigan - I am still not able to find a resolution for this issue. invoke_agent still doesn't return metadata in the output.

@tim-finnigan
Contributor Author

tim-finnigan commented Jun 5, 2024

Hi @edu2105 @sriram-aws thanks for following up. After speaking with an engineer on the Bedrock team, I was told that they are aware of this issue and are planning a fix soon. Will keep this open to track for now.

@tim-finnigan tim-finnigan added response-requested This issue requires a response to continue and removed investigating response-requested This issue requires a response to continue labels Jun 5, 2024
@sriram-aws

@tim-finnigan - Thanks for the update!

@motigors

motigors commented Jun 9, 2024

Hi,

I'm experiencing an issue with the bedrock-agent-runtime retrieve_and_generate service. When I invoke this service, the retrievedReferences (retrieve metadata) is not included in the response. This functionality worked until the middle of last week, but it has since stopped working with any version of boto3.

Please confirm whether this is related to this issue, or whether I should open a new bug report for it.
If it is the same issue, do you have an estimated timeline for a fix?

python: 3.11
boto3: 1.34.122

Request:

import boto3

# Client for the Knowledge Base runtime APIs
client = boto3.client('bedrock-agent-runtime')

response = client.retrieve_and_generate(
    input={
        'text': 'some question'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': '1234',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 3
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.1,
                    }
                }
            }
        },
    },
)

Response:

{
    "ResponseMetadata": {
        "HTTPHeaders": {
            "connection": "keep-alive",
            "content-length": "1030",
            "content-type": "application/json",
            "date": "Sun, 09 Jun 2024 13:18:00 GMT",
            "x-amzn-requestid": "1234"
        },
        "HTTPStatusCode": 200,
        "RequestId": "1234",
        "RetryAttempts": 0
    },
    "citations": [
        {
            "generatedResponsePart": {
                "textResponsePart": {
                    "span": {
                        "end": 411,
                        "start": 0
                    },
                    "text": "generated response"
                }
            },
            "retrievedReferences": []
        }
    ],
    "output": {
        "text": "generated response"
    },
    "sessionId": "1234"
}
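
To spot this symptom programmatically, a small check over the response dict can help. This is only a sketch based on the response shape pasted above; `citations_without_references` is a hypothetical helper name, not part of boto3.

```python
def citations_without_references(response: dict) -> list:
    """Return the indexes of citations whose retrievedReferences is empty or missing."""
    return [
        index
        for index, citation in enumerate(response.get("citations", []))
        if not citation.get("retrievedReferences")
    ]


# Trimmed copy of the response shown above.
sample_response = {
    "citations": [
        {
            "generatedResponsePart": {
                "textResponsePart": {
                    "span": {"start": 0, "end": 411},
                    "text": "generated response",
                }
            },
            "retrievedReferences": [],
        }
    ],
    "output": {"text": "generated response"},
    "sessionId": "1234",
}

empty_citations = citations_without_references(sample_response)
print(empty_citations)
```

A non-empty result means at least one citation came back without any source references, which matches the behaviour described in this comment.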

@tim-finnigan
Contributor Author

@motigors that appears to be a related issue. No timeline on addressing this but the Bedrock team informed me that they are working on it.

@motigors

Thanks for the update

@ncagle

ncagle commented Jun 20, 2024

@tim-finnigan I'm having a similar issue, but with the bedrock-agent-runtime retrieve method. It seems like it's related.

I'm glad to see this is being worked on! If there are any updates, it'd be great to hear. If my issue is unrelated, I can open a new issue if there isn't a relevant one already.

Here's some info about what I was doing in case it helps. I haven't been able to use a RetrievalFilter in the vectorSearchConfiguration either. But I have confirmed through testing in the Bedrock console that the metadata exists and works correctly for filtering.

Python 3.12
boto3: 1.34.42

response = bedrock_agent_runtime_client.retrieve(
    knowledgeBaseId="ABC1234567",
    # nextToken="",
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 6,
        }
    },
    retrievalQuery={
        "text": "some information"
    }
)

Response:

{
    'ResponseMetadata': {
        'HTTPHeaders': {
            'connection': 'keep-alive',
            'content-length': '16283',
            'content-type': 'application/json',
            'date': 'Thu, 20 Jun 2024 19:38:59 GMT',
            'x-amzn-requestid': 'totally-real-id'
        },
        'HTTPStatusCode': 200,
        'RequestId': 'totally-real-id',
        'RetryAttempts': 0
    },
    'retrievalResults': [
        {'content': {'text': 'relevant stuff and things'},
         'location': {'s3Location': {'uri': 's3://bucket/subdirectory/document_1.pdf'},
                      'type': 'S3'},
         'score': 0.7049113},
        {'content': {'text': 'relevant stuff and things'},
         'location': {'s3Location': {'uri': 's3://bucket/subdirectory/document_2.pdf'},
                      'type': 'S3'},
         'score': 0.704336},
        {'content': {'text': 'relevant stuff and things'},
         'location': {'s3Location': {'uri': 's3://bucket/subdirectory/document_3.pdf'},
                      'type': 'S3'},
         'score': 0.700758},
        {'content': {'text': 'relevant stuff and things'},
         'location': {'s3Location': {'uri': 's3://bucket/subdirectory/document_4.pdf'},
                      'type': 'S3'},
         'score': 0.70058656},
        {'content': {'text': 'relevant stuff and things'},
         'location': {'s3Location': {'uri': 's3://bucket/subdirectory/document_5.pdf'},
                      'type': 'S3'},
         'score': 0.7002422},
        {'content': {'text': 'relevant stuff and things'},
         'location': {'s3Location': {'uri': 's3://bucket/subdirectory/document_6.pdf'},
                      'type': 'S3'},
         'score': 0.6993249}
    ]
}
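
For reference, a metadata filter on the retrieve call is passed under vectorSearchConfiguration as a `filter` entry. The sketch below only builds the keyword arguments; `build_retrieve_kwargs` is a hypothetical helper, and the `department` / `finance` key and value are placeholders for whatever attributes exist in your own metadata files. An older SDK may reject the `filter` parameter, which could explain why it fails on some versions.

```python
def build_retrieve_kwargs(knowledge_base_id, query, filter_key, filter_value,
                          number_of_results=6):
    """Build keyword arguments for retrieve() with a single 'equals' metadata filter."""
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": number_of_results,
                # RetrievalFilter: keep only chunks whose metadata key equals the value.
                "filter": {"equals": {"key": filter_key, "value": filter_value}},
            }
        },
    }


# Placeholder key/value; replace with attributes from your own metadata.
kwargs = build_retrieve_kwargs("ABC1234567", "some information", "department", "finance")
print(kwargs)
```

The result can then be passed to the client as `bedrock_agent_runtime_client.retrieve(**kwargs)`.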

@wadebev11

I've run into this bug as well. Any updates?

@tim-finnigan
Contributor Author

We were informed by the Bedrock team that this should be fixed now. Please try updating to the latest version of your SDK and let us know if you are still running into any issues.

@tim-finnigan tim-finnigan added the closing-soon This issue will be closed soon label Jul 11, 2024
@ncagle

ncagle commented Jul 11, 2024

I'm still having the issue with the retrieve function not returning any metadata.
I'm using a Python 3.12 Lambda function with boto3 version 1.34.42.

When I check the retrieval results, I'm expecting to get something like this (from the documentation).

{
   "nextToken": "string",
   "retrievalResults": [ 
      { 
         "content": { 
            "text": "string"
         },
         "location": { 
            "confluenceLocation": { 
               "url": "string"
            },
            "s3Location": { 
               "uri": "string"
            },
            "salesforceLocation": { 
               "url": "string"
            },
            "sharePointLocation": { 
               "url": "string"
            },
            "type": "string",
            "webLocation": { 
               "url": "string"
            }
         },
         "metadata": { 
            "string" : JSON value 
         },
         "score": number
      }
   ]
}

But the only keys in the retrieval results I'm getting are "content", "location", and "score".
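
A quick way to list which results came back without a "metadata" key is sketched below. It assumes the response shape shown earlier in this thread; `results_missing_metadata` is a hypothetical helper name.

```python
def results_missing_metadata(response: dict) -> list:
    """Return the S3 URIs of retrieval results that have no 'metadata' key."""
    missing = []
    for result in response.get("retrievalResults", []):
        if "metadata" not in result:
            uri = result.get("location", {}).get("s3Location", {}).get("uri", "<unknown>")
            missing.append(uri)
    return missing


# Two made-up results: the first lacks metadata, the second includes it.
sample_response = {
    "retrievalResults": [
        {
            "content": {"text": "relevant stuff and things"},
            "location": {"s3Location": {"uri": "s3://bucket/a.pdf"}, "type": "S3"},
            "score": 0.70,
        },
        {
            "content": {"text": "relevant stuff and things"},
            "location": {"s3Location": {"uri": "s3://bucket/b.pdf"}, "type": "S3"},
            "metadata": {"example-key": "example-value"},
            "score": 0.69,
        },
    ]
}

print(results_missing_metadata(sample_response))
```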

The other day I created a new knowledge base using Pinecone for the vector database, hoping that might give a different response, but it was the same. When I check the vector database in Pinecone, though, I can see that all of the metadata from the metadata files I created has been included.

image

Let me know if there's any more information I can provide that might help.

@tim-finnigan
Contributor Author

Hi @ncagle, can you update to the latest Boto3 version (1.34.143 per the CHANGELOG) and let us know if this is still an issue?

@tim-finnigan tim-finnigan added response-requested This issue requires a response to continue and removed closing-soon This issue will be closed soon labels Jul 11, 2024
@ncagle

ncagle commented Jul 11, 2024

I'll see if I can manually upgrade the boto3 version in the lambda function. It's currently running the latest execution environment supported version of the SDK as far as I know. From what I've seen, there's not an official way to upgrade the package within Lambda. If you know of a way, please let me know.

@tim-finnigan
Contributor Author

Here is documentation on bundling Python dependencies like Boto3 in Lambda: https://docs.aws.amazon.com/lambda/latest/dg/python-package.html. Also this Knowledge Center post: https://repost.aws/knowledge-center/lambda-python-runtime-errors.

@github-actions github-actions bot removed the response-requested This issue requires a response to continue label Jul 12, 2024
@ncagle

ncagle commented Jul 12, 2024

@tim-finnigan It worked! That was my first time creating a deployment package with dependencies as a layer, but that Knowledge Center post saved me. I'm correctly getting all of the metadata attributes in the retrieve response now.

Is there somewhere that tracks when the supported version of the SDK for Lambda functions will be updated? For now, I can just leave this layer in place until it catches up with the latest version of the SDK. Thanks for your help!

@edu2105

edu2105 commented Jul 12, 2024

Hi @tim-finnigan thanks for the update!
I have updated boto3 to the latest version and ran my process, but unfortunately, when calling the invoke_agent method, I'm still not able to find the metadata object within retrievedReferences. Maybe it is working for other methods but not for invoke_agent at the moment?

This is the log showing that I'm using the latest boto3 version
image

And this is the part of the log where you can see that it only returns the content and location objects
image

[...]
"trace": {
    "orchestrationTrace": {
        "observation": {
            "knowledgeBaseLookupOutput": {
                "retrievedReferences": [
                    {
                        "content": {
                            "text": "SCRUBBED"
                        },
                        "location": {
                            "s3Location": {
                                "uri": "s3://xxxxx/gdrive/xxxx.pdf"
                            },
                            "type": "S3"
                        }
                    }
                ]
            }
        }
    }
}
}
[...]
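
To check a trace for this without reading the log by eye, something like the following sketch could be used. It assumes the event has the shape shown in the log excerpt above (a dict with a "trace" key holding "orchestrationTrace"); depending on how the event stream is consumed there may be extra nesting around it. `reference_keys` is a hypothetical helper name.

```python
def reference_keys(event: dict) -> list:
    """Return the sorted key names of each retrieved reference found in a trace event."""
    observation = (
        event.get("trace", {})
        .get("orchestrationTrace", {})
        .get("observation", {})
    )
    references = observation.get("knowledgeBaseLookupOutput", {}).get(
        "retrievedReferences", []
    )
    return [sorted(reference.keys()) for reference in references]


# Same shape as the log excerpt above.
sample_event = {
    "trace": {
        "orchestrationTrace": {
            "observation": {
                "knowledgeBaseLookupOutput": {
                    "retrievedReferences": [
                        {
                            "content": {"text": "SCRUBBED"},
                            "location": {
                                "s3Location": {"uri": "s3://xxxxx/gdrive/xxxx.pdf"},
                                "type": "S3",
                            },
                        }
                    ]
                }
            }
        }
    }
}

print(reference_keys(sample_event))
```

If "metadata" does not appear in any of the returned key lists, the agent trace is not carrying the source metadata.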

@malikalimoekhamedov

Friends, the fact that the metadata is being returned now is fabulous. However, my Bedrock knowledge base S3 bucket is populated with documents that have our custom metadata, such as x-amz-meta-doi, or x-amz-meta-ncbi_search_term. These, however, are not being returned by the TypeScript SDK. Is there a reason for this? What can I do about it?

@edu2105

edu2105 commented Jul 12, 2024

Hi @malikalimoekhamedov, from my understanding, the only metadata returned in the response is the set of attributes defined in the .metadata.json document metadata files uploaded to your S3 bucket alongside the documents.
Certain conditions apply, and they are described here: https://docs.aws.amazon.com/bedrock/latest/userguide/s3-data-source-connector.html#configuration-s3-connector
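
As a rough sketch of what such a metadata file looks like: as far as I understand the linked documentation, the file sits next to the document, is named after it with a `.metadata.json` suffix, and holds a top-level `metadataAttributes` object. Please verify the exact format and limits against that page. `build_metadata_sidecar` is a hypothetical helper, and the `doi` value is a placeholder.

```python
import json


def build_metadata_sidecar(document_key: str, attributes: dict) -> tuple:
    """Return (sidecar object key, JSON body) for a document's metadata file."""
    # Assumed naming convention: <document key> + ".metadata.json"
    sidecar_key = f"{document_key}.metadata.json"
    # Assumed layout: attributes nested under "metadataAttributes"
    body = json.dumps({"metadataAttributes": attributes}, indent=2)
    return sidecar_key, body


# Placeholder document key and attribute values.
key, body = build_metadata_sidecar("papers/article.pdf", {"doi": "10.1000/example"})
print(key)
print(body)
```

The returned key and body would then be uploaded to the same bucket and prefix as the document, followed by a re-sync of the data source.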

For the invoke_agent() method, I was not able to find them even using the latest boto3 version. If you are using another method, maybe you will be able to get your metadata in the response.

@tim-finnigan
Contributor Author

Since the original issue was addressed here (and that was confirmed by the Bedrock team) I'm going to close this as resolved. The issue involving retrievedReferences is being tracked in #777. Please try using https://repost.aws/ to ask about questions involving service APIs.

