chore(knowledgebase): add support for vector type #949

Merged: 6 commits merged into awslabs:main from models_update on Feb 12, 2025

Conversation

krokoko (Collaborator) commented Feb 10, 2025

Fixes #948

  • Expose a new parameter for vector knowledge bases, allowing the configuration of the vector type for embeddings
  • Update the Vector Index construct to expose necessary fields
  • Add validation methods to prevent runtime issues
  • Add tests
  • Upgrade the base CDK version to v2.178.0
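
To illustrate the kind of validation mentioned in the list above, here is a minimal sketch of a check that rejects a vector type the embeddings model cannot produce. Every name in it (`EmbeddingsModel`, `supportedVectorTypes`, `validateVectorType`, the `example.float-only-model` id, and the enum string values) is an assumption made for illustration and may not match the code in this PR.

```typescript
// Hypothetical sketch: names and values are illustrative, not the library's API.
enum VectorType {
  FLOATING_POINT = 'FLOAT32',
  BINARY = 'BINARY',
}

interface EmbeddingsModel {
  readonly modelId: string;
  // Vector types the model can emit. In this sketch, undefined means "float only".
  readonly supportedVectorTypes?: VectorType[];
}

// Throws at synth time instead of letting the deployment fail later.
function validateVectorType(model: EmbeddingsModel, requested: VectorType): void {
  const supported = model.supportedVectorTypes ?? [VectorType.FLOATING_POINT];
  if (supported.indexOf(requested) === -1) {
    throw new Error(
      `Model ${model.modelId} does not support vector type ${requested}. ` +
        `Supported: ${supported.join(', ')}`,
    );
  }
}

// Titan Text Embeddings V2 is the model deployed with binary vectors in this PR.
const titanV2: EmbeddingsModel = {
  modelId: 'amazon.titan-embed-text-v2:0',
  supportedVectorTypes: [VectorType.FLOATING_POINT, VectorType.BINARY],
};

// Made-up model used only to exercise the failure path.
const floatOnly: EmbeddingsModel = { modelId: 'example.float-only-model' };

validateVectorType(titanV2, VectorType.BINARY); // passes silently
```

Failing in the constructor keeps the error close to the misconfigured prop, which is usually easier to diagnose than a failed deployment.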

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

krokoko (Collaborator, Author) commented Feb 11, 2025

Testing by building 2 vector indexes in a collection:

    // role creation here

    const vectorStore = new opensearchserverless.VectorCollection(this, 'KBVectors');

    vectorStore.grantDataAccess(role);

    const vectorIndex = new opensearch_vectorindex.VectorIndex(this, 'KBIndex', {
      collection: vectorStore,
      indexName: 'mysuperindex',
      vectorField: 'default-vector',
      vectorDimensions: 1024,
      precision: 'Binary',
      distanceType: 'hamming',
      mappings: [
        {
          mappingField: 'AMAZON_BEDROCK_TEXT_CHUNK',
          dataType: 'text',
          filterable: true,
        },
        {
          mappingField: 'AMAZON_BEDROCK_METADATA',
          dataType: 'text',
          filterable: false,
        },
      ],
    });

    vectorIndex.node.addDependency(vectorStore);

    // floating point embeddings
    const vectorIndex2 = new opensearch_vectorindex.VectorIndex(this, 'KBIndex2', {
      collection: vectorStore,
      indexName: 'mysuperindex2',
      vectorField: 'default-vector2',
      vectorDimensions: 1024,
      precision: 'float',
      distanceType: 'l2',
      mappings: [
        {
          mappingField: 'AMAZON_BEDROCK_TEXT_CHUNK',
          dataType: 'text',
          filterable: true,
        },
        {
          mappingField: 'AMAZON_BEDROCK_METADATA',
          dataType: 'text',
          filterable: false,
        },
      ],
    });

    vectorIndex2.node.addDependency(vectorStore);

Works as expected
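
For reference, the two indexes above differ only in `precision` and `distanceType`. The sketch below shows how such props could translate into an OpenSearch k-NN index body, with the compatibility checks that would prevent an invalid combination. It assumes that `filterable` maps to the OpenSearch `index` flag and that the `hnsw` method with the `faiss` engine is used; `buildIndexBody` and the interfaces are hypothetical and are not taken from the construct's source.

```typescript
// Hypothetical sketch of the props-to-index-body translation.
interface FieldMapping {
  mappingField: string;
  dataType: string;
  filterable: boolean;
}

interface IndexProps {
  vectorField: string;
  vectorDimensions: number;
  precision: string; // 'float' or 'Binary', as in the test code above
  distanceType: string; // 'l2' or 'hamming', as in the test code above
  mappings: FieldMapping[];
}

interface IndexBody {
  settings: { index: { knn: boolean } };
  mappings: { properties: Record<string, Record<string, unknown>> };
}

function buildIndexBody(props: IndexProps): IndexBody {
  const isBinary = props.precision.toLowerCase() === 'binary';
  const isHamming = props.distanceType === 'hamming';

  // OpenSearch uses the hamming space type for binary vectors only.
  if (isBinary !== isHamming) {
    throw new Error('Binary precision and hamming distance must be used together');
  }
  // Binary vectors are packed into bytes, so the dimension must be a multiple of 8.
  if (isBinary && props.vectorDimensions % 8 !== 0) {
    throw new Error('Binary vectors need a dimension that is a multiple of 8');
  }

  const properties: Record<string, Record<string, unknown>> = {};
  properties[props.vectorField] = {
    type: 'knn_vector',
    dimension: props.vectorDimensions,
    data_type: isBinary ? 'binary' : 'float',
    method: { name: 'hnsw', engine: 'faiss', space_type: props.distanceType },
  };
  for (const m of props.mappings) {
    // Assumption: a non-filterable field is stored but not indexed.
    properties[m.mappingField] = { type: m.dataType, index: m.filterable };
  }
  return { settings: { index: { knn: true } }, mappings: { properties } };
}

const binaryBody = buildIndexBody({
  vectorField: 'default-vector',
  vectorDimensions: 1024,
  precision: 'Binary',
  distanceType: 'hamming',
  mappings: [
    { mappingField: 'AMAZON_BEDROCK_TEXT_CHUNK', dataType: 'text', filterable: true },
    { mappingField: 'AMAZON_BEDROCK_METADATA', dataType: 'text', filterable: false },
  ],
});
```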

krokoko (Collaborator, Author) commented Feb 11, 2025

Deployed the Bedrock agent sample and tested it as is to check for regressions. The sample works as expected.
krokoko (Collaborator, Author) commented Feb 11, 2025

Updating and deploying the bedrock agent sample by changing the vector type to binary:

    ...
    const kb = new bedrock.VectorKnowledgeBase(this, 'KB', {
      embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V2_1024,
      vectorType: bedrock.VectorType.BINARY,
      instruction: 'Use this knowledge base to answer questions about books. ' +
        'It contains the full text of novels. Please quote the books to explain your answers.',
    });
    ...

Correctly deployed
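
As a rough picture of what the `vectorType` prop ends up controlling, the sketch below builds the vector knowledge base configuration fragment that carries the embedding data type. The property names follow my understanding of the CloudFormation `AWS::Bedrock::KnowledgeBase` resource; `buildVectorKbConfiguration` is a hypothetical helper, the region in the ARN is arbitrary, and the construct's real rendering may differ.

```typescript
// Hypothetical helper: shows the shape of the configuration, not the construct's code.
type EmbeddingDataType = 'FLOAT32' | 'BINARY';

interface VectorKbConfiguration {
  embeddingModelArn: string;
  embeddingModelConfiguration: {
    bedrockEmbeddingModelConfiguration: {
      dimensions: number;
      embeddingDataType: EmbeddingDataType;
    };
  };
}

function buildVectorKbConfiguration(
  embeddingModelArn: string,
  dimensions: number,
  embeddingDataType: EmbeddingDataType,
): VectorKbConfiguration {
  return {
    embeddingModelArn,
    embeddingModelConfiguration: {
      bedrockEmbeddingModelConfiguration: { dimensions, embeddingDataType },
    },
  };
}

// Titan Text Embeddings V2 with 1024 dimensions and binary vectors,
// mirroring the sample change above. The region is an arbitrary example.
const kbConfig = buildVectorKbConfiguration(
  'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0',
  1024,
  'BINARY',
);
```

The vector index backing the knowledge base has to use a matching precision, which is why the binary index earlier in this thread pairs `Binary` with `hamming`.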
krokoko marked this pull request as ready for review February 11, 2025 23:10

dineshSajwan (Contributor) left a comment

LGTM

mergify bot (Contributor) commented Feb 12, 2025

This pull request has been removed from the queue for the following reason: pull request dequeued.

Pull request #949 has been dequeued. The pull request could not be merged. This could be related to an activated branch protection or ruleset rule that prevents us from merging. (details: You're not authorized to push to this branch. Visit https://docs.github.com/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/about-protected-branches for more information.)

You should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it.

If you want to requeue this pull request, you need to post a comment with the text: @mergifyio requeue

krokoko merged commit 2e8d5cc into awslabs:main Feb 12, 2025 (12 of 15 checks passed)
krokoko deleted the models_update branch February 12, 2025 16:37
Successfully merging this pull request may close these issues.

Bedrock Vector Knowledge Base: support embedding data type
4 participants