[Bug]: BadRequest: 400 Error in Query Engine Using VertexAIVectorStore #17595

Open
naseemshah opened this issue Jan 23, 2025 · 3 comments
Labels: bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

@naseemshah

Bug Description

When using the VectorStoreIndex with VertexAIVectorStore, querying the index with query_engine.query() results in a BadRequest: 400 error. The error message indicates a problem with the findNeighbors API call, specifically an unrecognized parameter %24alt.

However, creating the VectorStoreIndex directly from the nodes (passing the nodes parameter instead of going through the Vertex AI vector store) avoids the issue and works as expected.
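For comparison, here is a minimal sketch of the path that works: building the index straight from the pre-embedded nodes. The helper name `build_in_memory_index` is hypothetical and only wraps the `VectorStoreIndex(nodes=...)` constructor; `nodes` and `embed_model` are assumed to be created as in Steps to Reproduce.

```python
def build_in_memory_index(nodes, embed_model):
    """Build a VectorStoreIndex from pre-embedded nodes, without a Vertex AI store."""
    # Imported lazily so the helper can be defined even where llama_index is absent.
    from llama_index.core import VectorStoreIndex

    return VectorStoreIndex(nodes=nodes, embed_model=embed_model)


# Usage, assuming `nodes` and `embed_model` exist as in Steps to Reproduce:
# query_engine = build_in_memory_index(nodes, embed_model).as_query_engine()
# print(query_engine.query("what does people say about hdmi cables?").response)
```

This keeps the vectors in the default in-memory store, so it never calls findNeighbors and says nothing about the remote endpoint itself.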

# setup storage
vector_store = VertexAIVectorStore(
    project_id=PROJECT_ID,
    region=REGION,
    index_id=vs_index.resource_name,
    endpoint_id=vs_endpoint.resource_name,
    gcs_bucket_name=GCS_BUCKET_NAME,
    credentials_path="service-account.json"
)


# set storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

vector_store.add(nodes)

# Create index for the review nodes
vector_index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model
)
# Create query engine
query_engine = vector_index.as_query_engine()


query = "what does people say about hdmi cables?"

# Perform the query
response = query_engine.query(query)

# Print the response
print(f"Response:")
print("-" * 80)
print(response.response)

The error:

---------------------------------------------------------------------------
BadRequest                                Traceback (most recent call last)
Cell In[40], line 5
      2 query = "what does people say about hdmi cables?"
      4 # Perform the query
----> 5 response = query_engine.query(query)
      7 # Print the response
      8 print(f"Response:")

File ~/coding/brandvoice-rag-test/notebooks/.venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py:321, in Dispatcher.span.<locals>.wrapper(func, instance, args, kwargs)
    318             _logger.debug(f"Failed to reset active_span_id: {e}")
    320 try:
--> 321     result = func(*args, **kwargs)
    322     if isinstance(result, asyncio.Future):
    323         # If the result is a Future, wrap it
    324         new_future = asyncio.ensure_future(result)

File ~/coding/brandvoice-rag-test/notebooks/.venv/lib/python3.10/site-packages/llama_index/core/base/base_query_engine.py:52, in BaseQueryEngine.query(self, str_or_query_bundle)
     50     if isinstance(str_or_query_bundle, str):
     51         str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 52     query_result = self._query(str_or_query_bundle)
     53 dispatcher.event(
     54     QueryEndEvent(query=str_or_query_bundle, response=query_result)
     55 )
     56 return query_result
...
File ~/coding/brandvoice-rag-test/notebooks/.venv/lib/python3.10/site-packages/google/cloud/aiplatform_v1beta1/services/match_service/transports/rest.py:603
--> 603     raise core_exceptions.from_http_response(response)
    605 # Return the response
    606 resp = match_service.FindNeighborsResponse()

BadRequest: 400 POST https://1641463975.us-central1-556541786465.vdb.vertexai.goog/v1beta1/projects/556541786465/locations/us-central1/indexEndpoints/1505992280471764992:findNeighbors?%24alt=json%3Benum-encoding%3Dint: Could not find field "%24alt" in the type "google.cloud.aiplatform.v1beta1.FindNeighborsRequest".

Version

0.12.12

Steps to Reproduce

from llama_index.core import (
    StorageContext,
    Settings,
    VectorStoreIndex,
)
from llama_index.core.schema import TextNode
from llama_index.core.vector_stores.types import (
    MetadataFilters,
    MetadataFilter,
    FilterOperator,
)
from llama_index.llms.vertex import Vertex
from llama_index.embeddings.vertex import VertexTextEmbedding
from llama_index.vector_stores.vertexaivectorsearch import VertexAIVectorStore
from google.cloud import aiplatform  # required by the MatchingEngine* calls below

vs_index = aiplatform.MatchingEngineIndex(index_name="<redacted>")

vs_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name="<redacted>"
)

embed_model = VertexTextEmbedding(
    model_name="text-multilingual-embedding-002",
    project=PROJECT_ID,
    location=REGION,
    credentials=credentials
)

vertex_gemini = Vertex(
    model="gemini-1.5-pro",
    context_window=100000,
    temperature=0,
    additional_kwargs={},
)

Settings.llm = vertex_gemini
Settings.embed_model = embed_model

nodes = []
for review in reviews:
    text = review['text']
    try:
        embedding = embed_model.get_text_embedding(text)
        if embedding is None:
            print(f"Embedding is None for Review ID: {review['id']}")
            continue
    except Exception as e:
        print(f"Error generating embedding for Review ID {review['id']}: {e}")
        continue
    metadata = {
        "review_id": review['id'],
        "listing_id": review["listing_id"],
        "review_rating": review["rating"],
        "review_is_verified": review["is_verified"],
        "review_language_code": review["language_code"],
    }
    node = TextNode(text=text, embedding=embedding, metadata=metadata)
    nodes.append(node)
print(f"Created {len(nodes)} TextNode objects with embeddings.")

vector_store = VertexAIVectorStore(
    project_id=PROJECT_ID,
    region=REGION,
    index_id=vs_index.resource_name,
    endpoint_id=vs_endpoint.resource_name,
    gcs_bucket_name=GCS_BUCKET_NAME,
    credentials_path="service-account.json"
)

vector_store.add(nodes)

vector_index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model
)

query_engine = vector_index.as_query_engine()

query = "what does people say about hdmi cables?"

response = query_engine.query(query)

print(f"Response:")
print("-" * 80)
print(response.response)
print("-" * 80)

Relevant Logs/Tracebacks

@naseemshah added the bug and triage labels on Jan 23, 2025

dosubot bot commented Jan 23, 2025

Hey @naseemshah! How's it going? I'm here to help you with any bugs, questions, or contributions you might need assistance with. Let's tackle this together!

The BadRequest error you're encountering comes from the %24alt query parameter in the findNeighbors API call, which the endpoint does not recognize. %24alt is the URL-encoded form of $alt, a parameter often used in Google APIs to specify the response format, such as JSON. In your request, however, the findNeighbors endpoint rejects it: the error says the field cannot be found in FindNeighborsRequest.
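To see exactly what was sent, the query string of the failing URL can be decoded with the standard library. This is only a small sketch: the URL is copied from the error message in this issue, and the helper name `decode_query` is made up for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

# URL copied verbatim from the BadRequest message in this issue.
FAILING_URL = (
    "https://1641463975.us-central1-556541786465.vdb.vertexai.goog"
    "/v1beta1/projects/556541786465/locations/us-central1"
    "/indexEndpoints/1505992280471764992:findNeighbors"
    "?%24alt=json%3Benum-encoding%3Dint"
)


def decode_query(url: str) -> list[tuple[str, str]]:
    """Return the percent-decoded (name, value) pairs of a URL's query string."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


print(decode_query(FAILING_URL))
# [('$alt', 'json;enum-encoding=int')]
```

So the only query parameter on the request is `$alt` with the value `json;enum-encoding=int`, and that is the name the server reports it cannot find.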

Here are a few steps you can take to troubleshoot and potentially resolve this issue:

  1. Check API Documentation: Verify the expected parameters for the findNeighbors API call in the Google Cloud AI Platform documentation. Ensure that your request aligns with the documented requirements.

  2. Inspect Request Construction: Review the code where the API request is constructed in the VertexAIVectorStore class. Look for any hardcoded or default parameters that might be adding %24alt to the request. This could be happening in the utils.find_neighbors function or related utility functions.

  3. Update or Remove Parameters: If you find that %24alt is being added unnecessarily, try removing or updating this parameter in the request construction process.

  4. Library Version: Ensure that you are using a compatible version of the Google Cloud AI Platform library. Sometimes, updates or changes in the library can affect how requests are handled.

  5. Error Handling and Logging: Implement additional logging around the API request to capture more details about the request and response. This might provide further insights into what is causing the BadRequest error.

By following these steps, you should be able to identify and address the issue with the %24alt parameter in your API request.
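Steps 4 and 5 can be sketched with the standard library alone. The list of distribution names below is an assumption about which packages matter here, and the logging helper assumes the REST transport ultimately sends its requests through urllib3; neither is confirmed in this thread.

```python
import logging
from importlib.metadata import PackageNotFoundError, version

# Distribution names assumed relevant to this issue; adjust to your environment.
PACKAGES = (
    "llama-index-core",
    "llama-index-vector-stores-vertexaivectorsearch",
    "google-cloud-aiplatform",
    "google-api-core",
)


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def enable_http_debug_logging():
    """Turn on DEBUG output for urllib3 (assumed to sit under the REST transport)."""
    logging.basicConfig(level=logging.INFO)
    logging.getLogger("urllib3").setLevel(logging.DEBUG)


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Posting the printed versions alongside the traceback makes it easier to tell whether the mismatch is in the vector store integration or in the Google client libraries.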


@logan-markewich
Collaborator

The Vertex vector store integration probably hasn't been updated and is out of date. I don't have access to test it, but I would appreciate any help with a PR to update the source code.

@naseemshah
Author

@logan-markewich Thanks, I'll explore more to find the root cause and try to fix this.
