Adds Documentation for dynamic query parameters for kNN search request #7761

Merged: 12 commits, Jul 22, 2024
53 changes: 51 additions & 2 deletions _search-plugins/knn/approximate-knn.md
@@ -141,7 +141,7 @@ The following table provides examples of the number of results returned by vario
10 | 1 | 1 | 4 | 4 | 1
10 | 10 | 1 | 4 | 10 | 10
10 | 1 | 2 | 4 | 8 | 2

The number of results returned by Faiss/NMSLIB differs from the number of results returned by Lucene only when `k` is smaller than `size`. If `k` and `size` are equal, all engines return the same number of results.

Starting in OpenSearch 2.14, you can use `k`, `min_score`, or `max_distance` for [radial search]({{site.url}}{{site.baseurl}}/search-plugins/knn/radial-search-knn/).
@@ -253,7 +253,56 @@ POST _bulk
...
```

After data is ingested, it can be search just like any other `knn_vector` field!
After data is ingested, it can be searched in the same way as any other `knn_vector` field.

### Additional query parameters

Starting in OpenSearch 2.16, you can provide `method_parameters` in a search request:

```json
GET my-knn-index-1/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 2,
        "method_parameters": {
          "ef_search": 100
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

The supported parameters depend on the combination of engine and method used to create the index. The following sections provide information about the supported `method_parameters`.

#### `ef_search`

You can provide the `ef_search` parameter when searching an index created using the `hnsw` method. The `ef_search` parameter specifies the number of vectors to examine in order to find the top k nearest neighbors. Higher `ef_search` values improve recall at the cost of increased search latency. The value must be positive.

The following table provides information about the `ef_search` parameter for the supported engines.

Engine | Radial query support | Notes
:--- | :--- | :---
`nmslib` | No | If `ef_search` is present in a query, it overrides the `index.knn.algo_param.ef_search` index setting.
`faiss` | Yes | If `ef_search` is present in a query, it overrides the `index.knn.algo_param.ef_search` index setting.
`lucene` | No | When creating a search query, you must specify `k`. If you provide both `k` and `ef_search`, then the larger value is passed to the engine. If `ef_search` is larger than `k`, you can provide the `size` parameter to limit the final number of results to `k`.
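
As a sketch of the Lucene behavior described in the preceding table, the following request sets `ef_search` higher than `k` and uses `size` to cap the number of returned hits. The index name `my-lucene-index` and the field name `my_vector` are hypothetical and assume an index created using the `lucene` engine and the `hnsw` method:

```json
GET my-lucene-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [2, 3, 5, 6],
        "k": 2,
        "method_parameters": {
          "ef_search": 100
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

Under these assumptions, the larger of the two values (`100`) is passed to the engine, and `size` limits the response to 2 results.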

#### `nprobes`

You can provide the `nprobes` parameter when searching an index created using the `ivf` method. The `nprobes` parameter specifies the number of clusters to examine in order to find the top k nearest neighbors. Higher `nprobes` values improve recall at the cost of increased search latency. The value must be positive.

The following table provides information about the `nprobes` parameter for the supported engines.

Engine | Notes
:--- | :---
`faiss` | If `nprobes` is present in a query, it overrides the value provided when creating the index.
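
For illustration, a request that sets `nprobes` at query time might look like the following. The index name `my-ivf-index`, the field name `my_vector`, and the value `5` are hypothetical and assume an index created using the `faiss` engine and the `ivf` method:

```json
GET my-ivf-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [2, 3, 5, 6],
        "k": 2,
        "method_parameters": {
          "nprobes": 5
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

In this sketch, the query-time `nprobes` value takes precedence over the value provided when the index was created.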

### Using approximate k-NN with filters
