Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support one_to_one in ML Inference Search Response Processor #2801

Merged
merged 6 commits into from
Aug 21, 2024

Conversation

mingshl
Copy link
Collaborator

@mingshl mingshl commented Aug 2, 2024

Description

Many document to one prediction is the default config in ML Inference Search Response Processor. This PR adds one document to one prediction support following up with #2688

in 2.16, ml inference search response processor support collect documents into a list of model input and make one prediction call.

for example,

the search response returns as following, and we would like to use the text field in the document to send to model input.

{
  "took": 100,
  "timed_out": false,
  "_shards": {
    "total": 22,
    "successful": 22,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "gte"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": " this is document 1"
      },
     {
            "_index": "test_index",
            "_id": "2",
            "_score": 1,
            "_source": {
              "text": " this is document 2"
          } 
      ] 
}

one_to_one inference is false setting will combine the two document containing the field text into ["this is document 1", "this is document 2"] and this is one round of prediction.

in 2.17, we support one_to_one inference is true setting, that will make two rounds of prediction, the first prediction, we send "this is document 1" and the second prediction we send "this is document 2".

when users want to use a model that accept a list of model input, for example, using openai embedding model, we can set one_to_one is false,

when users want to use a model that accept a single string model input, for example, using bedrock embedding model, we can set one_to_one is true. The most common use case for one-to-one inference is rerank use case using xgboost, which usually take a single document and compare with the search string and return a single score.

Related Issues

#2173
#2444
#2839

#2879

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@mingshl
Copy link
Collaborator Author

mingshl commented Aug 3, 2024

We need to bump the version in the build.gradle file to let the CI runs.

> Could not resolve all dependencies for configuration ':opensearch-ml-plugin:runtimeClasspath'.
   > Conflict found for the following module:
       - org.apache.httpcomponents.core5:httpcore5 between versions 5.2.5 and 5.2.2
       - 

} catch (Exception e) {
if (ignoreFailure) {
responseListener.onResponse(response);
} else {
responseListener.onFailure(e);
responseListener.onFailure(new RuntimeException(e.getMessage()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why encapsulate the original exception to RuntimeException?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid 500x error

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we getting 5x error in the first place?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest use OpenSearchStatusException to replace RuntimeException.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see 5xx exception, it's for precautions. changed to OpenSearchStatusException similar to

throw new OpenSearchStatusException("Failed to search index " + indexName, RestStatus.BAD_REQUEST);

added in commit use OpenSearchStatusException in error handling

@dhrubo-os
Copy link
Collaborator

dhrubo-os commented Aug 12, 2024

Thanks for the explanation. IMO, one_to_one doesn't not seem intuitive. May be single_document_inference? Or document_wise_inference?

in 2.16, we support one_to_one inference is true setting,

So we already released this one_to_one inference in 2.16?

Where do we save this setting?

@mingshl
Copy link
Collaborator Author

mingshl commented Aug 12, 2024

Thanks for the explanation. IMO, one_to_one doesn't not seem intuitive. May be single_document_inference? Or document_wise_inference?

in 2.16, we support one_to_one inference is true setting,

So we already released this one_to_one inference in 2.16?

Where do we save this setting?

aw ha, it should be 2.17, yeah this change is going to add to 2.17. just modified the description.

@mingshl
Copy link
Collaborator Author

mingshl commented Aug 13, 2024

Thanks for the explanation. IMO, one_to_one doesn't not seem intuitive. May be single_document_inference? Or document_wise_inference?

in 2.16, we support one_to_one inference is true setting,

So we already released this one_to_one inference in 2.16?

Where do we save this setting?

in the ml search response processor, we already had it in 2.16 but it's not supported to turn on to true yet.

@dhrubo-os
Copy link
Collaborator

in the ml search response processor, we already had it in 2.16 but it's not supported to turn on to true yet.

can we change the settings name? or we need to stick with one_to_one?

Zhangxunmt
Zhangxunmt previously approved these changes Aug 13, 2024
@mingshl
Copy link
Collaborator Author

mingshl commented Aug 13, 2024

in the ml search response processor, we already had it in 2.16 but it's not supported to turn on to true yet.

can we change the settings name? or we need to stick with one_to_one?

the name is added since 2.16, if we change the name in 2.17, it might case confusion.

@mingshl mingshl had a problem deploying to ml-commons-cicd-env August 18, 2024 23:29 — with GitHub Actions Failure
@mingshl mingshl had a problem deploying to ml-commons-cicd-env August 18, 2024 23:29 — with GitHub Actions Failure
Signed-off-by: Mingshi Liu <[email protected]>
@mingshl mingshl force-pushed the main-one-to-one-inference branch from 5b5d79f to f1aa03f Compare August 19, 2024 18:28
@mingshl mingshl had a problem deploying to ml-commons-cicd-env August 19, 2024 18:28 — with GitHub Actions Failure
@mingshl mingshl had a problem deploying to ml-commons-cicd-env August 19, 2024 18:29 — with GitHub Actions Failure
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 19, 2024 20:40 — with GitHub Actions Inactive
@mingshl mingshl had a problem deploying to ml-commons-cicd-env August 19, 2024 22:29 — with GitHub Actions Failure
@mingshl mingshl had a problem deploying to ml-commons-cicd-env August 20, 2024 00:27 — with GitHub Actions Failure
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 20, 2024 02:33 — with GitHub Actions Inactive
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 20, 2024 03:44 — with GitHub Actions Inactive
@mingshl
Copy link
Collaborator Author

mingshl commented Aug 20, 2024

All tests passed

@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 21, 2024 04:32 — with GitHub Actions Inactive
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 21, 2024 04:32 — with GitHub Actions Inactive
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 21, 2024 05:42 — with GitHub Actions Inactive
@mingshl mingshl merged commit 2a33c65 into opensearch-project:main Aug 21, 2024
7 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 21, 2024
* add one document to one prediction support

Signed-off-by: Mingshi Liu <[email protected]>

* rephrase javadoc

Signed-off-by: Mingshi Liu <[email protected]>

* use OpenSearchStatusException in error handling

Signed-off-by: Mingshi Liu <[email protected]>

* fix message

Signed-off-by: Mingshi Liu <[email protected]>

* add more tests

Signed-off-by: Mingshi Liu <[email protected]>

* handle different exceptions properly

Signed-off-by: Mingshi Liu <[email protected]>

---------

Signed-off-by: Mingshi Liu <[email protected]>
(cherry picked from commit 2a33c65)
mingshl added a commit that referenced this pull request Aug 27, 2024
…2843)

* add one document to one prediction support

Signed-off-by: Mingshi Liu <[email protected]>

* rephrase javadoc

Signed-off-by: Mingshi Liu <[email protected]>

* use OpenSearchStatusException in error handling

Signed-off-by: Mingshi Liu <[email protected]>

* fix message

Signed-off-by: Mingshi Liu <[email protected]>

* add more tests

Signed-off-by: Mingshi Liu <[email protected]>

* handle different exceptions properly

Signed-off-by: Mingshi Liu <[email protected]>

---------

Signed-off-by: Mingshi Liu <[email protected]>
(cherry picked from commit 2a33c65)

Co-authored-by: Mingshi Liu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants