Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Backport 2.x] Adds preset contentRegistry for IngestProcessors (#3281) #3317

Open
wants to merge 4 commits into
base: 2.x
Choose a base branch
from

Conversation

brianf-aws
Copy link
Contributor

Backport PR for add preset xContentRegistry to ingestProcessors for custom parametized local models

See: #3316

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…3281)

* add preset xContentRegistry to ingestProcessors for custom parametized local models

Curently local models that use the parameters map within the payload to create a request can not create objects to be used for local model prediction. This requires a opensearch core change because it needs the contentRegistry,however given there is not much dependency on the registry (currently) we can give it the preset registry given in the MachineLearningPlugin class vai the getNamedXContent() class

Signed-off-by: Brian Flores <[email protected]>

* Adds UT for proving models depend on xContentRegistry for prediction

Signed-off-by: Brian Flores <[email protected]>

* apply spotless

Signed-off-by: Brian Flores <[email protected]>

* Adds IT for Asymmetric Embedding scenario with MLInferenceIngestProcessor

We needed to make sure that a IT existed so that the preset content registry on the processor could work with parametized local models. By providing an IT that uses the asymetric embedding model its proven that the content registry is needed to create the embeddings. In this specific test case I used a ingest pipeline to convert passage embeddings, by simulating the pipeline to save test time.

Signed-off-by: Brian Flores <[email protected]>

---------

Signed-off-by: Brian Flores <[email protected]>
(cherry picked from commit df1b1ef)
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:43 — with GitHub Actions Failure
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:43 — with GitHub Actions Failure
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:43 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:43 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:43 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:43 — with GitHub Actions Error
Signed-off-by: Brian Flores <[email protected]>
@dhrubo-os
Copy link
Collaborator

Apply Spotless

@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:58 — with GitHub Actions Failure
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:58 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:58 — with GitHub Actions Failure
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:58 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:58 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 19:58 — with GitHub Actions Error
@dhrubo-os
Copy link
Collaborator

  2> REPRODUCE WITH: gradlew ':opensearch-ml-plugin:test' --tests "org.opensearch.ml.processor.MLInferenceIngestProcessorTests.testExecute_xContentRegistryNullWithLocalModel_throwsException" -Dtests.seed=8F68F36B61CF3E0C -Dtests.security.manager=false -Dtests.locale=uk-UA -Dtests.timezone=PRC -Druntime.java=11  2> 
  2> java.lang.AssertionError: expected:<this catch block should not get executed.> but was:<null>
        at __randomizedtesting.SeedInfo.seed([8F68F36B61CF3E0C:2EA1440C5D505F21]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at
        ```

The test passes on JAVA 21 but for some reason Java 11 has trouble passing in a message so it returns null when getMessage occurs so when Mockito tries to invoke getMessage it will get get a second NPE. and thus is why the exception is caught. This code change is to get around the gap that Java 11 has.

Signed-off-by: Brian Flores <[email protected]>
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 22:21 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 22:21 — with GitHub Actions Failure
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 22:21 — with GitHub Actions Failure
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 22:21 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 22:21 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 22:21 — with GitHub Actions Error
@mingshl
Copy link
Collaborator

mingshl commented Dec 31, 2024

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':opensearch-ml-plugin:spotlessJavaCheck'.
[Incubating] Problems report is available at: file:///D:/a/ml-commons/ml-commons/build/reports/problems/problems-report.html
> The following files had format violations:
      src\test\java\org\opensearch\ml\processor\MLInferenceIngestProcessorTests.java
          @@ -24,7 +24,6 @@
           import�org.junit.Assert;
           import�org.junit.Before;
           import�org.mockito.ArgumentCaptor;
          -import�org.mockito.ArgumentMatcher;
           import�org.mockito.Mock;
           import�org.mockito.Mockito;
           import�org.mockito.MockitoAnnotations;
          @@ -188,7 +187,7 @@
           ��������������������argThat(exception�->�exception�instanceof�NullPointerException�&&�exception.getMessage().equals(npeMessage))
           ����������������);
           ��������}�catch�(Exception�e)�{
          -//������������Java�11�doesn't�pass�the�message�correctly�resulting�in�an�anonymous�NPE�thus�this�getMessage()�results�in�null
          +������������//�Java�11�doesn't�pass�the�message�correctly�resulting�in�an�anonymous�NPE�thus�this�getMessage()�results�in�null
           ������������if�(e.getMessage()�!=�null)�{
           ����������������assertEquals("this�catch�block�should�not�get�executed.",�e.getMessage());
           ������������}
  Run 'gradlew.bat :opensearch-ml-plugin:spotlessApply' to fix these violations.


Signed-off-by: Brian Flores <[email protected]>
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 23:24 — with GitHub Actions Error
@brianf-aws brianf-aws temporarily deployed to ml-commons-cicd-env-require-approval December 31, 2024 23:24 — with GitHub Actions Inactive
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 23:24 — with GitHub Actions Failure
@brianf-aws brianf-aws temporarily deployed to ml-commons-cicd-env-require-approval December 31, 2024 23:24 — with GitHub Actions Inactive
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval December 31, 2024 23:24 — with GitHub Actions Error
@brianf-aws brianf-aws temporarily deployed to ml-commons-cicd-env-require-approval December 31, 2024 23:25 — with GitHub Actions Inactive
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval January 9, 2025 16:59 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval January 9, 2025 16:59 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval January 9, 2025 16:59 — with GitHub Actions Failure
@brianf-aws brianf-aws temporarily deployed to ml-commons-cicd-env-require-approval January 9, 2025 22:58 — with GitHub Actions Inactive
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval January 10, 2025 23:18 — with GitHub Actions Error
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval January 10, 2025 23:18 — with GitHub Actions Failure
@brianf-aws brianf-aws had a problem deploying to ml-commons-cicd-env-require-approval January 10, 2025 23:18 — with GitHub Actions Error
@brianf-aws brianf-aws temporarily deployed to ml-commons-cicd-env-require-approval January 10, 2025 23:18 — with GitHub Actions Inactive
@brianf-aws brianf-aws temporarily deployed to ml-commons-cicd-env-require-approval January 10, 2025 23:18 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants