[Enhancement] Implement pruning for neural sparse search #988

Merged: 19 commits merged into opensearch-project:main on Dec 18, 2024

zhichao-aws
Member

Description

Implement pruning for sparse vectors to save disk space and speed up search, at the cost of a small loss in search relevance. #946

  • Implement pruning in the sparse_encoding ingestion processor. Users can configure the pruning strategy when creating the processor, and the processor prunes the sparse vectors before writing them to the index.
  • Implement pruning in neural_sparse 2-phase search. Users can configure the pruning strategy when searching with a neural_sparse query. The query builder prunes the query before searching the index.
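
As a rough illustration of what pruning a sparse vector before indexing could look like, here is a minimal Java sketch of a top-k strategy: keep the k highest-weighted tokens and drop the rest. The class and method names (`TopKPruneSketch`, `pruneTopK`) are hypothetical and are not taken from this PR.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the PR's implementation.
final class TopKPruneSketch {

    // Keep the k tokens with the largest weights; ties are broken arbitrarily.
    static Map<String, Float> pruneTopK(Map<String, Float> sparseVector, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be a positive integer");
        }
        Map<String, Float> kept = new LinkedHashMap<>();
        sparseVector.entrySet()
            .stream()
            // Sort by weight, largest first
            .sorted(Map.Entry.<String, Float>comparingByValue(Comparator.reverseOrder()))
            .limit(k)
            .forEachOrdered(e -> kept.put(e.getKey(), e.getValue()));
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Float> vector = Map.of("hello", 2.0f, "world", 1.5f, "the", 0.1f, "a", 0.05f);
        // Only the two heaviest tokens survive pruning
        System.out.println(pruneTopK(vector, 2));
    }
}
```

Fewer tokens per document means fewer postings to store and score, which is where the disk and latency savings described above would come from.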

Related Issues

#946

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@zhichao-aws
Member Author

This PR is ready for review now

Collaborator

@heemin32 heemin32 left a comment

Could you provide an overview of how the overall API will look? I initially thought this change would only affect the query side, but it seems it will also modify the parameters for neural_sparse_two_phase_processor.

Additionally, the current implementation appears to be focused on two-phase processing with different strategies for splitting vectors, rather than a combination of pruning and two-phase processing?

@zhichao-aws
Member Author

zhichao-aws commented Nov 21, 2024

> Could you provide an overview of how the overall API will look? I initially thought this change would only affect the query side, but it seems it will also modify the parameters for neural_sparse_two_phase_processor.

Based on our benchmark results in #946, applying pruning in 2-phase search outperforms applying it to the neural sparse query body at search time, in both precision and latency. Therefore, enhancing the existing 2-phase search pipeline makes more sense.
To maintain compatibility with existing APIs, the overall API will look like:

# ingestion pipeline
PUT /_ingest/pipeline/sparse-pipeline
{
    "description": "Calling sparse model to generate expanded tokens",
    "processors": [
        {
            "sparse_encoding": {
                "model_id": "fousVokBjnSupmOha8aN",
                "pruning_type": "alpha_mass",
                "pruning_ratio": 0.8,
                "field_map": {
                    "body": "body_sparse"
                }
            }
        }
    ]
}

# two phase pipeline
PUT /_search/pipeline/neural_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "Creates a two-phase processor for neural sparse search.",
        "pruning_type": "alpha_mass",
        "pruning_ratio": 0.8
      }
    }
  ]
}

> Additionally, the current implementation appears to be focused on two-phase processing with different strategies for splitting vectors, rather than a combination of pruning and two-phase processing?

The existing two-phase processor uses the max_ratio pruning criterion. This PR adds support for other criteria as well.
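
To make the relationship between pruning and two-phase search concrete, the following is a minimal Java sketch of a max_ratio split. It assumes that a token counts as high-scoring when its weight is at least `ratio` times the largest weight in the query vector. High-scoring tokens would drive the first phase, and the remainder (`lowScores`, a name borrowed from a commit message in this PR) would be used for rescoring. The class, method, and `Split` type are hypothetical, and the accepted ratio range of [0, 1) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the PR's implementation.
final class MaxRatioSplitSketch {

    // Holds both halves of a split query vector
    static final class Split {
        final Map<String, Float> highScores = new HashMap<>();
        final Map<String, Float> lowScores = new HashMap<>();
    }

    // Assumed semantics: keep a token in highScores when weight >= ratio * maxWeight
    static Split splitByMaxRatio(Map<String, Float> queryTokens, float ratio) {
        if (ratio < 0f || ratio >= 1f) {
            throw new IllegalArgumentException("ratio is assumed to lie in [0, 1)");
        }
        float max = 0f;
        for (float weight : queryTokens.values()) {
            max = Math.max(max, weight);
        }
        float threshold = max * ratio;

        Split split = new Split();
        for (Map.Entry<String, Float> entry : queryTokens.entrySet()) {
            if (entry.getValue() >= threshold) {
                split.highScores.put(entry.getKey(), entry.getValue());
            } else {
                split.lowScores.put(entry.getKey(), entry.getValue());
            }
        }
        return split;
    }

    public static void main(String[] args) {
        Split split = splitByMaxRatio(Map.of("search", 1.0f, "engine", 0.5f, "the", 0.1f), 0.4f);
        System.out.println("high: " + split.highScores);
        System.out.println("low:  " + split.lowScores);
    }
}
```

Other criteria such as top_k or alpha_mass would replace only the rule that decides which side of the split a token lands on.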

@zhichao-aws changed the title from "[Feature] Implement pruning for neural sparse search" to "[Enhancement] Implement pruning for neural sparse search" on Nov 22, 2024

codecov bot commented Nov 22, 2024

Codecov Report

Attention: Patch coverage is 96.85535% with 5 lines in your changes missing coverage. Please review.

Project coverage is 81.27%. Comparing base (3c7f275) to head (7486ee8).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...opensearch/neuralsearch/util/prune/PruneUtils.java | 96.80% | 2 Missing and 1 partial ⚠️ |
| ...earch/processor/NeuralSparseTwoPhaseProcessor.java | 94.11% | 0 Missing and 1 partial ⚠️ |
| ...h/neuralsearch/query/NeuralSparseQueryBuilder.java | 83.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #988      +/-   ##
============================================
+ Coverage     80.47%   81.27%   +0.79%     
- Complexity     1000     1054      +54     
============================================
  Files            78       80       +2     
  Lines          3411     3535     +124     
  Branches        578      611      +33     
============================================
+ Hits           2745     2873     +128     
+ Misses          425      423       -2     
+ Partials        241      239       -2     


@zhichao-aws zhichao-aws requested a review from heemin32 November 22, 2024 07:18
Collaborator

@heemin32 heemin32 left a comment

LGTM. Thanks!

Member

@martin-gaievski martin-gaievski left a comment

Apart from a minor comment, why is this PR trying to merge into main?
If this changes the API used to define the processor, it should be reviewed by application security. For that, we need to merge to a feature branch in the main repo first, and merge from the feature branch to main only after it is cleared.

);
} else {
// if we don't have prune type, then prune ratio field must not have value
if (config.containsKey(PruneUtils.PRUNE_RATIO_FIELD)) {
Member

We can merge this if with the preceding else to form a single else-if block.

Collaborator

This else means PruneType is NONE, right? It seems this check can be moved to https://github.com/opensearch-project/neural-search/pull/988/files#diff-8453ea75f8259ba96c246d483b2de9e21601fb9c3d033e8902756f5d101f2238R262, where the input ratio is validated.

Member Author

> We can merge this if with the preceding else to form a single else-if block.

ack

Member Author

> This else means PruneType is NONE, right? It seems this check can be moved to https://github.com/opensearch-project/neural-search/pull/988/files#diff-8453ea75f8259ba96c246d483b2de9e21601fb9c3d033e8902756f5d101f2238R262, where the input ratio is validated.

We want to validate that the PRUNE_RATIO field is not provided at all. Any value is illegal when no prune type is set.
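
The rule discussed in this thread can be sketched as a single if / else-if chain. This is only an illustration: the class name is hypothetical, the key strings are copied from the API overview earlier in this conversation, and the requirement that a ratio must accompany a prune type is an assumption rather than something confirmed by the diff.

```java
import java.util.Map;

// Hypothetical sketch, not the PR's implementation.
final class PruneConfigValidationSketch {

    // Key names taken from the API overview comment; the merged code may use different ones
    static final String PRUNE_TYPE_FIELD = "pruning_type";
    static final String PRUNE_RATIO_FIELD = "pruning_ratio";

    static void validate(Map<String, Object> config) {
        if (config.containsKey(PRUNE_TYPE_FIELD)) {
            // Assumption: a prune type without a ratio is rejected
            if (!config.containsKey(PRUNE_RATIO_FIELD)) {
                throw new IllegalArgumentException(PRUNE_RATIO_FIELD + " is required when " + PRUNE_TYPE_FIELD + " is set");
            }
        } else if (config.containsKey(PRUNE_RATIO_FIELD)) {
            // No prune type: the ratio field must be absent, whatever its value
            throw new IllegalArgumentException(PRUNE_RATIO_FIELD + " must not be set without " + PRUNE_TYPE_FIELD);
        }
    }

    public static void main(String[] args) {
        validate(Map.of()); // accepted: pruning is simply disabled
        validate(Map.of(PRUNE_TYPE_FIELD, "alpha_mass", PRUNE_RATIO_FIELD, 0.8)); // accepted
        try {
            validate(Map.of(PRUNE_RATIO_FIELD, 0.8));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking for the presence of the key, rather than validating its value, is what distinguishes this branch from the per-type ratio validation.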


switch (pruneType) {
case TOP_K:
return pruneRatio > 0 && pruneRatio == Math.floor(pruneRatio);
Member

Suggested change
- return pruneRatio > 0 && pruneRatio == Math.floor(pruneRatio);
+ return pruneRatio > 0 && pruneRatio == Math.rint(pruneRatio);

This is more reliable for floating-point numbers; otherwise there is a chance of a false positive.

Collaborator

Replacing floor with rint doesn't seem correct. By definition, rint returns the even integer when two integers are equally close to the input. I tested with input 3.5: the floor result is 3 but the rint result is 4.

Member Author

Could you please give an example of false positive?
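
For reference, a small Java check of the two candidates. `Math.floor` and `Math.rint` do disagree on 3.5, but when used as an integrality test, `x == Math.floor(x)` and `x == Math.rint(x)` give the same answer for the sample values below, because both functions return `x` unchanged when `x` is already a whole number and a different value otherwise. The class and method names are hypothetical.

```java
// Hypothetical demo class, not part of the PR.
final class IntegralityCheckSketch {

    // The check as written in the PR snippet
    static boolean isPositiveIntegerViaFloor(float x) {
        return x > 0 && x == Math.floor(x);
    }

    // The check as proposed in the review suggestion
    static boolean isPositiveIntegerViaRint(float x) {
        return x > 0 && x == Math.rint(x);
    }

    public static void main(String[] args) {
        System.out.println(Math.floor(3.5)); // prints 3.0
        System.out.println(Math.rint(3.5));  // prints 4.0, ties round to the even integer

        float[] samples = { 1f, 3f, 3.5f, 2.5f, 0.8f, 10f, -2f };
        for (float x : samples) {
            // Both checks are expected to agree on every sample
            System.out.println(x + " floor=" + isPositiveIntegerViaFloor(x) + " rint=" + isPositiveIntegerViaRint(x));
        }
    }
}
```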

}
}

switch (pruneType) {
Member

Same as above: can we use a map instead of a switch?

@zhichao-aws
Member Author

@martin-gaievski Thanks for the comments. We didn't create a feature branch because no other contributors are working on this, so we treat the PR branch as the feature branch.

I'm on PTO this week; I will follow up on the app sec issue and address the comments next week.


* @param pruneType The type of prune strategy
* @throws IllegalArgumentException if prune type is null
*/
public static String getValidPruneRatioDescription(PruneType pruneType) {
Collaborator

[nit] this can be refactored to a static map.

Member Author

Please refer to the discussion with Martin above.
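
For readers unfamiliar with the suggestion, a static-map version of such a lookup could look roughly like the sketch below. It is not the code in this PR: the enum constants mirror strategy names mentioned in this conversation, and the description strings are illustrative wording only. The ranges given for ALPHA_MASS and MAX_RATIO are assumptions.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch, not the PR's implementation.
final class PruneRatioDescriptionSketch {

    // Constants mirror names seen in this thread; the real enum may differ
    enum PruneType { NONE, TOP_K, ALPHA_MASS, MAX_RATIO }

    // Illustrative descriptions; only the TOP_K and NONE rules are stated in the reviewed snippets
    private static final Map<PruneType, String> DESCRIPTIONS = new EnumMap<>(PruneType.class);
    static {
        DESCRIPTIONS.put(PruneType.NONE, "prune ratio must not be provided");
        DESCRIPTIONS.put(PruneType.TOP_K, "prune ratio should be a positive integer");
        DESCRIPTIONS.put(PruneType.ALPHA_MASS, "prune ratio should be in the range [0, 1)");
        DESCRIPTIONS.put(PruneType.MAX_RATIO, "prune ratio should be in the range [0, 1)");
    }

    static String getValidPruneRatioDescription(PruneType pruneType) {
        if (pruneType == null) {
            throw new IllegalArgumentException("Prune type cannot be null");
        }
        return DESCRIPTIONS.getOrDefault(pruneType, "");
    }

    public static void main(String[] args) {
        System.out.println(getValidPruneRatioDescription(PruneType.TOP_K));
    }
}
```

A switch makes a missing case easier to spot when a new enum constant is added, while a map keeps the data separate from the control flow; which reads better is a judgment call.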

Signed-off-by: zhichao-aws <[email protected]>
@martin-gaievski added the backport 2.x, enhancement, and v2.19.0 labels on Dec 17, 2024
Member

@martin-gaievski martin-gaievski left a comment

Left a minor comment. Relaxing the potential merge blocker: from an app sec point of view, the risk of this enhancement looks low.

@martin-gaievski martin-gaievski dismissed their stale review December 17, 2024 17:28

App security flagged this as a low-risk change; removing the blocker.

@zhichao-aws zhichao-aws merged commit e8fe284 into opensearch-project:main Dec 18, 2024
39 checks passed
@zhichao-aws added and removed the backport 2.x label on Dec 18, 2024
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-988-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e8fe2847a5237a03edd414a333799f7a5d2d8c7d
# Push it to GitHub
git push --set-upstream origin backport/backport-988-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-988-to-2.x.

zhichao-aws added a commit that referenced this pull request Dec 18, 2024
* add impl
* add UT
* rename pruneType; UT
* changelog
* ut
* add it
* change on 2-phase
* UT
* it
* rename
* enhance: more detailed error message
* refactor to prune and split
* changelog
* fix UT cov
* address review comments
* enlarge score diff range
* address comments: check lowScores non null instead of flag

Signed-off-by: zhichao-aws <[email protected]>
(cherry picked from commit e8fe284)
zhichao-aws added a commit that referenced this pull request Dec 19, 2024
* [Enhancement] Implement pruning for neural sparse search (#988)

(cherry picked from commit e8fe284)

* fix toList for jvm version

Signed-off-by: zhichao-aws <[email protected]>

* adapt for the gap of batch ingest between 2.x main

Signed-off-by: zhichao-aws <[email protected]>

---------

Signed-off-by: zhichao-aws <[email protected]>
zhichao-aws added a commit to zhichao-aws/neural-search that referenced this pull request Jan 6, 2025
…project#988)

heemin32 pushed a commit to heemin32/neural-search that referenced this pull request Jan 9, 2025
* [Enhancement] Implement pruning for neural sparse search (opensearch-project#988)

martin-gaievski pushed a commit that referenced this pull request Jan 10, 2025
Labels
backport 2.x, enhancement, v2.19.0
4 participants