Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG FIX] Add space type default and ef search parameter in warmup #276

Merged
merged 5 commits into from
Feb 10, 2022

Conversation

jmazanec15
Copy link
Member

Description

Warmup is used to load native library files into memory so that there are no initial latency penalties during search. The load process for search and warmup are very similar.

Currently, warmup api has a similar bug to #267. It does not default to a space type. For customers upgrading to 1.2 from ES 7.1 or 7.4 this will cause warmup failure. This only applies to 1.2. 1.1 and 1.0 do not have this bug in warmup.

Additionally, for nmslib, we allow the ef_search parameter to be set after indexing. To do this, we pass the parameter in during loading. In 1.2, we did not pass this parameter in. This will cause default ef_search to be used during warmup. To clean this up a bit, I abstracted generating load parameters into a utility function and used the same one for both search as well as warmup.

Issues Resolved

#255

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

Signed-off-by: John Mazanec <[email protected]>
For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 requested a review from a team February 9, 2022 23:13
* @param indexName Name of OpenSearch index that the segment files belong to
* @return load parameters that will be passed to the JNI.
*/
public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be: getParametersAtLoading ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Will update.

@@ -122,7 +122,8 @@ public void warmup() throws IOException {

for (FieldInfo fieldInfo : reader.getFieldInfos()) {
if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) {
SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE));
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about to add a default value here to make more readable?

Suggested change
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue());
String spaceTypeDefaultVaule = SpaceType.L2.getValue();
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, spaceTypeDefaultVaule);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make default value a constant

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I will update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way is fine, but i am leaning towards what you have :) since method itself gives the hint that it is get or Default :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I think I will just keep what I have. So, we do have SpaceType.DEFAULT, but in this case I do not want to use that because I know for sure that this should be L2. I will update with a comment.

* @return load parameters that will be passed to the JNI.
*/
public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) {
Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to mutate this map later? Seems that at the end we return mutable object from this method.

Suggested change
Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of(
final Map<String, Object> loadParameters;
if (KNNEngine.NMSLIB.equals(knnEngine)) {
loadParameters = ImmutableMap.of(
SPACE_TYPE, spaceType.getValue(),
HNSW_ALGO_EF_SEARCH, KNNSettings.getEfSearchParam(indexName)
);
} else {
loadParameters = ImmutableMap.of(SPACE_TYPE, spaceType.getValue());
}
return loadParameters;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we can just return UnModifiableMap.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer the unmodifiable map approach. Will go with that.

@@ -122,7 +122,8 @@ public void warmup() throws IOException {

for (FieldInfo fieldInfo : reader.getFieldInfos()) {
if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) {
SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE));
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make default value a constant

@codecov-commenter
Copy link

codecov-commenter commented Feb 10, 2022

Codecov Report

Merging #276 (fd8bd5e) into main (18e6e35) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##               main     #276   +/-   ##
=========================================
  Coverage     83.38%   83.38%           
- Complexity      884      885    +1     
=========================================
  Files           127      127           
  Lines          3833     3834    +1     
  Branches        361      361           
=========================================
+ Hits           3196     3197    +1     
  Misses          475      475           
  Partials        162      162           
Impacted Files Coverage Δ
.../main/java/org/opensearch/knn/index/IndexUtil.java 56.66% <100.00%> (+3.93%) ⬆️
...n/java/org/opensearch/knn/index/KNNIndexShard.java 93.02% <100.00%> (+0.16%) ⬆️
.../main/java/org/opensearch/knn/index/KNNWeight.java 75.34% <100.00%> (-1.59%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18e6e35...fd8bd5e. Read the comment docs.

Copy link
Member

@VijayanB VijayanB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@jmazanec15 jmazanec15 added backport 1.2 label to add to PRs to auto backport backport 1.x labels Feb 10, 2022
@jmazanec15 jmazanec15 merged commit 2fb2ad1 into opensearch-project:main Feb 10, 2022
github-actions bot pushed a commit that referenced this pull request Feb 10, 2022
)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 2fb2ad1)
@github-actions
Copy link

The backport to 1.2 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.2 1.2
# Navigate to the new working tree
cd .worktrees/backport-1.2
# Create a new branch
git switch --create backport/backport-276-to-1.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2fb2ad116bc11bde3eab5695aed65392943c08ae
# Push it to GitHub
git push --set-upstream origin backport/backport-276-to-1.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.2

Then, create a pull request where the base branch is 1.2 and the compare/head branch is backport/backport-276-to-1.2.

jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this pull request Feb 10, 2022
…pensearch-project#276)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 2fb2ad1)
jmazanec15 added a commit that referenced this pull request Feb 10, 2022
) (#285)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 2fb2ad1)
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 30, 2022
…pensearch-project#276)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.x backport 1.2 label to add to PRs to auto backport
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants