-
Notifications
You must be signed in to change notification settings - Fork 140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG FIX] Add space type default and ef search parameter in warmup #276
[BUG FIX] Add space type default and ef search parameter in warmup #276
Conversation
Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. Signed-off-by: John Mazanec <[email protected]>
For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
* @param indexName Name of OpenSearch index that the segment files belong to | ||
* @return load parameters that will be passed to the JNI. | ||
*/ | ||
public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be: getParametersAtLoading ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Will update.
@@ -122,7 +122,8 @@ public void warmup() throws IOException { | |||
|
|||
for (FieldInfo fieldInfo : reader.getFieldInfos()) { | |||
if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) { | |||
SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE)); | |||
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about to add a default value here to make more readable?
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue()); | |
String spaceTypeDefaultVaule = SpaceType.L2.getValue(); | |
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, spaceTypeDefaultVaule); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make default value a constant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, I will update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way is fine, but i am leaning towards what you have :) since method itself gives the hint that it is get or Default :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think I will just keep what I have. So, we do have SpaceType.DEFAULT, but in this case I do not want to use that because I know for sure that this should be L2. I will update with a comment.
* @return load parameters that will be passed to the JNI. | ||
*/ | ||
public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) { | ||
Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we going to mutate this map later? Seems that at the end we return mutable object from this method.
Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of( | |
final Map<String, Object> loadParameters; | |
if (KNNEngine.NMSLIB.equals(knnEngine)) { | |
loadParameters = ImmutableMap.of( | |
SPACE_TYPE, spaceType.getValue(), | |
HNSW_ALGO_EF_SEARCH, KNNSettings.getEfSearchParam(indexName) | |
); | |
} else { | |
loadParameters = ImmutableMap.of(SPACE_TYPE, spaceType.getValue()); | |
} | |
return loadParameters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or we can just return UnModifiableMap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer the unmodifiable map approach. Will go with that.
@@ -122,7 +122,8 @@ public void warmup() throws IOException { | |||
|
|||
for (FieldInfo fieldInfo : reader.getFieldInfos()) { | |||
if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) { | |||
SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE)); | |||
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make default value a constant
Signed-off-by: John Mazanec <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #276 +/- ##
=========================================
Coverage 83.38% 83.38%
- Complexity 884 885 +1
=========================================
Files 127 127
Lines 3833 3834 +1
Branches 361 361
=========================================
+ Hits 3196 3197 +1
Misses 475 475
Partials 162 162
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <[email protected]> (cherry picked from commit 2fb2ad1)
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.2 1.2
# Navigate to the new working tree
cd .worktrees/backport-1.2
# Create a new branch
git switch --create backport/backport-276-to-1.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2fb2ad116bc11bde3eab5695aed65392943c08ae
# Push it to GitHub
git push --set-upstream origin backport/backport-276-to-1.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.2 Then, create a pull request where the |
…pensearch-project#276) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <[email protected]> (cherry picked from commit 2fb2ad1)
) (#285) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <[email protected]> (cherry picked from commit 2fb2ad1)
…pensearch-project#276) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <[email protected]>
Description
Warmup is used to load native library files into memory so that there are no initial latency penalties during search. The load process for search and warmup are very similar.
Currently, warmup api has a similar bug to #267. It does not default to a space type. For customers upgrading to 1.2 from ES 7.1 or 7.4 this will cause warmup failure. This only applies to 1.2. 1.1 and 1.0 do not have this bug in warmup.
Additionally, for nmslib, we allow the ef_search parameter to be set after indexing. To do this, we pass the parameter in during loading. In 1.2, we did not pass this parameter in. This will cause default ef_search to be used during warmup. To clean this up a bit, I abstracted generating load parameters into a utility function and used the same one for both search as well as warmup.
Issues Resolved
#255
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.