Adding draft for optimizing hybrid search blog post. #3503
Signed-off-by: wrigleyDan <[email protected]>
@wrigleyDan - Thanks, Dan. I am awaiting feedback from Stavros, and then will submit for final editorial review by our team editor.
@pajuric Thanks!
Signed-off-by: wrigleyDan <[email protected]>
@wrigleyDan - I added Stavros' final feedback in the form of comments. If you could incorporate those and let me know, we'll get this through reviews on Monday.
Signed-off-by: wrigleyDan <[email protected]>
@pajuric Thanks for adding the comments. They should now all be integrated, together with most of the reviewdog feedback from the automatic checks (the items I found applicable). Technically I'm out until Jan 7, but I'm keeping an eye on incoming GitHub emails so that we can get this done sooner rather than later. Let me know if there's anything else that needs to be done and I'll see to it as soon as possible. Thanks!
@wrigleyDan Editorial review complete. Please see my comments and changes and let me know if you have any questions. Thanks!
Cc: @pajuric
The currently planned next steps include replicating the approach with a dataset that has higher judgment coverage and covers a different domain to see its generalizability.
Optimizing hybrid search typically is not the first step in search result quality optimization. Optimizing lexical search results first is especially important as the lexical search query is part of the hybrid search query. Bayesian optimization is an efficient technique to efficiently identify the best set of fields and field weights, sometimes also referred to as learning to boost.
Suggested change:
Optimizing hybrid search is not typically the first step in search result quality optimization. Optimizing lexical search results first is especially important because the lexical search query is part of the hybrid search query. Bayesian optimization is an efficient technique for efficiently identifying the best set of fields and field weights, sometimes also referred to as "learning to boost."
The straightforward approach of trying out 66 different combinations can be created more elegantly by applying a technique like Bayesian optimization as well. In particular for large search indexes and a large amount of queries we expect this to result in a performance improvement.
Suggested change:
The straightforward approach of trying out 66 different combinations can be performed more elegantly by applying a technique like Bayesian optimization as well. In particular, we expect this to result in a performance improvement for large search indexes and large numbers of queries.
Reciprocal rank fusion is another way of combining lexical search and neural search, currently under active development:
Suggested change:
Reciprocal rank fusion, currently under active development, is another way of combining lexical search and neural search:
* [https://github.com/opensearch-project/neural-search/issues/865](https://github.com/opensearch-project/neural-search/issues/865)
* [https://github.com/opensearch-project/neural-search/issues/659](https://github.com/opensearch-project/neural-search/issues/659)
We also plan to include this technique, as well to identify the best way of running hybrid search dynamically per query.
Suggested change:
We also plan to include this technique and to identify the best way of running hybrid search dynamically per query.
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Daniel Wrigley <[email protected]>
title: "Optimizing hybrid search in OpenSearch"
authors:
- dwrigley
date: 2024-12-xx
Update publish date to 2024-12-30
@nateynateynate - Please publish this on 12/30. Please let me know if you need a second maintainer to help push it. I can grab someone.
Updates as per the latest review Signed-off-by: Daniel Wrigley <[email protected]>
change date, last change from editorial review Signed-off-by: Daniel Wrigley <[email protected]>
add feedback link to OpenSearch forum Signed-off-by: Daniel Wrigley <[email protected]>
@natebower Thanks for the review. I incorporated the suggestions to the best of my knowledge. Let me know if I missed any.
@wrigleyDan A couple last suggestions on rewrites resulting from my suggestions. Otherwise, it looks like all of my comments were addressed, so should be LGTM from my end 😄
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Daniel Wrigley <[email protected]>
Done, thanks again and Happy Holidays :)
Thanks, Dan. I have this scheduled for 12/30 to publish. Happy Holidays!
quick change requested for the filename @wrigleyDan
…search-optimization.md Signed-off-by: Daniel Wrigley <[email protected]>
thanks @wrigleyDan
Description
This PR adds a blog post draft as proposed in #3454 and as suggested by @pajuric
Issues Resolved
#3454
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.