Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Allow override individual sub phase of Fetch phase #14546

Open
martin-gaievski opened this issue Jun 25, 2024 · 2 comments
Open
Assignees
Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc

Comments

@martin-gaievski
Copy link
Member

Is your feature request related to a problem? Please describe

Currently there is no way of customizing results of individual sub-phase of fetch phase. If I need to skip adding one sub-phase results or provide my own sub-phase that does some specific logic instead of existing sub-phase it's not possible.

Describe the solution you'd like

Ideally I need one of following:

  • override existing fetch sub-phase by registering new sub-phase with specific logic, and then system execute that new sub-phase instead of one that system has by default
  • execute new sub-phases in addition to those we have now, but guarantee the order of execution. In such way new sub-phase can override the result computed by core sub-phases.

Second approach has a drawback - if newly registered sub-phase does the job similar to one of existing phases then system will do double job and one from existing core sub-phases will be thrown away work.

For the context - when talking about new fetch sub-phase I mean existing extension point in SearchPlugin interface.

Related component

Search

Describe alternatives you've considered

There is no alternative as such

Additional context

Example of use case where this functionality is needed - we want to manipulate with results of inner_hits feature of the query. Today is no possible because InnerFetchPhase is always added and executed last after all other sub-phases (as per this code in FetchPhase)

@martin-gaievski martin-gaievski added enhancement Enhancement or improvement to existing feature or request untriaged labels Jun 25, 2024
@github-actions github-actions bot added the Search Search query, autocomplete ...etc label Jun 25, 2024
@mch2 mch2 removed the untriaged label Jun 26, 2024
@jed326
Copy link
Collaborator

jed326 commented Jun 26, 2024

Thanks for opening the issue @martin-gaievski!

If I need to skip adding one sub-phase results or provide my own sub-phase that does some specific logic instead of existing sub-phase it's not possible.

To clarify, you are able to provide your own sub-phase logic today with the existing SearchPlugin interface, but the problem you're facing is that you actually want to replace the existing InnerHitsPhase, is that correct?

As for your described solutions, I have similar concerns about the ordering approach, and beyond what you described in general I think what you are trying to achieve with ordering will make it so that each sub-phase will have to do a lot of type checking since it's functionality is actually tightly coupled with the order.

For your first solution, I'm also a little concerned about if it truly makes sense for a plugin to be able to override some of the other fetch subphases. For example why would we want to allow the FetchSourcePhase to be overridden? In terms of implementation this probably isn't too difficult though, we could make each fetch subphase have a unique name that is overridable by plugins, kind of similar to what I described in #13978 for the query contexts.

Given that it actually seems like you have a fairly narrow use case here, have you thought about if it would make sense to add an extension point into the InnerHitsPhase class itself instead of to all fetch sub phases? Admittedly I haven't looked to far into the implementation, but for example I see that the HighlightPhase takes in a pluggable list of Highlighters so would that model work better for your needs?

@jed326 jed326 removed their assignment Jun 26, 2024
@martin-gaievski
Copy link
Member Author

@jed326 I have a use case where the logic in InnerHits phase needs customization. In particular the main logic on getting hits and fetching their source works fine, what needs to be changed is the scores. Multiple processors in semantic search actually change scores (rerank, normalization), but if inner_hits are requested by user then scores for those inner hits will not be processed as per logic of those processors, which creates inconsistency in results.

At the glance this approach with customized providers for the InnerHits phase looks easier than generic mechanism of overriding the whole fetch sub-phase. Logic probably should be more sophisticated comparing to highlighters, mainly because we don't need to simple append results, but to override elements of results by providing custom values or set them empty.
It can be InnerHits providers for elements of individual hit, like score, max score, total hits etc. Or we get go one level higher and do custom provider for topdDocs, because this is what essentially is used to get most of the hit info: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/search/fetch/subphase/InnerHitsPhase.java#L90. Fundamental limitation is that while system allows customization of doc collector via query phase searcher, in the inner hits phase it always uses TopScoreDocsCollector, it's hardcoded in NestedQueryBuilder
Logic can be - check if there is a provider registered for an element of the hit, if yes - get info from it, otherwise use existing logic.
I don't think we need to override _source section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc
Projects
Status: Later (6 months plus)
Development

No branches or pull requests

3 participants