[BUG] Duplicate search results in hybrid query when sorting is enabled #1044

Closed
martin-gaievski opened this issue Dec 25, 2024 · 0 comments
Labels
bug Something isn't working v2.19.0

What is the bug?

On clusters with multiple nodes, multiple shards, and at least one replica, a hybrid query with sorting returns duplicate or inconsistent results. In the inconsistent case, the same field has one value in the document source and a different value in the sort section of the same hit.

I was able to reproduce this on a cluster with 2 nodes and an index with 5 primary shards and 1 replica. The search request uses both sort and search_after.
Reproduced on the latest 2.x (2.19 snapshot) and on main.

Here is the search request:

GET /my_index/_search?search_pipeline=nlp-search-pipeline
{
    "size": 20,
    "query": {
        "hybrid": {
            "queries": [
                {
                    "bool": {
                        "must": [
                            {
                                "multi_match": {
                                    "query": "christmas",
                                    "fields": [
                                        "name^1.10",
                                        "name.keyword",
                                        "tags.name^1.05",
                                        "generatedTags.name"
                                    ]
                                }
                            }
                        ]
                    }
                },
                {
                    "knn": {
                        "embedding": {
                            "vector": [
                               <embedding_vector>
                            ],
                            "k": 10
                        }
                    }
                }
            ]
        }
    },
    "sort": [
        {
            "trendingScore": {
                "order": "desc"
            }
        }
    ],
    "_source": [
        "name",
        "trendingScore"
    ],
    "search_after": [
        709.0
    ]
}

Example response. The wrong data is in the last (8th) document: trendingScore is 0 in _source, but 3.0 for the same field in the sort section.

{
    "took": 455,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 397,
            "relation": "eq"
        },
        "max_score": 0.8442519,
        "hits": [
            {
                "_index": "templates_prod",
                "_id": "bk2m5np8h467r92kmalxdcvft",
                "_score": null,
                "_source": {
                    "trendingScore": 303.125,
                    "name": "Summer Breeze Design"
                },
                "sort": [
                    303.125
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "bk2m6w28201xn8m3vb6mmdebn",
                "_score": null,
                "_source": {
                    "trendingScore": 303.125,
                    "name": "Winterfrost"
                },
                "sort": [
                    303.125
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2nhbsz900lwxe0xcpir0duc",
                "_score": null,
                "_source": {
                    "trendingScore": 19,
                    "name": "Sunshine Days - Nature Greeting"
                },
                "sort": [
                    19.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2n8y4h900csue0x7ij8rwuj",
                "_score": null,
                "_source": {
                    "trendingScore": 13,
                    "name": "Mountain Vista Wedding Suite - Save the Date"
                },
                "sort": [
                    13.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2nau2a900e7ue0x2h3nheb9",
                "_score": null,
                "_source": {
                    "trendingScore": 10,
                    "name": "Ocean Waves Thank You"
                },
                "sort": [
                    10.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak305ic900fr3xe0xjuiin2nt",
                "_score": null,
                "_source": {
                    "trendingScore": 10,
                    "name": "Midnight Dreams - Elegant Celebration"
                },
                "sort": [
                    10.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2n2r1t007au00xsn82pvg1",
                "_score": null,
                "_source": {
                    "trendingScore": 6,
                    "name": "Forest Pine - Rustic Wedding Invitation"
                },
                "sort": [
                    6.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2n0pxe0060u00xxz0afuaa",
                "_score": null,
                "_source": {
                    "trendingScore": 0,
                    "name": "Modern Romance - Classic Black Wedding Invitation"
                },
                "sort": [
                    3.0
                ]
            }
        ]
    }
}
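
To make the inconsistency easier to spot across larger result pages, a small check like the one below could be run against a parsed response. This is only a sketch for triage, not part of any plugin code: the function names `find_sort_mismatches` and `find_duplicate_ids` are hypothetical, and it assumes the first sort key corresponds to the field being compared. The `sample` data is a trimmed copy of two hits from the response above.

```python
from collections import Counter


def find_sort_mismatches(response, field, sort_index=0):
    """Return (_id, source_value, sort_value) for hits where the field value
    in _source differs from the value at sort_index in the hit's sort array.

    Hypothetical helper; assumes the sort key at sort_index is `field`.
    """
    mismatches = []
    for hit in response["hits"]["hits"]:
        source_value = hit.get("_source", {}).get(field)
        sort_values = hit.get("sort") or []
        # Skip hits that do not carry both values.
        if source_value is None or len(sort_values) <= sort_index:
            continue
        sort_value = sort_values[sort_index]
        # Compare numerically so that 19 and 19.0 count as equal.
        if float(source_value) != float(sort_value):
            mismatches.append((hit["_id"], source_value, sort_value))
    return mismatches


def find_duplicate_ids(response):
    """Return the _id values that appear more than once in the hits."""
    counts = Counter(hit["_id"] for hit in response["hits"]["hits"])
    return [doc_id for doc_id, n in counts.items() if n > 1]


# Two hits copied from the response above (other fields omitted).
sample = {
    "hits": {
        "hits": [
            {
                "_id": "ak2n2r1t007au00xsn82pvg1",
                "_source": {"trendingScore": 6},
                "sort": [6.0],
            },
            {
                "_id": "ak2n0pxe0060u00xxz0afuaa",
                "_source": {"trendingScore": 0},
                "sort": [3.0],
            },
        ]
    }
}

print(find_sort_mismatches(sample, "trendingScore"))
print(find_duplicate_ids(sample))
```

For the sample above, only the last document is reported as a mismatch, and no duplicate ids are found in this page. Duplicates that span pages would need the ids from consecutive search_after requests to be collected before counting.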