Function score: score_mode avg looks broken #19068

Closed
spinscale opened this issue Jun 24, 2016 · 10 comments
Labels
>docs General docs changes :Search/Search Search-related issues that do not fall into other categories

Comments

@spinscale
Contributor

Elasticsearch version: 2.3.3

The function score with score_mode: avg looks broken:

PUT /foo/bar/1
{
  "name": "a b"
}

PUT /foo/bar/2
{
  "name": "b"
}

PUT /foo/bar/3
{
  "name": "a"
}

GET foo/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "filter": {
            "term": {
              "name": "a"
            }
          },
          "weight": 20
        },
        {
          "filter": {
            "term": {
              "name": "b"
            }
          },
          "weight": 10
        }
      ],
      "score_mode": "avg",
      "boost_mode": "replace"
    }
  }
}

All the scores are now 1.

If you change to "score_mode": "sum", everything works as expected, with scores of 10, 20, and 30.

@spinscale spinscale added the >bug label Jun 24, 2016
@polyfractal
Contributor

Well, I see the problem, but I don't know enough to understand why it's being done this way. Perhaps not for a real reason and it's just incorrect?

When a WeightFactorFunction is scored, it multiplies the score (1 in this case, since it's a filter) by its own weight. But when averaging the function scores, we explicitly check whether the function is a WeightFactorFunction, and if it is, divide by the weight. The net result is that each function with a weight divides out to a score of 1, which is why it isn't averaging the weights correctly.

I'm not sure why the WeightFactorFunction is special cased, but I think we can just remove the special case and increment the counter as normal?

@clintongormley clintongormley added the :Search/Search Search-related issues that do not fall into other categories label Jun 27, 2016
@clintongormley
Contributor

@brwe What do you think?

@brwe
Contributor

brwe commented Jun 28, 2016

avg computes the weighted average of the functions, see #8992 and #13732 for more explanation. You should be able to accomplish the averages of the values by having them returned by a script instead of using them as a weight.
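
To make the arithmetic concrete, here is a minimal Python sketch of a weighted average applied to the three documents from the report. It only illustrates the calculation and is not Elasticsearch's actual code; the names `weighted_avg` and `weighted_sum` and the `(value, weight)` pair layout are made up for this example, and it assumes at least one function matches each document. A bare `weight` function contributes a value of 1, so its weight appears in both the numerator and the denominator and cancels out.

```python
def weighted_avg(matching):
    """Weighted average over (value, weight) pairs of the matching functions."""
    numerator = sum(value * weight for value, weight in matching)
    denominator = sum(weight for _, weight in matching)
    return numerator / denominator


def weighted_sum(matching):
    """Sum of value * weight over the matching functions."""
    return sum(value * weight for value, weight in matching)


# A filter-only function with a weight has value 1.0 (hypothetical layout).
docs = {
    1: [(1.0, 20.0), (1.0, 10.0)],  # "a b" matches both filters
    2: [(1.0, 10.0)],               # "b" matches the second filter
    3: [(1.0, 20.0)],               # "a" matches the first filter
}

avg_scores = {doc_id: weighted_avg(m) for doc_id, m in docs.items()}
sum_scores = {doc_id: weighted_sum(m) for doc_id, m in docs.items()}

print(avg_scores)  # {1: 1.0, 2: 1.0, 3: 1.0}
print(sum_scores)  # {1: 30.0, 2: 10.0, 3: 20.0}
```

Under this reading, a score of 1 for every document with score_mode avg is the expected outcome rather than a bug, while sum yields 30, 10, and 20.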

@polyfractal
Contributor

Ah, I see. Perhaps we should re-add a basic, unweighted average as avg, and rename the existing avg to weighted_avg? Or vice versa: add an unweighted_avg and leave the current one as avg?

> You should be able to accomplish the averages of the values by having them returned by a script instead of using them as a weight.

I'm not sure this would work when you want to use more complex queries/filters? E.g. if docA matches (xyz AND abc) OR 123, return a non-weighted score of 25.

@jpountz
Contributor

jpountz commented Jun 28, 2016

I think Britta meant using a script_score function when she talked about using a script, rather than a script query. So something like this (not tested):

GET foo/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "filter": {
            "term": {
              "name": "a"
            }
          },
          "script_score": {
            "script": "20"
          }
        },
        {
          "filter": {
            "term": {
              "name": "b"
            }
          },
          "script_score": {
            "script": "10"
          }
        }
      ],
      "score_mode": "avg",
      "boost_mode": "replace"
    }
  }
}
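
The effect of this workaround can be sketched with the same arithmetic in Python. This is an illustration only, not a tested Elasticsearch result: it assumes that a function without an explicit `weight` counts with a weight of 1 in the average, and the names `weighted_avg`, `DEFAULT_WEIGHT`, and `script_values` are hypothetical.

```python
def weighted_avg(matching):
    """Weighted average over (value, weight) pairs of the matching functions."""
    numerator = sum(value * weight for value, weight in matching)
    denominator = sum(weight for _, weight in matching)
    return numerator / denominator


# Assumption: a function with no explicit weight counts with weight 1.
DEFAULT_WEIGHT = 1.0

# Constant values returned by the two script_score functions.
script_values = {"a": 20.0, "b": 10.0}

# Terms matched by each document in the original report.
docs = {1: ["a", "b"], 2: ["b"], 3: ["a"]}

scores = {
    doc_id: weighted_avg([(script_values[t], DEFAULT_WEIGHT) for t in terms])
    for doc_id, terms in docs.items()
}

print(scores)  # {1: 15.0, 2: 10.0, 3: 20.0}
```

Because the values now come from the scripts instead of the weights, they no longer cancel, and the document matching both filters gets the plain average of 20 and 10.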

@brwe
Contributor

brwe commented Jun 28, 2016

> I think Britta meant using a script_score

yes

> Perhaps we should re-add a basic, unweighted average as avg, and rename the existing avg to weighted_avg? Or vice versa: add an unweighted_avg and leave the current one as avg

If we find we need an unweighted average then I'd prefer the latter option.

@polyfractal
Contributor

Ah, I see. Thanks for the explanation :)

@clintongormley
Contributor

This isn't the first time this has come up. Perhaps it should be better explained in the docs? https://www.elastic.co/guide/en/elasticsearch/reference/2.3/query-dsl-function-score-query.html

@clintongormley clintongormley added >docs General docs changes and removed >bug labels Jun 29, 2016
@brwe
Contributor

brwe commented Jun 29, 2016

yes, I opened #19154

@colings86
Contributor

It seems like this was addressed in the documentation in #19154, so I'll close this issue. If there is more to be done, please reopen.

6 participants