Make function scores functions tunable #6955

Closed
clintongormley opened this issue Jul 22, 2014 · 9 comments

Comments

@clintongormley
Contributor

With the function_score query, each function can return a different range of values. The decay functions return a value between 0 and 1. The field_value_factor function can return any value, but with (e.g.) logarithmic functions a typical score is between 0 and 3. The random_score function currently returns a very large number (but see #6907), and script_score can return whatever value you calculate.

It isn't easy to tune the contribution of each function. If you have two decay clauses: one for location and one for price, you can't easily say that "location is more important than price".

What about making each function accept a boost_factor parameter, which is multiplied by the output of the function?
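A minimal sketch of the idea, not Elasticsearch code: it assumes the per-function outputs are simply summed, and the `combine` helper, the document scores, and the factor of 3 are all made up for illustration. It shows how a per-function multiplier lets location outrank price when the raw outputs would otherwise tie.

```python
def combine(outputs, factors=None):
    """Sum each function's output times its multiplier.

    `outputs` maps a function name to the raw value it returned for one
    document. `factors` maps a function name to a hypothetical per-function
    multiplier; functions without an entry use 1.0. Summing is an assumption
    made for this sketch.
    """
    factors = factors or {}
    return sum(value * factors.get(name, 1.0) for name, value in outputs.items())


# Two hypothetical documents with decay outputs between 0 and 1.
near_but_pricey = {"location": 0.9, "price": 0.2}
far_but_cheap = {"location": 0.2, "price": 0.9}

# Without multipliers the two documents score the same.
print(combine(near_but_pricey), combine(far_but_cheap))

# Multiplying the location output by 3 makes the nearby document win.
location_first = {"location": 3.0}
print(combine(near_but_pricey, location_first), combine(far_but_cheap, location_first))
```

With the multiplier in place, "location is more important than price" becomes a single number to tune rather than a change to the decay parameters themselves.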

@clintongormley
Contributor Author

/cc @brwe @rjernst

@nik9000
Member

nik9000 commented Jul 22, 2014

Can you use multiple rescores as a temporary workaround for this? I haven't
checked the whole API, but I imagine so. If so, it might be worth checking how
much this improves performance over the workaround. It could be substantial
because multiple rescores have to sort between each one.
On Jul 22, 2014 6:10 AM, "Lee Hinman" wrote:

+1 on boost_factor for any sub function


@brwe
Contributor

brwe commented Jul 23, 2014

@nik9000 I do not think there is a workaround currently, which is unfortunate.

I think having the boost factor for each function would make a lot of sense.
Alternatively, I thought we could nest function_score inside functions, as in the example below, and then use each function again together with the score computed inside the query. That would allow lots of flexibility but would be a JSON nightmare, I guess...

@s1monw mentioned we could use expressions and expose many of the current functions as expressions. That way everyone could easily write their own custom function (and add whatever factor they wish) without performance loss.

But implementing a boost factor for each function seems like the easiest thing to do right now.

{
  "query": {
    "function_score": {
      "query": {},
      "functions": [
        {
          "boost_factor": 2,
          "query": {
            "function_score": {
              "functions": [
                {
                  "gauss": {
                    "FIELD": {
                      "origin": "...",
                      "scale": "..."
                    }
                  }
                }
              ]
            }
          }
        },
        {
          "boost_factor": 2,
          "query": {
            "function_score": {
              "functions": [
                {...}
              ]
            }
          }
        }
      ]
    }
  }
}

(also cc @honzakral )

@brwe brwe self-assigned this Jul 25, 2014
@brwe
Contributor

brwe commented Jul 25, 2014

If we implement the boost_factor thing, I think weight might be a better name for it.
Here is what this could look like:

{
  "query": {
    "function_score": {
      "functions": [
        {
          "exp": {
            ...
          },
          "weight": 2
        },
        {
          "random_score": {
            ...
          },
          "weight": 0.01
        }
      ]
    }
  }
}

If there is only one function, this weight does not seem to make sense, so I would not allow it.

@clintongormley
Contributor Author

I like weight, although boost_factor wouldn't need it. I'm torn about supporting it if there is only one function. I understand that you should use boost there instead, but...

Hmmm, I think you're right.

@honzakral
Contributor

I see the use case for a single function: a lot of people have simple
queries that they then expand upon by adding more clauses. At that point it
would be simpler to allow this.

@brwe brwe removed the adoptme label Jul 25, 2014
@alexksikes
Contributor

This seems to be an interesting use case until we get another use case. Why not simply use mathematical expressions to express the different decay functions or a combination of them? So I'm definitely +1 on exposing the various functions with Lucene expressions, and on implementing new ones.

@brwe
Contributor

brwe commented Jul 29, 2014

The main reason why we added the decay functions as JSON with parameters was that not all users like scripting. So, maybe we should have both?

Adding the different functions we have now in function_score to expressions is not tied to function_score, so I think we should have a different issue for that.

@honzakral sorry, I was not clear about the single function. I was referring to the case where the score function is not in the functions list, like this:

"function_score": {
    "boost_mode": "replace",
    "query": {...},
    "script_score/random/...": {
        ....
    }
}

It does not make sense to me to allow it there, but I could do it.

@rjernst
Member

rjernst commented Jul 31, 2014

I'm ok with adding a weight parameter to all functions.

brwe added a commit to brwe/elasticsearch that referenced this issue Aug 29, 2014
Weights can be defined per function like this:

```
"function_score": {
    "functions": [
        {
            "filter": {},
            "FUNCTION": {},
            "weight": number
        }
        ...
```
If `weight` is given without `FUNCTION` then `weight` behaves like `boost_factor`.
This commit deprecates `boost_factor`.

The following is valid:

```
POST testidx/_search
{
  "query": {
    "function_score": {
      "weight": 2
    }
  }
}
POST testidx/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "weight": 2
        },
        ...
      ]
    }
  }
}
POST testidx/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "FUNCTION": {},
          "weight": 2
        },
        ...
      ]
    }
  }
}
POST testidx/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "filter": {},
          "weight": 2
        },
        ...
      ]
    }
  }
}
POST testidx/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "filter": {},
          "FUNCTION": {},
          "weight": 2
        },
        ...
      ]
    }
  }
}
```

The following is not valid:

```
POST testidx/_search
{
  "query": {
    "function_score": {
      "weight": 2,
      "FUNCTION(including boost_factor)": 2
    }
  }
}

POST testidx/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "weight": 2,
          "boost_factor": 2
        }
      ]
    }
  }
}
```

closes elastic#6955
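
The valid and invalid combinations listed in the commit message can be read as two rules: at the top level of `function_score`, `weight` may not sit next to a function, and inside a `functions` entry, `weight` may not sit next to `boost_factor`. The following is a rough sketch of those two rules only, not the actual parser. The `is_valid` helper and the `FUNCTION_NAMES` set are hypothetical, and the set lists only function names mentioned in this thread.

```python
# Function names mentioned in this thread. This set is an assumption made for
# the sketch; the real parser may accept other names.
FUNCTION_NAMES = {
    "gauss",
    "exp",
    "field_value_factor",
    "random_score",
    "script_score",
    "boost_factor",
}


def is_valid(function_score):
    """Return True if the body follows the weight rules from the commit message."""
    # Top level: weight may stand alone but not next to any function.
    if "weight" in function_score and FUNCTION_NAMES.intersection(function_score):
        return False
    for entry in function_score.get("functions", []):
        # Inside the list: weight may accompany a filter or a function,
        # but not boost_factor.
        if "weight" in entry and "boost_factor" in entry:
            return False
    return True


print(is_valid({"weight": 2}))
print(is_valid({"functions": [{"filter": {}, "gauss": {}, "weight": 2}]}))
print(is_valid({"weight": 2, "boost_factor": 2}))
print(is_valid({"functions": [{"weight": 2, "boost_factor": 2}]}))
```

The sketch checks nothing else about the request, so a body it accepts may still be rejected for unrelated reasons.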
@brwe brwe closed this as completed in c5ff70b Sep 1, 2014
brwe added a commit that referenced this issue Sep 1, 2014
closes #6955
closes #7137
brwe added a commit that referenced this issue Sep 8, 2014
closes #6955
closes #7137
6 participants