Package queryparser parses a Google-like search string to return a Query object.
The resulting Query object can be used as an interim representation between user input and something like ElasticSearch's Query DSL.
A simple example:
package main
import (
"fmt"
"github.com/mergermarket/queryparser"
"strings"
)
func main() {
query, _ := queryparser.NewParser(strings.NewReader(`(apple OR pear) AND (pie OR crumble)`)).Parse()
fmt.Println(query)
}
output:
{Occur:MUST, SubQueries:[{Occur:SHOULD, Terms:["apple","pear"]},{Occur:SHOULD, Terms:["pie","crumble"]}]}
Query is a simple but hopefully useful model of the intention behind the parsed string.
- Occur: how to determine whether a document matches the query.
- MUST = all query components must match.
- SHOULD = at least one query component must match.
- Terms: zero or more words or phrases
- SubQueries: zero or more Query objects. So yeah wow like recursive, man.
The parser can handle arbitrary combinations of:
- single word
- phrase surrounded by quotes
- any of the above combined with boolean operators "OR" and "AND"
- any of the above grouped with parentheses
Query component | Example | Intention |
---|---|---|
single word | kayak | Match documents containing the word kayak |
multiple words | kayak hammer | Match documents containing both kayak and hammer , in any order, not necessarily together. |
" " | "roof slate" | Match documents containing the exact phrase roof slate . |
( ) | (baboon OR warthog) snake | Match documents containing at least one of baboon or warthog , and also snake . |
AND | book AND pdf | Match both query components. Equivalent to book pdf |
OR | tree OR shrub | Match either query component. |
Look at the parser tests for more examples.
Operators have the following precedence, in descending order:
Operator |
---|
( ) |
AND |
OR |
For example, because AND has higher precedence than OR, the following queries:
apple AND pear OR rhubarb
apple OR pear AND rhubarb
evaluate as if written as follows:
(apple AND pear) OR rhubarb
apple OR (pear AND rhubarb)
Not yet supported:
- "foo NOT bar" match only documents that contain "foo" but not "bar"
- "-bar" match only documents that do not contain "bar"