Typeahead Search Completion Suggestions

Matthew Boynes edited this page Apr 18, 2020 · 1 revision

Typeahead Search Suggestions

SearchPress 0.4 includes a typeahead search suggestions API that uses Elasticsearch's Completion Suggester behind the scenes. Implementing it requires some custom development, but SearchPress lays the foundation to keep that work simple.

Getting Started

By default, the search suggestions API is disabled in SearchPress. To enable it, you need to filter sp_enable_search_suggest and return true:

add_filter( 'sp_enable_search_suggest', '__return_true' );

The search suggestions API makes mapping and indexing changes, so if your content is already indexed, you'll need to update your mapping and reindex your content.

Once your content is indexed, the search suggestions API is ready to go! SearchPress registers a REST API endpoint at /wp-json/searchpress/v1/suggest/<search fragment> which your website can query for search suggestions.

GET https://mysite.test/wp-json/searchpress/v1/suggest/anti

[
  {
    "text": "Antidisestablishmentarianism",
    "_score": 1,
    "_source": {
      "post_title": "Antidisestablishmentarianism",
      "post_id": 1175,
      "permalink": "https://mysite.test/2009/10/05/title-should-not-overflow-the-content-area/"
    }
  }
]

Pros and Cons

Elasticsearch's Completion Suggester is built for speed: it can respond to queries as fast as a user types, in single-digit milliseconds. To achieve that speed, it makes some trade-offs. Candidly, the downsides are significant, and you might decide that the Completion Suggester is not a viable solution for your use case. That doesn't mean you can't have search-as-you-type functionality; it just means taking a different route, such as using a different suggester or adding a field to your mapping that uses an n-gram tokenizer or token filter.

From our perspective, there are two major drawbacks to the Completion Suggester:

  1. Elasticsearch doesn't tokenize completion suggester fields the way it does other content, so its indexing is far less flexible. The most significant consequence is that Elasticsearch only matches from the beginning of the field (as if you were doing a prefix search). That is, if you have a post titled "Hello World" and the user types "world", that post will not be suggested with the out-of-the-box configuration. It will be suggested for "h", "he", "hello", "hello w", etc. To work around this limitation, you might index your searchable values in more than one order. For instance, if you are indexing authors, you might index names as both first-last and last-first, like "Kurt Vonnegut" and "Vonnegut Kurt".
  2. Completion data is indexed differently from the rest of the document. As a result, you can't make a new field searchable without reindexing all documents. Further, suggest data is stored in memory, so you wouldn't want to add large fields (e.g. post_content). This is an important consideration when working around the first drawback; you wouldn't want to permute through all possible word orders in post titles during indexing.
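
As a rough illustration of the name-ordering workaround, the sketch below adds each post author's name in both orders using the sp_search_suggest_data filter described under Customizations. It assumes (this page does not confirm it) that the filtered value is a flat list of input strings and that the sp_post_pre_index data carries the post ID under a post_id key, as the example response's _source suggests.

```php
// Sketch only: index author names as "First Last" and "Last First".
// Assumptions: $search_suggest_data is a flat list of strings, and
// $data['post_id'] holds the ID of the post being indexed.
add_filter(
	'sp_search_suggest_data',
	function ( $search_suggest_data, $data ) {
		if ( ! is_array( $search_suggest_data ) || empty( $data['post_id'] ) ) {
			return $search_suggest_data;
		}

		$author_id = (int) get_post_field( 'post_author', $data['post_id'] );
		$first     = trim( (string) get_the_author_meta( 'first_name', $author_id ) );
		$last      = trim( (string) get_the_author_meta( 'last_name', $author_id ) );

		// Only add the pair when both name parts are present.
		if ( '' !== $first && '' !== $last ) {
			$search_suggest_data[] = "$first $last";
			$search_suggest_data[] = "$last $first";
		}

		return $search_suggest_data;
	},
	10,
	2
);
```

Two short strings per post keep the in-memory footprint small, which is the reason to prefer a few deliberate orderings over every permutation.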

While the Completion Suggester is extremely fast, it's worth acknowledging that SearchPress wraps this API with a WordPress REST API endpoint, and responses will be significantly slower as a result. How much slower depends entirely on your site and how much functionality gets bootstrapped, but in our experience it adds at least 100 ms. If you want faster responses, you might consider bypassing WordPress for these requests altogether and using an ultralight middleware instead, like a very basic node app that does little more than proxy search requests.

Customizations

By default, only post titles are indexed for search suggestions, and only the post title, post ID, and permalink are returned in the REST API response. All of this is customizable.

Indexing

When this feature is enabled, SearchPress adds a search_suggest field of the type completion. To modify this, filter the mapping using the sp_config_mapping filter at a priority greater than 5, so that your callback runs after SearchPress has added the field.

At index time, SearchPress first determines if a post should get suggest data using the sp_search_suggest_post_is_searchable filter. If the post should get data added, SearchPress then filters the array of input data it will send Elasticsearch using the sp_search_suggest_data filter. By default, this is just the post's post_title. Add any additional data you want searchable using this filter.
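
For example, a site might want suggestions for only some post types. The following is a minimal sketch of that idea using sp_search_suggest_post_is_searchable; the allowlist is hypothetical, and the post_id key on the sp_post_pre_index data is an assumption based on the example response above.

```php
// Sketch only: limit suggest data to an allowlist of post types.
// Assumption: $data['post_id'] holds the ID of the post being indexed.
add_filter(
	'sp_search_suggest_post_is_searchable',
	function ( $is_searchable, $data ) {
		// Hypothetical allowlist; adjust to your site's post types.
		$allowed_types = array( 'post', 'page' );

		// Leave SearchPress's decision alone if it already said no,
		// or if there is no post ID to inspect.
		if ( ! $is_searchable || empty( $data['post_id'] ) ) {
			return $is_searchable;
		}

		return in_array( get_post_type( $data['post_id'] ), $allowed_types, true );
	},
	10,
	2
);
```

Because this runs at index time, changing the allowlist only takes effect for a post once that post is reindexed.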

Searching

At search time, SearchPress provides two key filters to modify searches and responses. The first, sp_search_suggest_query, filters the raw Elasticsearch query. You can use this to modify any aspects of the query, including the search string itself.

The results of that query are then pruned, and some possibly sensitive data is removed (like the index name). The pruned data is then filtered using the sp_search_suggest_results filter before being sent back to the user. Use this filter to modify any aspects of the response you need.

Be mindful that search-time operations need to be as fast as possible, so try to avoid database operations and anything else that might be time-consuming during these filters. For instance, if you want to add data to the response, it's preferable to add fields to _source using the sp_search_suggest_query filter, to get it directly from Elasticsearch, rather than look up the data in the database by post ID using the sp_search_suggest_results filter.
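
A minimal sketch of that recommendation follows. It assumes the request array carries a top-level _source list of field names (not confirmed on this page) and that post_type is a field present in your indexed documents; swap in whichever indexed field you need.

```php
// Sketch only: ask Elasticsearch to return one more field per suggestion.
// Assumptions: $request['_source'] is a list of field names, and
// 'post_type' exists in the indexed documents.
add_filter(
	'sp_search_suggest_query',
	function ( $request ) {
		// Append only when a _source list is already present, so that an
		// unexpected request shape is passed through untouched.
		if ( isset( $request['_source'] ) && is_array( $request['_source'] ) ) {
			$request['_source'][] = 'post_type';
		}

		return $request;
	}
);
```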

Hooks Reference

Filter: sp_search_suggest_post_is_searchable

Filter if the post should be searchable using search suggest.

By default, this assumes that search suggest would be used on the frontend, so a post must meet the criteria to be considered "public". That is, its post type and post status must exist within the sp_searchable_post_types() and sp_searchable_post_statuses() arrays, respectively.

If you're using search suggest in the admin, you should either always return true for this filter so that private post types and statuses show in the suggestion results, or add a second search suggest index with permissions-based access.
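
The first option can be sketched as a one-liner using WordPress's built-in __return_true helper. Since the filter is applied at index time, content indexed before this change would need to be reindexed. Note also that this adds non-public posts to the same suggest data the public endpoint reads from, so it likely only suits sites where that exposure is acceptable.

```php
// Sketch only: give every indexed post suggest data, regardless of
// post type or post status.
add_filter( 'sp_search_suggest_post_is_searchable', '__return_true' );
```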

Params:

  • bool $is_searchable Is this post searchable and thus should be added to the search suggest data?
  • array $data sp_post_pre_index data.
  • SP_Post $sp_post The SP_Post object.

Filter: sp_search_suggest_data

Filters the search suggest data (fields/content).

To filter any other characteristics of search suggest, use the sp_post_pre_index filter.

See https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-suggesters-completion.html

Params:

  • array $search_suggest_data Array of data for search suggesters completion. By default, this just includes the post_title.
  • array $data sp_post_pre_index data.
  • SP_Post $sp_post The SP_Post object.

Filter: sp_search_suggest_query

Filter the raw search suggest query.

Params:

  • array $request Search suggest query.

Filter: sp_search_suggest_results

Filter the raw search suggest options.

Params:

  • array $options Search suggest options.
  • array $results Search suggest raw results.
  • string $fragment Search fragment producing the results.
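
As an illustration, the sketch below trims each option down to the two values a typeahead widget usually needs. It assumes each option is an array shaped like the example response near the top of this page (text plus a _source containing permalink); the label and url keys are hypothetical names chosen for this example.

```php
// Sketch only: reduce each suggestion to a label and a URL.
// Assumption: options look like the example response on this page.
add_filter(
	'sp_search_suggest_results',
	function ( $options ) {
		if ( ! is_array( $options ) ) {
			return $options;
		}

		return array_map(
			function ( $option ) {
				return array(
					// 'label' and 'url' are hypothetical key names.
					'label' => isset( $option['text'] ) ? $option['text'] : '',
					'url'   => isset( $option['_source']['permalink'] ) ? $option['_source']['permalink'] : '',
				);
			},
			array_values( $options )
		);
	}
);
```

The callback only reshapes data Elasticsearch already returned, so it adds no database lookups at search time.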

Filter: sp_enable_search_suggest

Checks if search suggestions are enabled. If true, adds the config to the mapping. If you'd like to edit it, use the sp_config_mapping filter.

Note that this will only work on ES 5.0 or later.

See https://www.elastic.co/guide/en/elasticsearch/reference/7.6/search-suggesters.html#completion-suggester

Params:

  • boolean $enabled Enabled if true, disabled if false. Defaults to false.