Get Elasticsearch up and running

Schett Nico edited this page Jan 12, 2021 · 6 revisions

To take Elasticsearch for a test drive, you can create a hosted deployment on the Elasticsearch Service or set up a multi-node Elasticsearch cluster on your own Linux, macOS, or Windows machine.

  • Run Elasticsearch on Elastic Cloud
  • Run Elasticsearch locally on Linux, macOS, or Windows
  • Run Elasticsearch in a container

This guide covers the container setup and follows the recommended practices for it.

Compose

Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services.

Using Compose is basically a three-step process:

  1. Define your app’s environment with a Dockerfile so it can be reproduced anywhere.
  2. Define the services that make up your app in docker-compose.yml so they can be run together in an isolated environment.
  3. Run docker-compose up and Compose starts and runs your entire app.

The docker-compose.yml for Elasticsearch integration in Django looks like this:

version: '3.1'

services:

#> Elasticsearch
    es01:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.10.1
        container_name: es01
        environment: 
            - node.name=es01
            - discovery.type=single-node
            - action.auto_create_index=true
        ports:
            - "9200:9200"
            - "9300:9300"

#> PostgreSQL
    db:
        image: postgres:12-alpine
        environment:
            - POSTGRES_DB=app_db
            - POSTGRES_USER=app_user
            - POSTGRES_PASSWORD=changeme
        restart: always

#> Django
    app:
        build: .
        environment:
            - "DJANGO_SECRET_KEY=changeme"
            - "DATABASE_URL=postgres://app_user:changeme@db/app_db"
        links:
            - "db:db"
        ports:
            - "50000:8000/tcp"
        depends_on:
            - "db"
            - "es01"

# SPDX-License-Identifier: (EUPL-1.2)
# Copyright © 2021 Nico Schett

The service es01 defines a single-node cluster. Setting up a multi-node Elasticsearch cluster is described in the documentation at elastic.co.
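Before pointing Django at the node, it can help to confirm that Elasticsearch actually answers. The following is a minimal sketch, assuming port 9200 is published on localhost as in the Compose file and using Elasticsearch's _cluster/health endpoint. The function names fetch_health and wait_for_elasticsearch are hypothetical and not part of the project.

```python
import json
import time
import urllib.request


def fetch_health(base_url="http://localhost:9200"):
    """Return the parsed JSON body of GET <base_url>/_cluster/health."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def wait_for_elasticsearch(fetch=fetch_health, attempts=30, delay=2.0):
    """Poll until the cluster reports yellow or green; return True on success.

    A single-node cluster commonly stays "yellow" because replica shards
    cannot be allocated on the same node, so yellow is accepted here.
    """
    for _ in range(attempts):
        try:
            if fetch().get("status") in ("yellow", "green"):
                return True
        except (OSError, ValueError):
            pass  # node not reachable yet, or the response was not valid JSON
        time.sleep(delay)
    return False
```

Passing the fetch function as a parameter keeps the polling logic testable without a running node.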

Configure Django Settings

Ohrwurm is based on Wagtail CMS and includes Wagtailsearch. Wagtailsearch has support for multiple backends, giving you the choice between using the database for search or an external service such as Elasticsearch. The database backend is enabled by default.

When switching to Elasticsearch as the search backend, settings.py must be adjusted accordingly:

# > Search Configuration
# https://docs.wagtail.io/en/latest/topics/search/backends.html
WAGTAILSEARCH_BACKENDS = {
    'default': {
        'BACKEND': 'wagtail.search.backends.elasticsearch7',
        'URLS': ['http://es01:9200'],
        'INDEX': 'esite',
        'TIMEOUT': 5,
        'OPTIONS': {},
        'INDEX_SETTINGS': {},
    }
}
# > Activate Search for Bifrost API
BIFROST_ADD_SEARCH_HIT = True

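The hostname es01 only resolves inside the Compose network, so es01:9200 works only when Django itself runs as a container (as in production). When Django runs directly on the host during development, use localhost:9200 instead, which matches the port published in docker-compose.yml.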
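One way to avoid editing settings.py when moving between the two environments is to read the URL from an environment variable. This is a minimal sketch: the variable name ELASTICSEARCH_URL and the helper elasticsearch_urls are assumptions for illustration and do not exist in the original setup.

```python
import os


def elasticsearch_urls(environ=os.environ):
    """Return the URL list for WAGTAILSEARCH_BACKENDS['default']['URLS'].

    ELASTICSEARCH_URL is a hypothetical variable name; when it is not set,
    the development default localhost:9200 is used.
    """
    return [environ.get("ELASTICSEARCH_URL", "http://localhost:9200")]
```

In settings.py this could be wired in as 'URLS': elasticsearch_urls(), with ELASTICSEARCH_URL=http://es01:9200 added to the environment of the app service.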

Django Model Indexing

To make a model searchable, you need to add it to the search index. All pages, images, and documents are indexed for you, so you can start searching them right away. A custom model that does not derive from a Wagtail built-in such as Page becomes indexable by inheriting from index.Indexed and declaring search_fields:

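from django.conf import settings
from django.contrib.auth import get_user_model
from django.db import models
from modelcluster.fields import ParentalManyToManyField
from modelcluster.models import ClusterableModel
from wagtail.search import index

# TimeStampMixin is a project-specific mixin; import it from its module in your project.


class ProjectAudioChannel(index.Indexed, ClusterableModel, TimeStampMixin):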
    title = models.CharField(null=True, blank=False, max_length=250)
    description = models.TextField(null=True, blank=True)
    channel_id = models.CharField(null=True, blank=True, max_length=250)
    avatar_image = models.ForeignKey(
        settings.WAGTAILIMAGES_IMAGE_MODEL,
        null=True,
        blank=True,
        related_name="+",
        on_delete=models.SET_NULL,
    )
    # null has no effect on many-to-many fields, so only blank is set.
    members = ParentalManyToManyField(
        get_user_model(), related_name="pacs", blank=True
    )

    search_fields = [
        index.SearchField("title"),
        index.SearchField("created_at"),
        index.SearchField("description"),
        index.FilterField('snekuser_id'),
    ]

    ...
    
    def snekuser_id(self):
        """
        Adds all of our Members' snekuser_ids to the search filter list.
        Ref: https://www.thetopsites.net/article/58271827.shtml
        """
        return list(self.members.all().values_list('id', flat=True))

Sync index with database (Elasticsearch - Django/Postgres)

By default, Wagtail will automatically keep all indexes up to date. This could impact performance when editing content, especially if your index is hosted on an external service.

If you have disabled auto update, you must run the update_index command on a regular basis to keep the index in sync with the database.

 /venv/bin/python manage.py update_index
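Auto update is switched off per backend. The following is a minimal sketch of the relevant setting, assuming the Elasticsearch backend configuration from this guide and Wagtail's AUTO_UPDATE backend option; the other keys are repeated only to keep the fragment complete.

```python
WAGTAILSEARCH_BACKENDS = {
    "default": {
        "BACKEND": "wagtail.search.backends.elasticsearch7",
        "URLS": ["http://es01:9200"],
        "INDEX": "esite",
        # Stop Wagtail from updating the index on every save;
        # the index then only changes when update_index runs.
        "AUTO_UPDATE": False,
    }
}
```

With this in place, update_index has to be scheduled externally, for example from cron, or the search results will drift from the database.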

Search

Search can be accessed in two ways:

  1. Searching through the Bifrost API: http://localhost:8000/graphql

  2. Searching directly on the Elasticsearch node: http://localhost:9200
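For the direct route, a query can be sent to the node's _search endpoint. The following is a minimal sketch, assuming the node is reachable on localhost:9200 and searching across all indices with a simple_query_string query. The helper names build_search_request and search are hypothetical, and the structure of the documents that Wagtail stores in the index is not shown here.

```python
import json
import urllib.request


def build_search_request(text, base_url="http://localhost:9200"):
    """Build (but do not send) a POST request to the _search endpoint."""
    body = {"query": {"simple_query_string": {"query": text}}}
    return urllib.request.Request(
        f"{base_url}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def search(text, base_url="http://localhost:9200"):
    """Send the request and return the list of raw hits from the response."""
    request = build_search_request(text, base_url)
    with urllib.request.urlopen(request, timeout=5) as resp:
        return json.load(resp)["hits"]["hits"]
```

Calling search("some title") against a running node returns the matching documents as Elasticsearch stores them, which is useful for checking what update_index actually wrote.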

git checkout 028-add-basic-elasticsearch

Credits: elastic doc, wagtail doc