Using docker-compose, bring up a Senzing stack using RabbitMQ and a PostgreSQL database.
This repository illustrates a reference implementation of Senzing using RabbitMQ as the queue and PostgreSQL as the underlying database.
The instructions show how to set up a system that:
- Reads JSON lines from a file on the internet and sends each JSON line to a message queue via the Senzing stream-producer.
  - In this implementation, the queue is RabbitMQ.
- Reads messages from the queue and inserts them into Senzing via the Senzing stream-loader.
  - In this implementation, Senzing keeps its data in a PostgreSQL database.
- Reads information from Senzing via the Senzing API Server.
- Views resolved entities in a web app.
The following diagram shows the relationship of the docker containers in this docker composition. Arrows represent data flow.
- Preamble
- Related artifacts
- Expectations
- Prerequisites
- Demonstrate
- Cleanup
- Advanced
- Errors
- References
At Senzing, we strive to create GitHub documentation in a "don't make me think" style. For the most part, instructions are copy and paste. Whenever thinking is needed, it's marked with a "thinking" icon 🤔. Whenever customization is needed, it's marked with a "pencil" icon ✏️. If the instructions are not clear, please let us know by opening a new Documentation issue describing where we can improve. Now on with the show...
- 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps you'll need to make some choices. Perhaps it's an optional step.
- ✏️ - A "pencil" icon means that the instructions may need modification before performing.
- ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.
- Space: This repository and demonstration require 7 GB free disk space.
- Time: Budget 2 hours to get the demonstration up-and-running, depending on CPU and network speeds.
- Background knowledge: This repository assumes a working knowledge of:
The Git repository has files that will be used in the docker-compose command.
- Using these environment variable values:
    export GIT_ACCOUNT=senzing
    export GIT_REPOSITORY=docker-compose-demo
    export GIT_ACCOUNT_DIR=~/${GIT_ACCOUNT}.git
    export GIT_REPOSITORY_DIR="${GIT_ACCOUNT_DIR}/${GIT_REPOSITORY}"
- Follow steps in clone-repository to install the Git repository.
- ✏️ Specify the directory where Senzing should be installed on the local host. Example:
    export SENZING_VOLUME=~/my-senzing
  - ⚠️ macOS - File sharing must be enabled for SENZING_VOLUME.
  - ⚠️ Windows - File sharing must be enabled for SENZING_VOLUME.
- Identify directories on the local host. Example:
    export SENZING_DATA_DIR=${SENZING_VOLUME}/data
    export SENZING_DATA_VERSION_DIR=${SENZING_DATA_DIR}/2.0.0
    export SENZING_ETC_DIR=${SENZING_VOLUME}/etc
    export SENZING_G2_DIR=${SENZING_VOLUME}/g2
    export SENZING_VAR_DIR=${SENZING_VOLUME}/var
    export POSTGRES_DIR=${SENZING_VAR_DIR}/postgres
    export RABBITMQ_DIR=${SENZING_VAR_DIR}/rabbitmq
- Create directory for RabbitMQ persistence. Note: Although the RABBITMQ_DIR directory will have open permissions, the directories created within RABBITMQ_DIR will be restricted. Example:
    sudo mkdir -p ${RABBITMQ_DIR}
    sudo chmod 770 ${RABBITMQ_DIR}
🤔 Optional:
If you do not plan on using the senzing/sshd container, these ssh sections can be ignored.
Normally port 22 is already in use for ssh, so a different port may be needed by the running docker container.
- 🤔 See if port 22 is already in use. If it is not in use, the next 2 steps are optional. Example:
    sudo lsof -i -P -n | grep LISTEN | grep :22
- ✏️ Choose port for docker container. Example:
    export SENZING_SSHD_PORT=9181
- Construct parameter for docker run. Example:
    export SENZING_SSHD_PORT_PARAMETER="--publish ${SENZING_SSHD_PORT:-9181}:22"
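Choosing the port can also be scripted. The following is a minimal sketch, not part of this repository: the function names port_in_use and first_free_port are hypothetical, the range 9181-9190 is an arbitrary assumption, and the check relies on bash's /dev/tcp redirection. It only detects services that accept connections on 127.0.0.1, so the lsof check above remains the more thorough test.

```shell
# Hypothetical helper: returns 0 when something accepts TCP connections
# on the given port of 127.0.0.1.
# Assumption: the shell is bash, which provides the /dev/tcp redirection.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Hypothetical helper: prints the first port in the range [first, last]
# that is not in use. Returns 1 if every port in the range is taken.
first_free_port() {
  candidate="$1"
  last="$2"
  while [ "${candidate}" -le "${last}" ]; do
    if ! port_in_use "${candidate}"; then
      echo "${candidate}"
      return 0
    fi
    candidate=$((candidate + 1))
  done
  return 1
}

# Example usage (the port range is an assumption):
# export SENZING_SSHD_PORT="$(first_free_port 9181 9190)"
```

The connection attempt runs in a subshell so that a refused connection does not terminate the calling shell.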
🤔 Optional: The default password set for the sshd containers is senzingsshdpassword. However, this can be changed.
- ✏️ Set the SENZING_SSHD_PASSWORD variable to change the password to access the sshd container. Example:
    export SENZING_SSHD_PASSWORD=<Pass_You_Want>
To use the Senzing code, you must agree to the End User License Agreement (EULA).
- ⚠️ This step is intentionally tricky and not simply copy/paste. This ensures that you make a conscious effort to accept the EULA. Example:
    export SENZING_ACCEPT_EULA="<the value from this link>"
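Because the docker-compose commands in this demonstration run under sudo --preserve-env, only exported variables reach them. Before continuing, it can help to confirm that the variables from the steps above are exported and non-empty. The following is a minimal sketch, not part of this repository: the function name require_exported is hypothetical, and the list of variables in the usage comment is an assumption drawn from the preceding steps.

```shell
# Hypothetical helper: succeeds only if every named variable is
# exported and non-empty. Reports each missing variable on stderr.
require_exported() {
  missing=0
  for name in "$@"; do
    # printenv only reports variables that are exported to child processes.
    if [ -z "$(printenv "${name}")" ]; then
      echo "Not exported or empty: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example usage, assuming the variables set in the steps above:
# require_exported SENZING_ACCEPT_EULA SENZING_DATA_DIR SENZING_DATA_VERSION_DIR \
#   SENZING_ETC_DIR SENZING_G2_DIR SENZING_VAR_DIR POSTGRES_DIR RABBITMQ_DIR
```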
"latest" or "pinned" versions of containers can be used in the docker-compose formation.
The following steps pull either the pinned versions or the most recent latest versions.
- 🤔 Optional: Pin versions of docker images by setting environment variables. Example:
    source <(curl -X GET https://raw.githubusercontent.com/Senzing/knowledge-base/master/lists/docker-versions-latest.sh)
- Pull docker images. Example:
    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/senzing/docker-compose-senzing-installation.yaml pull
    sudo \
      --preserve-env \
      docker-compose --file resources/postgresql/docker-compose-rabbitmq-postgresql.yaml pull
- If Senzing has not been installed, install Senzing. Example:
    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/senzing/docker-compose-senzing-installation.yaml up
  - This will download and extract a 3GB file. It may take 5-15 minutes, depending on network speeds.
Senzing comes with a trial license that supports 100,000 records.
- 🤔 Optional: If more than 100,000 records are desired, see Senzing license.
- Launch docker-compose formation. Example:
    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/postgresql/docker-compose-rabbitmq-postgresql.yaml up
- Allow time for the components to come up and initialize.
  - There will be errors in some docker logs as they wait for dependent services to become available. docker-compose isn't the best at orchestrating docker container dependencies.
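Rather than watching the logs, the wait can be scripted by polling a service until it answers. The following is a minimal sketch, not part of this repository: the function name retry is hypothetical, and the attempt count and delay in the usage comment are arbitrary assumptions. The usage comment polls the Senzing REST API heartbeat at localhost:8250/heartbeat.

```shell
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, sleeping DELAY_SECONDS between attempts.
# Returns 1 once MAX_ATTEMPTS attempts have failed.
retry() {
  max_attempts="$1"
  delay_seconds="$2"
  shift 2
  attempt=1
  until "$@"; do
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "Gave up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${delay_seconds}"
  done
}

# Example usage (60 attempts, 5 seconds apart, are assumptions):
# retry 60 5 curl --silent --fail --output /dev/null http://localhost:8250/heartbeat
```

With curl's --fail option, an HTTP error response counts as a failed attempt, so the loop continues until the server responds successfully.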
Once the docker-compose formation is running, different aspects of the formation can be viewed.
Username and password for the following sites were either passed in as environment variables or are the default values seen in docker-compose-rabbitmq-postgresql.yaml.
- A good tool to monitor individual docker logs is Portainer. When running, Portainer is viewable at localhost:9170.
Instructions to use the senzing/sshd container are viewable in the senzing/docker-sshd repository.
- RabbitMQ is viewable at localhost:15672.
  - Defaults: username: user password: bitnami
- See additional tips for working with RabbitMQ.
- PostgreSQL is viewable at localhost:9171.
  - Defaults: username: postgres password: postgres
- See additional tips for working with PostgreSQL.
View results from Senzing REST API server. The server supports the Senzing REST API.
- OpenApi Editor is viewable at localhost:9180.
- Example Senzing REST API request: localhost:8250/heartbeat
- See additional tips for working with Senzing API server.
- Senzing Entity Search WebApp is viewable at localhost:8251.
- See additional tips for working with Senzing Entity Search WebApp.
- Change file permissions on PostgreSQL database. Example:
    sudo chmod -R 777 ${SENZING_VAR_DIR}/postgres
- Jupyter Notebooks are viewable at localhost:9178.
- See additional tips for working with Jupyter Notebooks.
The web-based Senzing X-term can be used to run Senzing command-line programs.
- Senzing X-term is viewable at localhost:8254.
- See additional tips for working with Senzing X-Term.
When the docker-compose formation is no longer needed, it can be brought down and directories can be deleted.
- Bring down docker formation. Example:
    cd ${GIT_REPOSITORY_DIR}
    sudo docker-compose --file resources/senzing/docker-compose-senzing-installation.yaml down
    sudo docker-compose --file resources/postgresql/docker-compose-rabbitmq-postgresql.yaml down
    sudo docker-compose --file resources/postgresql/docker-compose-rabbitmq-postgresql-again.yaml down
- Remove directories from host system. The following directories were created during the demonstration:
  - ${SENZING_VOLUME}
  - ${GIT_REPOSITORY_DIR}
  They may be safely deleted.
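When scripting the deletion, it is worth guarding against an unset variable, since an empty path passed to a recursive delete can be destructive. The following is a minimal sketch, not part of this repository: the function name safe_remove_dir is hypothetical. Files created by the containers may be owned by root, so the removal may need to be run with sudo.

```shell
# Hypothetical helper: removes a directory tree, but refuses an empty
# path or the filesystem root.
safe_remove_dir() {
  target="$1"
  if [ -z "${target}" ] || [ "${target}" = "/" ]; then
    echo "Refusing to remove '${target}'" >&2
    return 1
  fi
  if [ ! -d "${target}" ]; then
    # Nothing to do if the directory is already gone.
    echo "Not a directory, nothing to remove: ${target}" >&2
    return 0
  fi
  rm -rf -- "${target}"
}

# Example usage, assuming the variables from the earlier steps are still set:
# safe_remove_dir "${SENZING_VOLUME}"
# safe_remove_dir "${GIT_REPOSITORY_DIR}"
```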
The following topics discuss variations to the basic docker-compose demonstration.
🤔 Optional: After the launch and shutdown of the original docker formation, the docker formation can be brought up again without requiring initialization steps. The following shows how to bring up the prior docker formation again without initialization.
- Launch docker-compose formation. Example:
    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/postgresql/docker-compose-rabbitmq-postgresql-again.yaml up
This docker formation brings up the following docker containers:
- bitnami/rabbitmq
- dockage/phppgadmin
- postgres
- senzing/console
- senzing/entity-web-search-app
- senzing/init-container
- senzing/jupyter
- senzing/redoer
- senzing/senzing-api-server
- senzing/stream-loader
- senzing/stream-producer
Configuration values specified by environment variable or command line parameter.
- POSTGRES_DB
- POSTGRES_DIR
- POSTGRES_PASSWORD
- POSTGRES_USERNAME
- RABBITMQ_DIR
- RABBITMQ_PASSWORD
- RABBITMQ_USERNAME
- SENZING_ACCEPT_EULA
- SENZING_DATA_DIR
- SENZING_DATA_SOURCE
- SENZING_DATA_VERSION_DIR
- SENZING_ENTITY_TYPE
- SENZING_ETC_DIR
- SENZING_G2_DIR
- SENZING_VAR_DIR
- See docs/errors.md.