If you are beginning your journey with Senzing, please start with Senzing Quick Start guides.
You are in the Senzing Garage where projects are "tinkered" on. Although this GitHub repository may help you understand an approach to using Senzing, it's not considered to be "production ready" and is not considered to be part of the Senzing product. Heck, it may not even be appropriate for your application of Senzing!
The Neo4j connector is a Java application that gathers information from Senzing and maps it into a Neo4j graph database. The connector reads messages containing Senzing information from a message queue (RabbitMQ or AWS SQS), determines from that data which entities in the Senzing repository are affected, retrieves the entity data through the Senzing API, finds how those entities relate to other entities, and inserts the result into a Neo4j database.
Note that this connector does not load source records into the Neo4j database. It loads the Senzing entity information, and each entity can be constructed from multiple source records. If the source record data, and how it relates to the Senzing entities, is also wanted in the graph, it must be loaded into the database before the Senzing entities are loaded. In that case the records need to contain DATA_SOURCE and RECORD_ID fields matching those used in the Senzing repository, so that the Senzing entities can be linked back to the source system records.
The messages read from the message queue are in JSON format. An example looks like this:
{"DATA_SOURCE":"TEST","RECORD_ID":"RECORD3","AFFECTED_ENTITIES":[{"ENTITY_ID":1,"LENS_CODE":"DEFAULT"}]}
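As a rough illustration of the "which entities are affected" step, the sketch below pulls the `ENTITY_ID` values out of a message shaped like the example above. The class and method names are hypothetical and are not taken from the connector's source. A regular expression is used only to keep the example free of third-party libraries; real code would more likely use a proper JSON parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the connector's source.
class AffectedEntities {
    // Matches "ENTITY_ID": <integer>, allowing optional whitespace around the colon.
    private static final Pattern ENTITY_ID =
        Pattern.compile("\"ENTITY_ID\"\\s*:\\s*(\\d+)");

    // Returns every ENTITY_ID value found in the message, in order of appearance.
    static List<Long> entityIds(String message) {
        List<Long> ids = new ArrayList<>();
        Matcher matcher = ENTITY_ID.matcher(message);
        while (matcher.find()) {
            ids.add(Long.parseLong(matcher.group(1)));
        }
        return ids;
    }

    public static void main(String[] args) {
        String message = "{\"DATA_SOURCE\":\"TEST\",\"RECORD_ID\":\"RECORD3\","
            + "\"AFFECTED_ENTITIES\":[{\"ENTITY_ID\":1,\"LENS_CODE\":\"DEFAULT\"}]}";
        System.out.println(entityIds(message)); // prints [1]
    }
}
```

Each ID returned this way is what would then be looked up through the Senzing API before anything is written to Neo4j.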
This project provides a framework for mapping Senzing data to a Neo4j database, and it can be modified to fit the user's specific needs.
- 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps you'll need to make some choices. Perhaps it's an optional step.
- ✏️ - A "pencil" icon means that the instructions may need modification before performing.
- ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.
To build the Neo4j Connector you will need Apache Maven (version 3.6.1 or later recommended) as well as OpenJDK version 11.0.x (version 11.0.6+10 or later recommended).
This application interacts with the Senzing API, so the API must be installed beforehand. Information on how to install it can be found here: Senzing API quick start
- Set up your environment. The Connector relies on native libraries, and the environment must be set up so that those libraries can be found:
  - Linux
    export SENZING_G2_DIR=/opt/senzing/g2
    export LD_LIBRARY_PATH=${SENZING_G2_DIR}/lib:${SENZING_G2_DIR}/lib/debian:$LD_LIBRARY_PATH
  - Windows
    set SENZING_G2_DIR="C:\Program Files\Senzing\g2"
    set Path=%SENZING_G2_DIR%\lib;%Path%
- To build connector-neo4j:
git clone git@github.com:Senzing/connector-neo4j.git
cd connector-neo4j
mvn install
The JAR file will be in the `target` directory under the name `neo4j-connector-[version].jar`, where `[version]` is the version number from the `pom.xml` file.
In addition, `target/libs` will contain all the dependency JAR files needed by the application, and `target/conf/neo4jconnector.properties` holds the application's configuration. That file must be modified to match the installation of G2 and the other applications the Connector depends on; see "Edit configuration" below.
The Connector requires installations of Senzing API (see above), RabbitMQ and Neo4j for its operation.
Note: if docker containers are used it is best to use a docker network to facilitate communication between the containers. An example for setting up a network:
sudo docker network create -d bridge ncn
This network "ncn" will be used when dealing with containers in this write-up.
- Installing G2
  If not done already, see Dependencies above.
- Install Neo4j
  An easy way to install and run Neo4j is to run it as a Docker container:
  sudo docker run --detach \
    --publish=7474:7474 \
    --publish=7687:7687 \
    --volume=$HOME/neo4j/data:/data \
    --volume=$HOME/neo4j/logs:/logs \
    --network ncn \
    neo4j:latest
  Other ways to install and run Neo4j can be found here: Neo4j Installation.
  Once the installation is done, open `http://<server name>:7474` in a browser. If the installation is local, that would be `http://localhost:7474`. Log in using the default user name and password, which are neo4j/neo4j. You will be asked to change your password. Do so and remember the new password, since you will need it in the "Edit configuration" section below.
- Install RabbitMQ
  Again, running it as a Docker container is a simple option:
  sudo docker run -it --rm --name rabbitmq \
    --publish 5672:5672 \
    --publish 15672:15672 \
    --network ncn \
    rabbitmq:3-management
  If using an installer is preferred, please see Downloading and Installing RabbitMQ.
- 🤔 Optional: Create a queue in RabbitMQ
  The Connector will create the queue specified in the configuration if it doesn't already exist. If you want the queue created beforehand, here are the steps:
  - Open a browser and enter `http://<host name>:15672` into the address bar. If you installed locally, this will be `http://localhost:15672`.
  - Log in. The default is guest/guest on a fresh install.
  - Select the `Queues` tab at the top.
  - Click `Add a new queue` below the grid.
  - Enter `senzing` in the `Name` box.
  - For the `Durability` option, click the pull-down and select `Transient`.
  - Click the `Add Queue` button at the bottom.
- ✏️ Edit configuration
  There are two ways to pass configuration to the connector: through a configuration file and with command line parameters.
  Let's first look at the configuration file. It is found at `target/conf/neo4jconnector.properties`. The steps to set it up follow.
  1. Locate the G2 ini file. It can generally be found in the project path as `/home/<user>/senzing/etc/G2Module.ini`, where `user` is the user account. See the Quick Start Guide for further information.
  2. Open `target/conf/neo4jconnector.properties` in an editor.
  3. Change the value of `neo4jconnector.g2.inifile` to the path found in step 1 above.
  4. Change the `neo4jPassword` in `neo4jconnector.neo4j.uri` to the password you created in the "Install Neo4j" section above.
  5. Make any other changes needed. For example, if RabbitMQ was set up with user security, then the user name and password need to be set in the file.
  The command line takes the following options:
  - `-iniFile`: path to the G2 ini file
  - `-neo4jConnection`: connection string for Neo4j; the format is `bolt://<user>:<password>@<hostname>:<port>`
  - `-mqHost`: host name or IP address of the RabbitMQ server
  - `-mqUser`: RabbitMQ user name
  - `-mqPassword`: password for RabbitMQ
  - `-mqQueue`: the name of the RabbitMQ queue used for receiving messages
If both the configuration file and command line options are used, the command line options take precedence.
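One common way to get this kind of precedence in Java is to treat the file values as defaults and let command line values shadow them. The sketch below does that with `java.util.Properties`. It is an illustration only, not the connector's actual code: the class name is hypothetical, and the pairing of `-iniFile` with `neo4jconnector.g2.inifile` and of `-neo4jConnection` with `neo4jconnector.neo4j.uri` is an assumption based on the descriptions in this README.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration of "command line overrides file"; not the connector's code.
class EffectiveConfig {
    // Assumed pairing of command line options with property keys named in this README.
    private static final Map<String, String> OPTION_TO_KEY = Map.of(
        "-iniFile", "neo4jconnector.g2.inifile",
        "-neo4jConnection", "neo4jconnector.neo4j.uri");

    // Returns a view in which command line values shadow the file values.
    static Properties merge(Properties fileConfig, String[] args) {
        // fileConfig becomes the default layer; lookups fall back to it.
        Properties effective = new Properties(fileConfig);
        // Options are expected as "-name value" pairs; unknown options are ignored.
        for (int i = 0; i + 1 < args.length; i += 2) {
            String key = OPTION_TO_KEY.get(args[i]);
            if (key != null) {
                effective.setProperty(key, args[i + 1]);
            }
        }
        return effective;
    }
}
```

With this layering, `getProperty` returns the command line value when one was given and otherwise falls back to whatever the file supplied.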
To run the connector, use `java -jar`. It is assumed that your environment is properly configured as described in the "Dependencies" and "Preparation for running" sections above.
Type
java -jar neo4j-connector-[version].jar
where `[version]` is the version number from the `pom.xml` file.
If command line options are used it could look like this:
java -jar neo4j-connector-[version].jar \
-iniFile /home/user/senzing/etc/G2Module.ini \
-neo4jConnection bolt://neo4j:neo4jPassword@localhost:7687 \
-mqHost localhost \
-mqQueue senzing
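The `-neo4jConnection` value packs user, password, host and port into a single string, so a malformed value is easy to produce. The sketch below shows one way to take such a string apart with `java.net.URI` and check it before use. The class name and the validation rules are assumptions for illustration; the connector may handle the string differently.

```java
import java.net.URI;

// Hypothetical value holder for illustration; not part of the connector's source.
class BoltConnection {
    final String user;
    final String password;
    final String host;
    final int port;

    private BoltConnection(String user, String password, String host, int port) {
        this.user = user;
        this.password = password;
        this.host = host;
        this.port = port;
    }

    // Parses a string of the form bolt://<user>:<password>@<hostname>:<port>.
    static BoltConnection parse(String connection) {
        URI uri = URI.create(connection);
        if (!"bolt".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Expected the bolt:// scheme");
        }
        String userInfo = uri.getUserInfo();
        int separator = (userInfo == null) ? -1 : userInfo.indexOf(':');
        if (separator < 0 || uri.getHost() == null || uri.getPort() < 0) {
            throw new IllegalArgumentException(
                "Expected bolt://<user>:<password>@<hostname>:<port>");
        }
        // Everything before the first ':' is the user; the rest is the password.
        return new BoltConnection(
            userInfo.substring(0, separator),
            userInfo.substring(separator + 1),
            uri.getHost(),
            uri.getPort());
    }
}
```

For the example above, `BoltConnection.parse("bolt://neo4j:neo4jPassword@localhost:7687")` yields user `neo4j`, password `neo4jPassword`, host `localhost` and port `7687`. Passwords containing characters such as `@` or `/` would need to be percent-encoded for this kind of parsing to work.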
This repository and demonstration require 6 GB free disk space.
Budget 40 minutes to get the demonstration up-and-running, depending on CPU and network speeds.
This repository assumes a working knowledge of:
Configuration values specified by environment variable or command line parameter.
The following software programs need to be installed:
For more information on environment variables, see Environment Variables.
- Set these environment variable values:
  export GIT_ACCOUNT=senzing
  export GIT_REPOSITORY=connector-neo4j
  export GIT_ACCOUNT_DIR=~/${GIT_ACCOUNT}.git
  export GIT_REPOSITORY_DIR="${GIT_ACCOUNT_DIR}/${GIT_REPOSITORY}"
- Follow steps in clone-repository to install the Git repository.
- Build docker image.
  cd ${GIT_REPOSITORY_DIR}
  sudo make docker-build
  Note: `sudo make docker-build-development-cache` can be used to create cached docker layers.
- Prepare for running.
  Ensure the steps in Preparation for running have been executed before running the docker container.
- Run docker container.
  ⚠️ macOS - File sharing must be enabled for the volumes.
  ⚠️ Windows - File sharing must be enabled for the volumes.
  When running the docker container, the command line options need to be used.
  Example:
  sudo docker run \
    --network ncn \
    senzing/connector-neo4j \
    -iniFile /home/user/senzing/etc/G2Module.ini \
    -neo4jConnection bolt://neo4j:neo4jPassword@localhost:7687 \
    -mqHost localhost \
    -mqQueue senzing