Apache Solr Setup Guide

Anthony Sena edited this page Nov 7, 2023 · 2 revisions

Apache Solr Setup Guide for OHDSI WebAPI

This guide describes how to set up and configure Apache Solr for use with the OHDSI WebAPI. Apache Solr is used to improve the performance of the vocabulary search capabilities of WebAPI/Atlas.

Install the Solr Windows Service

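We will use Apache Procrun to wrap Solr 8.11.2 in a Windows service so that startup and shutdown can be controlled like any other service. To do this, save the following script as service.bat in the bin directory of your Solr installation (E:\solr\solr-8.11.2\bin in this example). The script expects the Procrun executable prunsrv.exe to be in the same directory: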

@echo off
set SERVICE_NAME=Solr
set SERVICE_HOME=E:\solr\solr-8.11.2
set PR_INSTALL=%SERVICE_HOME%\bin\prunsrv.exe

@REM Service Log Configuration
set PR_LOGPREFIX=%SERVICE_NAME%
set PR_LOGPATH=%SERVICE_HOME%\logs
set PR_STDOUTPUT=auto
set PR_STDERROR=auto
set PR_LOGLEVEL=Debug

@REM Startup Configuration
set PR_STARTUP=auto
set PR_STARTMODE=exe
set PR_STARTIMAGE=%SERVICE_HOME%\bin\solr.cmd
set PR_STARTPARAMS=start

@REM Shutdown Configuration
set PR_STOPMODE=exe
set PR_STOPIMAGE=%SERVICE_HOME%\bin\solr.cmd
set PR_STOPPARAMS=stop;-all

%PR_INSTALL% //IS/%SERVICE_NAME% ^
  --Description="Apache Solr 8.11.2" ^
  --DisplayName="%SERVICE_NAME%" ^
  --Install="%PR_INSTALL%" ^
  --Startup="%PR_STARTUP%" ^
  --LogPath="%PR_LOGPATH%" ^
  --LogPrefix="%PR_LOGPREFIX%" ^
  --LogLevel="%PR_LOGLEVEL%" ^
  --StdOutput="%PR_STDOUTPUT%" ^
  --StdError="%PR_STDERROR%" ^
  --StartMode="%PR_STARTMODE%" ^
  --StartImage="%PR_STARTIMAGE%" ^
  ++StartParams="%PR_STARTPARAMS%" ^
  --StopMode="%PR_STOPMODE%" ^
  --StopImage="%PR_STOPIMAGE%" ^
  ++StopParams="%PR_STOPPARAMS%"

if not errorlevel 1 goto installed
echo Failed to install "%SERVICE_NAME%" service.  Refer to log in %PR_LOGPATH%
exit /B 1

:installed
echo The Service "%SERVICE_NAME%" has been installed
exit /B 0

NOTE: Adjust the SERVICE_HOME setting to match your install location.

Run a Windows Command Prompt in Administrator mode and then run E:\solr\solr-8.11.2\bin\service.bat. The service will be installed in the list of Windows Services as "Solr". Before moving forward, confirm the service is created but not running.

Creating the Solr core for WebAPI vocabulary search

NOTE: The name of the Solr core must match the vocabulary version you plan to use in ATLAS & WebAPI, with the space replaced by an underscore. For this example, the vocabulary version is "v5.0 17-JUN-19" and the corresponding core name, which is also the name of the folder that holds the core, is "v5.0_17-JUN-19". Verify your vocabulary version by running the following query on the CDM(s) you plan to use with WebAPI:

select vocabulary_version from vocabulary where vocabulary_id = 'None';

If your vocabulary version and core name do not match, WebAPI will not find the Solr core and will continue to query the database for vocabulary searches.
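The naming rule above can be sketched as a small shell helper. This is only an illustration: core_name_for is a hypothetical function name, not part of Solr or WebAPI, and it assumes the only transformation needed is replacing spaces with underscores, as in the example above.

```shell
#!/bin/sh
# core_name_for VERSION
# Prints the Solr core (and folder) name for a vocabulary version string
# by replacing every space with an underscore.
# Hypothetical helper for illustration; not part of Solr or WebAPI.
core_name_for() {
  printf '%s\n' "$1" | tr ' ' '_'
}

# Value as returned by the vocabulary_version query above.
core_name_for "v5.0 17-JUN-19"   # prints v5.0_17-JUN-19
```

Pass in the exact string returned by the vocabulary_version query so the core name stays in step with the CDM.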

Solr core creation

  • Verify that the JDBC driver JAR files for your RDBMS are located in E:\solr\solr-8.11.2\server\lib; otherwise you will run into problems when building the Solr core.
  • Create 2 directories for the core:
    • E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19
    • E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19\data
  • Copy the contents of WebAPI\src\main\resources\solr into E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19. Next make the following edits to the files in E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19:
    • data-config.xml: Edit the data source & references in the query to match the database holding your vocabulary
    • core.properties: Edit the name to match the directory: v5.0_17-JUN-19
    • conf\solrconfig.xml: Edit the connection information in the <requestHandler name="/dataimport" class="solr.DataImportHandler"> block so that it can connect to the database holding your vocabulary.

Building the Solr core

Next, start the Solr Windows service and verify that you can connect to http://localhost:8983. If the service fails to start, review the logs in E:\solr\solr-8.11.2\server\logs.

  • From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the v5.0_17-JUN-19 core from the dropdown, then select Dataimport in the sub-menu that appears and click the 'Execute' button. The service will build the core from the concepts returned by the query in data-config.xml.
  • Once the execution of the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. A sample query you can use is: query:metformin

WebAPI Configuration

In the settings.xml for WebAPI, add the following XML to your profile:

  <solr.endpoint>http://localhost:8983/solr</solr.endpoint>

Recompile and deploy WebAPI.war. Once deployed, verify that the SOLR service is available for search by going to the endpoint WebAPI/info. The output will look similar to this:

{
	"version": "2.8.0",
	"buildInfo": {
		"artifactVersion": "WebAPI 2.8.0-SNAPSHOT",
		"build": "NA",
		"timestamp": "Thu Dec 12 16:53:22 UTC 2019"
	},
	"configuration": {
		"security": {
			"enabled": true
		},
		"vocabulary": {
			"cores": [
				"v5.0 17-JUN-19"
			],
			"solrEnabled": true
		},
		"person": {
			"viewDatesPermitted": false
		},
		"heracles": {
			"smallCellCount": "5"
		}
	}
}

Note that in the JSON above, the vocabulary section shows that Solr is enabled and lists the core(s) that are available for search.
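The same check can be scripted. The sketch below reads the /info JSON on standard input and succeeds only when "solrEnabled" is true. The function name solr_enabled and the WebAPI base URL in the commented usage line are assumptions for illustration; substitute the address of your own deployment.

```shell
#!/bin/sh
# solr_enabled
# Reads a WebAPI /info JSON document on stdin and returns success only if
# it contains "solrEnabled": true. Uses grep so that no JSON tooling is
# required. Hypothetical helper for illustration.
solr_enabled() {
  grep -Eq '"solrEnabled"[[:space:]]*:[[:space:]]*true'
}

# Example usage (the host, port, and context path are assumptions):
# curl -fsS "http://localhost:8080/WebAPI/info" | solr_enabled \
#   && echo "Solr vocabulary search is enabled"
```

A pattern match is enough for a quick smoke test; a JSON-aware tool would be more robust if the output format changes.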

Install the Solr Service on RedHat (tested on v7)

Download the binary and install (sudo is needed)

cd /opt
# Download the binary (https://solr.apache.org/downloads.html) -- v8.11.1 tgz file

tar xzf solr-8.11.1.tgz solr-8.11.1/bin/install_solr_service.sh --strip-components=2
bash ./install_solr_service.sh solr-8.11.1.tgz

Solr core creation

  1. Verify that the JDBC driver JAR files for your RDBMS are located in /opt/solr/server/lib; otherwise you will run into problems when building the Solr core.

  2. Create 2 directories for the core:

  • /var/solr/data/v5.0_12-FEB-21
  • /var/solr/data/v5.0_12-FEB-21/data

  3. Copy the contents of WebAPI/src/main/resources/solr into /var/solr/data/v5.0_12-FEB-21.

  4. Next make the following edits to the files in /var/solr/data/v5.0_12-FEB-21:

  • data-config.xml: Edit the data source & references in the query to match the database holding your vocabulary
  • core.properties: Edit the name to match the directory (v5.0_12-FEB-21) and remove the "conf\\" prefix from the config and schema values.
  • conf/solrconfig.xml: Edit the connection information in the <requestHandler name="/dataimport" class="solr.DataImportHandler"> block so that it can connect to the database holding your vocabulary.
    • If using Spark, add this to the /dataimport block: <str name="autoCommit">true</str>
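The directory creation, copy, and core.properties edits above can be sketched as a shell function. This is a minimal sketch under stated assumptions: prepare_core is a hypothetical helper name, and it assumes core.properties contains name=, config=, and schema= lines, with the latter two carrying the "conf\\" prefix mentioned above. The data-config.xml and solrconfig.xml edits are database-specific and still have to be made by hand.

```shell
#!/bin/sh
# prepare_core TEMPLATE_DIR SOLR_DATA_DIR CORE_NAME
# Creates the core folder and its data subfolder, copies the WebAPI Solr
# template into it, then rewrites core.properties:
#   - name= is set to CORE_NAME
#   - a leading conf\\ is stripped from the config= and schema= values
# Hypothetical helper; the core.properties layout is an assumption.
prepare_core() {
  pc_template="$1"
  pc_core_dir="$2/$3"
  pc_props="$pc_core_dir/core.properties"

  mkdir -p "$pc_core_dir/data" || return 1
  cp -R "$pc_template/." "$pc_core_dir/" || return 1

  if [ ! -f "$pc_props" ]; then
    echo "missing $pc_props" >&2
    return 1
  fi

  # CORE_NAME is assumed not to contain '|' or '&' (sed metacharacters here).
  sed -e "s|^name=.*|name=$3|" \
      -e 's|^\(config=\)conf\\\\*|\1|' \
      -e 's|^\(schema=\)conf\\\\*|\1|' \
      "$pc_props" > "$pc_props.tmp" && mv "$pc_props.tmp" "$pc_props"
}

# Example usage, run from the folder that contains the WebAPI checkout:
# prepare_core WebAPI/src/main/resources/solr /var/solr/data v5.0_12-FEB-21
```

Files under /var/solr generally need to be readable and writable by the account that runs Solr (install_solr_service.sh creates a solr user by default), so adjust ownership after copying if you ran this as another user.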

Building the Solr core

Next, start the Solr service (systemctl start solr) and verify that you can connect to http://localhost:8983. If the service fails to start, review the logs in /var/solr/logs.
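Solr can take a few seconds to begin answering requests after the service starts. One way to script the connectivity check is a small retry helper. wait_for is a hypothetical name, and the attempt count and delay in the commented usage line are arbitrary choices rather than recommended values.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted, sleeping
# DELAY_SECONDS between tries. Returns 0 on success, 1 otherwise.
# Hypothetical helper for illustration.
wait_for() {
  wf_attempts="$1"
  wf_delay="$2"
  shift 2
  wf_try=1
  while [ "$wf_try" -le "$wf_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    wf_try=$((wf_try + 1))
    if [ "$wf_try" -le "$wf_attempts" ]; then
      sleep "$wf_delay"
    fi
  done
  return 1
}

# Example usage: poll Solr's system info endpoint for up to about a minute.
# wait_for 30 2 curl -fsS "http://localhost:8983/solr/admin/info/system" \
#   || echo "Solr did not respond; check /var/solr/logs" >&2
```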

  • From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the v5.0_12-FEB-21 core from the dropdown, then select Dataimport in the sub-menu that appears and click the 'Execute' button. The service will build the core from the concepts returned by the query in data-config.xml.
  • Once the execution of the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. A sample query you can use is to enter this in the q field: query:metformin
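On a headless server, the import can also be triggered over HTTP instead of through the Admin screen, because the Data Import Handler registered at /dataimport accepts a command parameter. The sketch below only builds the request URL. dataimport_url is a hypothetical helper name, and the core name is the example used in this section.

```shell
#!/bin/sh
# dataimport_url SOLR_BASE CORE COMMAND
# Prints the Data Import Handler URL for the given core and command
# (for example full-import or status).
# Hypothetical helper for illustration.
dataimport_url() {
  printf '%s/%s/dataimport?command=%s\n' "$1" "$2" "$3"
}

# Example usage (start the import, then poll its status):
# curl -fsS "$(dataimport_url http://localhost:8983/solr v5.0_12-FEB-21 full-import)"
# curl -fsS "$(dataimport_url http://localhost:8983/solr v5.0_12-FEB-21 status)"
```

The status response reports whether the import is still busy, which helps with large vocabularies where indexing takes a while.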

Then follow the rest of the instructions in the WebAPI Configuration section above.

Install SOLR using Docker

On Windows/Mac:

Download and install Docker Desktop (https://www.docker.com/products/docker-desktop).

On Debian/Ubuntu Linux:

sudo apt-get install docker.io

On RHEL:

sudo yum install docker

Solr core creation

  1. Create a directory to store the Dockerfile and configuration files. This directory is known as the Docker build context folder. Navigate to it in your command line interface.

  2. Create a file named "Dockerfile" in this build context folder and use this as the content:

FROM solr:8.11.1

# argument variables to define
ARG vocabulary_version
ARG jdbc_file_name

# copy the solr configset from WebAPI
COPY --chown=solr /WebAPI/src/main/resources/solr /var/solr/data/$vocabulary_version

# copy your JDBC file
COPY --chown=solr $jdbc_file_name /opt/solr-8.11.1/server/lib/$jdbc_file_name

  3. Clone the WebAPI Git repo (using your desired commit or release) into the docker build context folder:
git clone https://github.com/OHDSI/WebAPI.git
  4. Copy your JDBC jar file (for connecting to your vocabulary's database platform) into the build context folder.

  5. Next make the following edits to the files in WebAPI/src/main/resources/solr (relative to the build context folder):

  • data-config.xml: Edit the data source & references in the query to match the database holding your vocabulary
  • core.properties: Edit the name to match the vocabulary version (e.g. v5.0_20-MAY-21) and remove the "conf\\" prefix from the config and schema values.
  • conf/solrconfig.xml: Edit the connection information in the <requestHandler name="/dataimport" class="solr.DataImportHandler"> block so that it can connect to the database holding your vocabulary.
    • If using Spark, add this to the /dataimport block: <str name="autoCommit">true</str>
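Before running the build, it can help to confirm that the build context contains everything the Dockerfile copies. The sketch below checks for the Dockerfile, the cloned WebAPI Solr template, and the JDBC jar. check_build_context is a hypothetical helper name, and the jar file name in the commented usage line is only an example.

```shell
#!/bin/sh
# check_build_context CONTEXT_DIR JDBC_JAR_NAME
# Verifies that the Docker build context holds the files the Dockerfile
# expects. Prints each missing path to stderr and returns 1 if any is
# absent. Hypothetical helper for illustration.
check_build_context() {
  cb_ctx="$1"
  cb_status=0
  for cb_required in \
    "Dockerfile" \
    "WebAPI/src/main/resources/solr/core.properties" \
    "$2"
  do
    if [ ! -e "$cb_ctx/$cb_required" ]; then
      echo "missing: $cb_ctx/$cb_required" >&2
      cb_status=1
    fi
  done
  return "$cb_status"
}

# Example usage from inside the build context folder:
# check_build_context . SparkJDBC41.jar && echo "build context looks complete"
```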

Build/create/start the docker container

  1. Run the container build step, specifying the vocabulary version and the name of the JDBC file needed for connecting to your database platform:
docker build --build-arg vocabulary_version=v5.0_20-MAY-21 --build-arg jdbc_file_name=SparkJDBC41.jar --no-cache -t solr .
  2. Run the create step, which creates the container:
docker create --restart=always --name=solr -p 8983:8983 -t solr
  3. Start the container:
docker start solr
  4. Verify that the container has started by going to http://localhost:8983/solr/#/ (substitute the server name for localhost)

Building the Solr core

  • From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the vocabulary core from the dropdown, then select Dataimport in the sub-menu that appears and click the 'Execute' button. The service will build the core from the concepts returned by the query in data-config.xml.
  • Once the execution of the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. A sample query you can use is to enter this in the q field: query:metformin
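The sample query can also be run from the command line against the core's /select handler, which is convenient when Solr runs in a container on a remote host. The sketch below extracts the hit count from Solr's JSON response. num_found is a hypothetical helper name, and the core name in the commented usage line is the example used in this section.

```shell
#!/bin/sh
# num_found
# Reads a Solr JSON query response on stdin and prints the value of
# response.numFound. A simple sed pattern is used so that no JSON tooling
# is required. Hypothetical helper for illustration.
num_found() {
  sed -n 's/.*"numFound"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example usage (rows=0 returns only the count, not the documents):
# curl -fsS --get \
#   --data-urlencode 'q=query:metformin' \
#   --data-urlencode 'rows=0' \
#   "http://localhost:8983/solr/v5.0_20-MAY-21/select" | num_found
```

A count greater than zero suggests that the import populated the core; a count of zero usually means the import has not run or the query in data-config.xml returned no rows.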

Then, follow the rest of the WebAPI configuration instructions in the section above.