Add Docker files to run self-contained image for trying out Skosmos #962

Merged
merged 30 commits into from
Mar 15, 2021
Commits
c1ea002
Add Docker files to run self-contained image for trying out Skosmos
kinow Mar 25, 2020
ab75015
Use a name different than config.ttl for Docker config.ttl file (avoi…
kinow Apr 7, 2020
0341ee0
Use localhost:9030 instead of 3030 to avoid conflicts in the host mac…
kinow May 1, 2020
670ba6e
Add PHP APC module
kinow May 1, 2020
06bc46b
Use tdb2 in example command to create dataset as in the tutorial
kinow May 1, 2020
380f57c
Create skosmos dataset automatically
kinow May 1, 2020
bfa048a
Use UTC in PHP ini settings
kinow May 1, 2020
4a05865
Fix Codacy issues
kinow May 2, 2020
cc95b85
Add Unesco and SWT Thesauri
kinow May 2, 2020
7d39bc9
Use Jena Text
kinow May 2, 2020
53daa92
Replace existing Dockerfile and docker-compose files
kinow May 8, 2020
21614ec
Add varnish cache in a separate container
kinow May 9, 2020
7934a7a
Remove license from Docker README
kinow Mar 2, 2021
19d6f90
Update list of languages
kinow Mar 3, 2021
f1a813d
Remove unnecessary step that created a dataset (can simply upload data)
kinow Mar 3, 2021
cd23eb4
Remove commented part about installing Skosmos from tar.gz archive
kinow Mar 3, 2021
504880a
Install from git repository instead of targz file
kinow Mar 3, 2021
6f6d786
Add labels, skosmos version, healthcheck, and expose port
kinow Mar 3, 2021
9fb69d1
Fix directories in Docker configuration (matching what we already hav…
kinow Mar 3, 2021
2adf286
Enable language dropdown
kinow Mar 3, 2021
b69385c
Correct host (not container) hostname of fuseki URL
kinow Mar 3, 2021
4f87bd1
Use COPY to copy the checked out git code, instead of installing from…
kinow Mar 3, 2021
6e501b3
Use localhost for Docker example, avoid using Finto endpoint, and use…
kinow Mar 4, 2021
8238e3d
Remove comment from Docker Compose file about loading data
kinow Mar 4, 2021
d1fdb67
Update dockerfiles README.md, with new instructions for Docker and
kinow Mar 4, 2021
f94b074
Use markdown indented code block syntax
kinow Mar 10, 2021
ec2d6fc
Document that users must be in the dockerfiles/ directory when runnin…
kinow Mar 10, 2021
c99d8bc
Add note about the Fuseki URL for docker-compose
kinow Mar 10, 2021
e9446b9
Update to PHP 7.4 and Ubuntu 20.04
kinow Mar 10, 2021
10323ad
Add sentence explaining the --net=host and port 80 use
kinow Mar 11, 2021
11 changes: 0 additions & 11 deletions Dockerfile

This file was deleted.

30 changes: 0 additions & 30 deletions docker-compose.yml

This file was deleted.

79 changes: 79 additions & 0 deletions dockerfiles/Dockerfile.ubuntu
@@ -0,0 +1,79 @@
FROM ubuntu:20.04

LABEL maintainer="National Library of Finland"
LABEL version="0.1"
LABEL description="A Docker image for Skosmos with Apache httpd."

ARG SKOSMOS_VERSION=v.2.9

ARG DEBIAN_FRONTEND=noninteractive

# git is necessary for some composer packages e.g. davidstutz/bootstrap-multiselect
# gettext is necessary as php-gettext was available in 18.04, but not in 20.04
RUN apt-get update && apt-get install -y \
apache2 \
curl \
gettext \
git \
libapache2-mod-php7.4 \
locales \
php7.4 \
php7.4-curl \
php7.4-xsl \
php7.4-intl \
php7.4-mbstring \
php-apcu \
php-zip \
unzip \
&& rm -rf /var/lib/apt/lists/*

# https://stackoverflow.com/a/28406007
# fixes warnings like perl: warning: Setting locale failed.
RUN sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen && \
locale-gen \
ar_AE.utf8 \
da_DK.utf8 \
de_DE.utf8 \
en_GB.utf8 \
en_US.utf8 \
es_ES.utf8 \
fa_IR.utf8 \
fi_FI.utf8 \
fr_FR.utf8 \
it_IT.utf8 \
nb_NO.utf8 \
nl_NL.utf8 \
nn_NO.utf8 \
pl_PL.utf8 \
pt_PT.utf8 \
pt_BR.utf8 \
ru_RU.utf8 \
sv_SE.utf8 \
zh_CN.utf8
ENV LANGUAGE=en_US:en
ENV LC_ALL=en_US.UTF-8
ENV LANG=en_US.UTF-8

# timezone
RUN sed -i 's/;date.timezone =/date.timezone = "UTC"/g' /etc/php/7.4/apache2/php.ini

COPY dockerfiles/config/000-default.conf /etc/apache2/sites-available/000-default.conf

RUN a2enmod rewrite
RUN a2enmod expires

RUN echo "ServerName localhost" >> /etc/apache2/apache2.conf

WORKDIR /var/www/html
RUN rm index.html

COPY . /var/www/html

# Configure Skosmos
COPY dockerfiles/config/config-docker.ttl /var/www/html/config.ttl

HEALTHCHECK --interval=5s --timeout=3s --retries=3 CMD curl -f http://localhost || exit 1

EXPOSE 80

CMD ["/usr/sbin/apache2ctl", "-D", "FOREGROUND"]
113 changes: 113 additions & 0 deletions dockerfiles/README.md
@@ -0,0 +1,113 @@
Dockerfiles for Skosmos.

## Running with Docker

The following commands build the image, tag it as `skosmos:test`,
and run the container. The container name is `skosmos-web`, but you can customize
the name and other flags as necessary. The container listens on port
`80` on both the host and the container because we use `--net=host` to allow the
Skosmos application to access Fuseki at `http://localhost:3030`. This means that
nothing else on the host (such as another Apache) can be listening on port `80`.
You are free to modify the command line, `Dockerfile.ubuntu`, and the
configuration files if you would like to deploy it differently.

    # NOTE: the container copies the project sources during build, so the
    # context must be the parent directory, i.e. you MUST build the image
    # from the Skosmos source directory, not from $sources/dockerfiles/
    docker build -t skosmos:test . -f dockerfiles/Dockerfile.ubuntu
    docker run -d --rm --name skosmos-web --net=host skosmos:test

Now Skosmos should be available at `http://localhost/`. See the
Member

Using port 80 on the host is a bit awkward if you already have, say, Apache listening there (which is normally the case for me). Wasn't there a -p option for the docker run in an earlier version of this README that mapped the container port 80 to something else? I think it would be better to expose this as e.g. port 9090, as is done in the docker-compose setup.

Collaborator Author

@kinow kinow Mar 10, 2021

Yes, but not if you want the container to access host URLs like localhost:3030 (localhost being the host not container with skosmos)

There are alternatives though

  • use a value like fuseki:3030 and ask user to link containers or create a network (I think if we used fuseki:3030, that'd also mean we can unify the dockerfiles/config/config-*.ttl files into a single one)
  • include fuseki in the image (would need 2 Dockerfiles, so docker compose or AWS etc can deploy separately)
  • there is another docker image that does this kind of host-container port mapping (from host to container). I haven't used it, but it'd be 1 more image for the standalone container

WDYT?

Member

Ah, now I see - you have to use --net=host to make the container share the host network (so that it can access Fuseki on localhost) instead of sitting on a separate bridged network. So the Apache port inside the container (80) has to match what is seen on the outside, there is no room for a mapping layer.

I think this is OK, as long as it's documented - the README has to say that you can't have anything else listening on port 80 (such as Apache).

It should also be possible to switch the Apache in the container to another port by changing the Listen directive, but maybe that's not a great idea since 80 is the expected, standard port number.

I did find a discussion about accessing services on the host which mentioned the possibility of using host.docker.internal but apparently this only works on macOS and Windows, and the most recent versions of Docker on Linux (version 20.10.0 released 2020-12-08 included moby/moby#40007). So that could be an option as well, but requires a fairly recent Docker installation - the versions that come pre-packaged with distros are usually older.

Collaborator Author

> Ah, now I see - you have to use --net=host to make the container share the host network (so that it can access Fuseki on localhost) instead of sitting on a separate bridged network. So the Apache port inside the container (80) has to match what is seen on the outside, there is no room for a mapping layer.

Exactly.

> I think this is OK, as long as it's documented - the README has to say that you can't have anything else listening on port 80 (such as Apache).

Added the following.

diff --git a/dockerfiles/README.md b/dockerfiles/README.md
index b09a1705..8d0e0ed5 100644
--- a/dockerfiles/README.md
+++ b/dockerfiles/README.md
@@ -4,7 +4,11 @@ Dockerfiles for Skosmos.
 
 The following commands will build and tag the image it with `skosmos:test`,
 and run the container. The container name is `skosmos-web`, but you can customize
-the name, port, and other flags as necessary.
+the name, and other flags as necessary. The container will listen to port
+`80` at both the host and container since we are using `--net=host` to allow the
+Skosmos application to access Fuseki at `http://localhost:3030`. You are free to
+modify the command line and the `Dockerfile.ubuntu` and configuration files if you
+would like to deploy it differently.
 
     # NOTE: the container copies the project sources during build, so the
     # context must be the parent directory, i.e. you MUST build the image

> It should also be possible to switch the Apache in the container to another port by changing the Listen directive, but maybe that's not a great idea since 80 is the expected, standard port number.

I think anyone that wants to deploy Skosmos to AWS/Google/Azure/etc should be able to understand what we did here, and modify the Dockerfile and configuration files to match their network/volume/processing requirements.

> I did find a discussion about accessing services on the host which mentioned the possibility of using host.docker.internal but apparently this only works on macOS and Windows, and the most recent versions of Docker on Linux (version 20.10.0 released 2020-12-08 included moby/moby#40007). So that could be an option as well, but requires a fairly recent Docker installation - the versions that come pre-packaged with distros are usually older.

I think we went down the same rabbit hole here 😀 I tried the host.docker.internal but that didn't work, then found an issue on github to add it to Linux Docker. We can review it later after new versions of Docker are released, or if a user or dev has feedback on how to improve it I think.

Thanks @osma!
Bruno

[section below](#loading-vocabulary-data) to load vocabulary data.

**NOTE**: the Skosmos instance configured in this example setup expects the Fuseki
backend to support the "JenaText" dialect, to have a dataset named "skosmos" loaded
with the vocabulary data, and to be available at `http://localhost:3030`.
To meet this last requirement you must create a
[Docker network](https://docs.docker.com/network/network-tutorial-standalone/),
use [`--net=host`](https://docs.docker.com/network/host/), or use another mechanism.
See the section [Running with docker-compose](#running-with-docker-compose)
if you would like to use Docker Compose.
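
As a rough sketch of the alternatives named in the note, the helpers below print `docker run` networking flags for two setups. The function names and the `bridge` mode are illustrative assumptions and are not part of this pull request. The `bridge` variant relies on `--add-host=host.docker.internal:host-gateway`, which on Linux needs Docker 20.10.0 or newer, and it would also require a `config.ttl` whose endpoint points at `http://host.docker.internal:3030` rather than `http://localhost:3030`.

```shell
# Hypothetical helpers, not part of the Skosmos repository.

# Succeeds if the given Docker version (e.g. "20.10.0") is at least 20.10,
# the first release that understands the special "host-gateway" value.
docker_supports_host_gateway() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 20 ] || { [ "$major" -eq 20 ] && [ "$minor" -ge 10 ]; }
}

# Prints the networking flags to pass to `docker run`.
#   host   : share the host network, so Apache occupies port 80 on the host
#   bridge : publish container port 80 on host port 9090 and make the host
#            reachable from the container as host.docker.internal
skosmos_network_flags() {
    case "$1" in
        host)   echo "--net=host" ;;
        bridge) echo "-p 9090:80 --add-host=host.docker.internal:host-gateway" ;;
        *)      echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

For example, `docker run -d --rm --name skosmos-web $(skosmos_network_flags host) skosmos:test` is equivalent to the command shown earlier. The Docker server version to check can be obtained with `docker version --format '{{.Server.Version}}'`.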

To stop the container:

    docker stop skosmos-web

The image is based on the project's
[Install Tutorial](https://github.com/NatLibFi/Skosmos/wiki/InstallTutorial),
so the resulting container has Ubuntu, Apache2, PHP, composer, and a version
of Skosmos.

The Apache virtual host configuration is located at `config/000-default.conf`, and
the configuration file used for Skosmos is at `config/config-docker.ttl`. Customize
these two files as necessary.

**NOTE**: If you would like to start a Fuseki container to test with Docker only,
without Docker Compose, you can try the following command before loading your
vocabulary data. It starts a container in the same way as our other example,
which uses the `docker-compose` command.

    docker run --name fuseki -ti --rm \
        --env "ADMIN_PASSWORD=admin" --env "JVM_ARGS=-Xmx2g" \
        -p 3030:3030 \
        --mount "type=bind,src=$(pwd)/config/skosmos.ttl,dst=/fuseki/configuration/skosmos.ttl" \
        stain/jena-fuseki

## Running with docker-compose

The provided `docker-compose` configuration prepares three containers.
The first one is called `skosmos-fuseki`. It uses the `stain/jena-fuseki`
image for Jena and starts with 2 GB of memory and `admin` as
the user and password. The `docker-compose` service name of this container
is `fuseki`.

The second container is `fuseki-cache`, a Varnish Cache container. It sits
between `skosmos-fuseki` and `skosmos-web` (more on this below). The
Varnish Cache container is pre-configured to intercept queries to `fuseki:3030`
and keep the gzipped results in the cache for one week.

The last container created is `skosmos-web`, using the same image mentioned
in the [previous section](#running-with-docker). The only difference is
that we bind a different Skosmos configuration, `config/config-docker-compose.ttl`,
to `/var/www/html/config.ttl`.

This `config-docker-compose.ttl` file uses `http://fuseki-cache:80/skosmos/sparql`
as `skosmos:sparqlEndpoint`, forcing `skosmos-web` to go through `fuseki-cache`
for better performance. You can customize this example setup to start Skosmos
pointing to any other existing Apache Jena server, preferably with the Jena Text
extension.
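
One way to point the example configuration at a different server is to rewrite the endpoint URL, which appears both as `skosmos:sparqlEndpoint` and as each vocabulary's `void:sparqlEndpoint`. The following is a minimal sketch, assuming the URL is written between angle brackets as in `config-docker-compose.ttl`; the function name is hypothetical and the replacement URL in the usage note is a placeholder.

```shell
# Hypothetical helper, not part of the Skosmos repository.
# Rewrites every occurrence of one SPARQL endpoint URL in a Skosmos config
# and prints the result to stdout (the input file is left unchanged).
# Usage: replace_sparql_endpoint FILE OLD_URL NEW_URL
replace_sparql_endpoint() {
    # Escape regex metacharacters so the old URL is matched literally.
    old=$(printf '%s' "$2" | sed 's/[][\.*^$|]/\\&/g')
    # Escape characters that are special on the replacement side.
    new=$(printf '%s' "$3" | sed 's/[\&|]/\\&/g')
    sed "s|<$old>|<$new>|g" "$1"
}
```

For example, `replace_sparql_endpoint config/config-docker-compose.ttl http://fuseki-cache:80/skosmos/sparql http://example.org:3030/skosmos/sparql > my-config.ttl` writes a copy that bypasses the cache; `example.org` and `my-config.ttl` stand in for your own server and file name.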

**NOTE**: `fuseki:3030` and `fuseki-cache:80` are addresses on the internal Docker
network. On the host machine, Docker Compose exposes them as `localhost:9030`
and `localhost:9031` respectively.

To create the containers in this example setup, you can use this command
from the `./dockerfiles/` directory:

    docker-compose up -d

Now Skosmos should be available at `http://localhost:9090/` from your
host. See the [section below](#loading-vocabulary-data) to load vocabulary data.

To stop:

    docker-compose down

## Loading vocabulary data

After you have your container running, with either Docker or `docker-compose`,
you will need to load your vocabulary data.

**NOTE**: In the example below, we use the Fuseki URL `localhost:3030`, which
should work for the Docker setup. If you used `docker-compose`, you will have
to use `localhost:9030` instead.

    # load STW vocabulary data
    curl -L -o stw.ttl.zip http://zbw.eu/stw/version/latest/download/stw.ttl.zip
    unzip stw.ttl.zip
    curl -I -X POST -H Content-Type:text/turtle -T stw.ttl -G http://localhost:3030/skosmos/data --data-urlencode graph=http://zbw.eu/stw/
    # load UNESCO vocabulary data
    curl -L -o unescothes.ttl http://skos.um.es/unescothes/unescothes.ttl
    curl -I -X POST -H Content-Type:text/turtle -T unescothes.ttl -G http://localhost:3030/skosmos/data --data-urlencode graph=http://skos.um.es/unescothes/

After you execute these commands successfully, you should be able to use all the
features of Skosmos, such as browsing vocabularies and concepts.
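
Because the Fuseki URL differs between the two setups, it can help to wrap the upload in a small function. The sketch below reuses the `curl` flags from the commands above; the function names are hypothetical, and the `compose` port assumes the `localhost:9030` mapping mentioned in the note.

```shell
# Hypothetical helpers, not part of the Skosmos repository.

# Prints the Fuseki base URL as seen from the host for a given setup.
fuseki_base_url() {
    case "$1" in
        docker)  echo "http://localhost:3030" ;;
        compose) echo "http://localhost:9030" ;;
        *)       echo "unknown setup: $1" >&2; return 1 ;;
    esac
}

# Prints the graph store URL of the "skosmos" dataset.
graph_store_url() {
    base=$(fuseki_base_url "$1") || return 1
    echo "$base/skosmos/data"
}

# Uploads a Turtle file into a named graph.
# Usage: load_vocabulary SETUP FILE GRAPH_URI
load_vocabulary() {
    url=$(graph_store_url "$1") || return 1
    curl -I -X POST -H Content-Type:text/turtle -T "$2" -G "$url" \
        --data-urlencode "graph=$3"
}
```

With these defined, `load_vocabulary compose stw.ttl http://zbw.eu/stw/` would load the STW data into a `docker-compose` setup.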
15 changes: 15 additions & 0 deletions dockerfiles/config/000-default.conf
@@ -0,0 +1,15 @@
<VirtualHost *:80>
#ServerName www.example.com
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html
#LogLevel info ssl:warn
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
ErrorLog /proc/self/fd/2
CustomLog /proc/self/fd/1 combined
<Directory /var/www/html>
Options Indexes FollowSymLinks MultiViews
AllowOverride All
Order allow,deny
allow from all
</Directory>
</VirtualHost>
110 changes: 110 additions & 0 deletions dockerfiles/config/config-docker-compose.ttl
@@ -0,0 +1,110 @@
@prefix void: <http://rdfs.org/ns/void#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix wv: <http://vocab.org/waiver/terms/norms> .
@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix skosmos: <http://purl.org/net/skosmos#> .
@prefix isothes: <http://purl.org/iso25964/skos-thes#> .
@prefix mdrtype: <http://publications.europa.eu/resource/authority/dataset-type/> .
@prefix : <#> .

# Skosmos main configuration

:config a skosmos:Configuration ;
# SPARQL endpoint
# a local Fuseki server is usually on localhost:3030
skosmos:sparqlEndpoint <http://fuseki-cache:80/skosmos/sparql> ;
# use the dev.finto.fi endpoint where the example vocabularies reside
# skosmos:sparqlEndpoint <http://api.dev.finto.fi/sparql> ;
# sparql-query extension, or "Generic" for plain SPARQL 1.1
# set to "JenaText" instead if you use Fuseki with jena-text index
skosmos:sparqlDialect "JenaText" ;
# whether to enable collation in sparql queries
skosmos:sparqlCollationEnabled false ;
# HTTP client configuration
skosmos:sparqlTimeout 20 ;
skosmos:httpTimeout 5 ;
# customize the service name
skosmos:serviceName "Skosmos" ;
# customize the base element. Set this if the automatic base url detection doesn't work. For example setups behind a proxy.
# skosmos:baseHref "http://localhost/Skosmos/" ;
# interface languages available, and the corresponding system locales
skosmos:languages (
[ rdfs:label "ar" ; rdf:value "ar_AE.utf8" ]
[ rdfs:label "da" ; rdf:value "da_DK.utf8" ]
[ rdfs:label "de" ; rdf:value "de_DE.utf8" ]
[ rdfs:label "en" ; rdf:value "en_GB.utf8" ]
[ rdfs:label "en_US" ; rdf:value "en_US.utf8" ]
[ rdfs:label "es" ; rdf:value "es_ES.utf8" ]
[ rdfs:label "fa" ; rdf:value "fa_IR.utf8" ]
[ rdfs:label "fi" ; rdf:value "fi_FI.utf8" ]
[ rdfs:label "fr" ; rdf:value "fr_FR.utf8" ]
[ rdfs:label "it" ; rdf:value "it_IT.utf8" ]
[ rdfs:label "nb" ; rdf:value "nb_NO.utf8" ]
[ rdfs:label "nl" ; rdf:value "nl_NL.utf8" ]
[ rdfs:label "nn" ; rdf:value "nn_NO.utf8" ]
[ rdfs:label "pl" ; rdf:value "pl_PL.utf8" ]
[ rdfs:label "pt" ; rdf:value "pt_PT.utf8" ]
[ rdfs:label "pt_BR" ; rdf:value "pt_BR.utf8" ]
[ rdfs:label "ru" ; rdf:value "ru_RU.utf8" ]
[ rdfs:label "sv" ; rdf:value "sv_SE.utf8" ]
[ rdfs:label "zh" ; rdf:value "zh_CN.utf8" ]
) ;
# how many results (maximum) to load at a time on the search results page
skosmos:searchResultsSize 20 ;
# how many items (maximum) to retrieve in transitive property queries
skosmos:transitiveLimit 1000 ;
# whether or not to log caught exceptions
skosmos:logCaughtExceptions false ;
# set to TRUE to enable logging into browser console
skosmos:logBrowserConsole false ;
# set to a logfile path to enable logging into log file
# skosmos:logFileName "" ;
# a default location for Twig template rendering
skosmos:templateCache "/tmp/skosmos-template-cache" ;
# customize the css by adding your own stylesheet
skosmos:customCss "resource/css/stylesheet.css" ;
# default email address where to send the feedback
skosmos:feedbackAddress "" ;
# email address to set as the sender for feedback messages
skosmos:feedbackSender "" ;
# email address to set as the envelope sender for feedback messages
skosmos:feedbackEnvelopeSender "" ;
# whether to display the ui language selection as a dropdown (useful for cases where there are more than 3 languages)
skosmos:uiLanguageDropdown true ;
# whether to enable the spam honey pot or not, enabled by default
skosmos:uiHoneypotEnabled true ;
# default time a user must wait before submitting a form
skosmos:uiHoneypotTime 5 ;
# plugins to activate for the whole installation (including all vocabularies)
skosmos:globalPlugins () .

# Skosmos vocabularies

:unesco a skosmos:Vocabulary, void:Dataset ;
dc:title "UNESCO Thesaurus"@en ;
skosmos:shortName "UNESCO";
dc:subject :cat_general ;
void:uriSpace "http://skos.um.es/unescothes/";
skosmos:language "en", "es", "fr", "ru";
skosmos:defaultLanguage "en";
skosmos:showTopConcepts true ;
skosmos:fullAlphabeticalIndex true ;
skosmos:groupClass isothes:ConceptGroup ;
void:sparqlEndpoint <http://fuseki-cache:80/skosmos/sparql> ;
skosmos:sparqlGraph <http://skos.um.es/unescothes/> .

:stw a skosmos:Vocabulary, void:Dataset ;
dc:title "STW Thesaurus for Economics"@en ;
skosmos:shortName "STW";
dc:subject :cat_general ;
void:uriSpace "http://zbw.eu/stw/";
skosmos:language "en", "de";
skosmos:defaultLanguage "de";
void:sparqlEndpoint <http://fuseki-cache:80/skosmos/sparql> ;
skosmos:sparqlGraph <http://zbw.eu/stw/> .
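
Each entry in `skosmos:languages` pairs an interface language with a system locale, and the same locale names appear in the `locale-gen` call in `Dockerfile.ubuntu`. A minimal sketch for checking that the two lists stay in sync follows; the function names are hypothetical, and the comparison assumes the available locales are listed one per line in the same `xx_YY.utf8` form that the configuration uses.

```shell
# Hypothetical helpers, not part of the Skosmos repository.

# Prints the locale names found in rdf:value "..." entries of a config file.
config_locales() {
    sed -n 's/.*rdf:value "\([^"]*\)".*/\1/p' "$1"
}

# Prints the locales required by the config file (argument 1) that are
# missing from the newline-separated list of available locales on stdin.
missing_locales() {
    available=$(cat)
    config_locales "$1" | while read -r loc; do
        printf '%s\n' "$available" | grep -qxF "$loc" || printf '%s\n' "$loc"
    done
}
```

For example, `docker exec skosmos-web locale -a | missing_locales dockerfiles/config/config-docker-compose.ttl` should print nothing if every configured locale was generated in the image.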