Add Docker files to run self-contained image for trying out Skosmos #962

Merged
merged 30 commits into from
Mar 15, 2021

Conversation

kinow
Collaborator

@kinow kinow commented Mar 25, 2020

Closes #910

Initial Docker files for trying out Skosmos 2.4. The image uses the Ubuntu base image and installs Apache2, PHP, Composer, and Git. It is similar to the InstallTutorial, but some steps are still missing. Some of them require customizing or parameterizing the Jena image.

I am creating this PR now so that I can check what else needs to be done.

The Dockerfile.ubuntu is for the Skosmos 2.4 image, which at the moment uses the YSO and YSA vocabularies from the Finto dev server. I haven't done the extra step of the tutorial, using different vocabularies, as that would require Fuseki. I thought some users would prefer to be able to run just one container to try Skosmos first?

There is also a docker-compose.yml that uses a different config.ttl, which points to the Jena Fuseki container. It does not load any data at the moment; I had started working on a setup with only one vocabulary, leaving the commands to load the data to the user. I'm not sure what the best approach is here. Maybe the docker-compose.yml should do more and also load the data (not sure if I can do that), and perhaps complete the rest of the tutorial?

Cheers
Bruno

@codecov

codecov bot commented Mar 25, 2020

Codecov Report

Merging #962 (10323ad) into master (c1a339b) will not change coverage.
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master     #962   +/-   ##
=========================================
  Coverage     67.91%   67.91%           
  Complexity     1583     1583           
=========================================
  Files            32       32           
  Lines          3890     3890           
=========================================
  Hits           2642     2642           
  Misses         1248     1248           

@osma osma added this to the 2.5 milestone Apr 7, 2020
@osma osma self-assigned this Apr 7, 2020
@osma
Member

osma commented Apr 7, 2020

Thanks @kinow, testing this now!

@kinow
Collaborator Author

kinow commented Apr 7, 2020

Thanks @osma! There's further work that can be done on this. Just let me know which direction you think we should go, and anything that is missing or that needs fixing in the image :-)

@osma
Member

osma commented Apr 7, 2020

Thanks for the quick fix @kinow !

@osma
Member

osma commented Apr 7, 2020

Got both the single image and the docker-compose variants running. I will do more detailed testing tomorrow and try to come up with answers to your questions!

Member

@osma osma left a comment

Looks very good in general and already works quite well. Some remarks:

I'm a bit concerned about the proliferation of Docker images. We already have the top-level Dockerfile that sets up a "shell" where the Skosmos code can be mounted. This PR adds the single-image and Compose variants. I wonder if it would be possible to simplify the situation and perhaps replace the current "shell" Dockerfile with this new variant - maybe with an option for mounting the Skosmos code within the container instead of copying.

Regarding the Compose variant, I think this is a good start but I'd like to see more:

  • Setting up a jena-text index
  • Putting the example vocabularies (STW and UNESCO) in place

Also I'm wondering whether it would make sense to add another container to the Compose file with a reverse HTTP cache - Varnish or nginx, probably - to speed up the Fuseki queries, as explained in the tutorial. But maybe we'll postpone that discussion.

Keep up the good work!

@kinow
Collaborator Author

kinow commented Apr 8, 2020

Hi @osma ! Thanks for the great feedback. And lots of good ideas, so +1 from me. Next updates here should have:

  • Skosmos
    • Try to combine the existing Dockerfile and docker-compose with the files in this PR, to reduce/simplify the Docker files in this repository
      • perhaps we could have a shell entrypoint that by default uses the current commit of this repository (not sure if doable, but worth investigating), or the user can provide a tag/commit. maybe users should also be able to ask it to enable xdebug or extra modules... 🤔
    • Set timezone to UTC
    • Add the APC module to PHP (php-apcu)
    • Instead of using a fixed version like 2.4, use the current code to build the project (related to first item ☝️ )
    • Putting the example vocabularies (STW and UNESCO) in place (i.e. in config.ttl)
  • Jena/Fuseki
    • Use a port different than 3030 for the Fuseki image
    • Set up jena-text
    • Putting the example vocabularies (STW and UNESCO) in place (i.e. load data into fuseki/jena-text) (EDIT: this is not easy with the current jena-fuseki image, and is easier to do manually, as in the tutorial)
  • Bonus
    • reverse HTTP cache - Varnish or nginx, probably - to speed up the Fuseki queries, as explained in the tutorial

@osma these are the requirements I could think of from the reviews so far. What do you think?

@osma osma marked this pull request as draft April 9, 2020 05:47
@osma
Member

osma commented Apr 9, 2020

Sounds like a great plan @kinow!

Try to combine the existing Dockerfile and docker-compose with the files in this PR, to reduce/simplify the Docker files in this repository

Just to clarify: I think it would be great to have:

  1. A Dockerfile, which by default creates a stand-alone image that relies on the dev.finto.fi SPARQL endpoint. The Skosmos code is copied from the surrounding repo.
  2. A docker-compose.yml file which uses the image created by the above Dockerfile, but also includes a Fuseki container (with jena-text index and the example vocabularies STW & UNESCO) and changes the configuration of the Skosmos container to use this local Fuseki instead of dev.finto.fi.
  3. A variant of 1. (perhaps just explained in the documentation) where the Skosmos code and config.ttl is mounted from the surrounding repo, so that it remains editable. This would be appropriate for development purposes, so that one can quickly test code changes without having to rebuild the whole container. It would replace the existing top-level Dockerfile.
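
The development variant in item 3 could be approached with a bind mount over the code that the image otherwise copies in. The following is only a sketch: the image tag `skosmos:dev`, the document root `/var/www/html` and the host port `9090` are illustrative assumptions, not values confirmed in this PR. The function only prints the `docker run` command so it can be inspected before use.

```shell
#!/bin/sh
# Sketch of variant 3: run the image with the working tree bind-mounted,
# so code and config.ttl edits are visible without rebuilding the image.
# All three values below are assumptions made for illustration.
IMAGE="${IMAGE:-skosmos:dev}"          # hypothetical image tag
DOCROOT="${DOCROOT:-/var/www/html}"    # assumed location of Skosmos inside the image
HOST_PORT="${HOST_PORT:-9090}"         # assumed host port mapped to Apache's port 80

# The image itself would be built once from the repository root, e.g.:
#   docker build -t "$IMAGE" -f dockerfiles/Dockerfile.ubuntu .

# dev_run_command REPO_DIR
# Prints (does not execute) a docker run command that mounts REPO_DIR
# over the document root of the container.
dev_run_command() {
    repo_dir=$1
    printf 'docker run --rm -p %s:80 -v %s:%s %s\n' \
        "$HOST_PORT" "$repo_dir" "$DOCROOT" "$IMAGE"
}
```

Calling `dev_run_command "$PWD"` from the repository root prints the command to run. Printing rather than executing keeps the sketch free of side effects and makes the assumed paths easy to adjust.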

PS. I just had to test the new "convert to draft PR" button on this PR. Draft PRs are a great feature, but originally you had to mark PRs as drafts when you created them, which I often forgot, and it wasn't possible to change that later.

@kinow
Collaborator Author

kinow commented Apr 9, 2020

Agreed on the Docker files. Should we move everything under dockerfiles?

PS. I just had to test the new "convert to draft PR" button on this PR. Draft PRs are a great feature, but originally you had to mark PRs as drafts when you created them, which I often forgot, and it wasn't possible to change that later.

Today I learned. Didn't see any announcement of that feature. Thanks!!

@osma
Member

osma commented Apr 9, 2020

Should we move everything under dockerfiles?

Yes, I think it's a good idea to reduce the clutter at the root level.

Today I learned. Didn't see any announcement of that feature. Thanks!!

Me neither, I just noticed the link in the right hand column.

@miguelvaara miguelvaara modified the milestones: 2.5, 2.6 Apr 21, 2020
@osma
Member

osma commented Apr 30, 2020

Any news @kinow? We're just starting a new sprint next week (though focused on API issues).

@kinow
Collaborator Author

kinow commented May 1, 2020

Not much progress. I have some local changes in the other repository I used for docker+skosmos, from when I was testing building an image with parameters. I'm refreshing my memory and will update this soon. I just finished working on two other images (SLURM, and another interesting one for running multiple versions of bash). I also need to review the current image to see if there are any improvements I could make :-) Great to hear about the new sprint!

@kinow kinow force-pushed the fix-910 branch 4 times, most recently from ae56788 to 8033ce7 Compare May 2, 2020 03:08
@kinow
Collaborator Author

kinow commented May 2, 2020

@osma

Putting the example vocabularies (STW and UNESCO) in place (i.e. load data into fuseki/jena-text)

I made a bit more progress today, but ran into issues with this item. I first tried using load.sh or tdbloader, following the instructions for the jena-fuseki container. However, these need exclusive access to the Fuseki database (only one process can use it at a time), which makes them harder to automate with docker-compose.

I then tried adding an extra container responsible only for loading the data. The problem there was that I had to include either the two .ttl files (~7MB total) or gzipped copies (<1MB), and then automate decompressing the files, loading them into the volume, and stopping the container to release the lock so that Fuseki could run in another container.

This setup started to look a bit too convoluted.

In the end I downloaded both .ttl files from their vocabulary pages and loaded them with:

curl -I -X POST -H Content-Type:text/turtle -T unescothes.ttl -G http://localhost:9030/skosmos/data --data-urlencode graph=http://skos.um.es/unescothes/
curl -I -X POST -H Content-Type:text/turtle -T stw.ttl -G http://localhost:9030/skosmos/data --data-urlencode graph=http://zbw.eu/stw/

Would it be all right if users had to start docker-compose up followed by some manual steps to load the data into Fuseki/Jena? This way we wouldn't need to include the .ttl source data, and the setup would be a bit simpler.
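
If the manual route is taken, the two curl calls above could be wrapped in a small helper so users only supply a file and a graph URI. This is a sketch rather than part of the PR: the function name `load_graph` and the `DRY_RUN` switch are made up for illustration, and the data URL is simply the one used in the commands above.

```shell
#!/bin/sh
# Sketch: load a Turtle file into a named graph of a running Fuseki dataset.
# The default URL is copied from the curl commands above; adjust if the port differs.
FUSEKI_DATA_URL="${FUSEKI_DATA_URL:-http://localhost:9030/skosmos/data}"

# load_graph FILE GRAPH_URI   (hypothetical helper)
# Uploads FILE into GRAPH_URI with the same curl options as the commands above.
# With DRY_RUN=1 the command is printed instead of executed.
load_graph() {
    file=$1
    graph=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "curl -I -X POST -H Content-Type:text/turtle -T $file -G $FUSEKI_DATA_URL --data-urlencode graph=$graph"
        return 0
    fi
    # Fail early with a clear message instead of letting curl report it.
    if [ ! -f "$file" ]; then
        echo "missing file: $file" >&2
        return 1
    fi
    curl -I -X POST -H Content-Type:text/turtle -T "$file" \
        -G "$FUSEKI_DATA_URL" --data-urlencode "graph=$graph"
}

# Example calls once "docker-compose up" has Fuseki running:
#   load_graph unescothes.ttl http://skos.um.es/unescothes/
#   load_graph stw.ttl http://zbw.eu/stw/
```

Because the upload goes over HTTP to the running server, it avoids the database lock problem described for load.sh and tdbloader.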

@osma
Member

osma commented May 4, 2020

Would it be all right if users had to start docker-compose up followed by some manual steps to load the data into Fuseki/Jena? This way we wouldn't need to include the .ttl source data, and the setup would be a bit simpler.

I think that makes sense, if the alternative is so complicated.
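
With a manual load step, a quick check that the data actually arrived may be useful. The sketch below counts the triples in one named graph through the SPARQL protocol. The query endpoint `http://localhost:9030/skosmos/sparql` is an assumption derived from the data URL used earlier in this thread, and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: count the triples in a named graph to confirm a manual load worked.
# The endpoint is an assumption (same host, port and dataset as the upload URL).
FUSEKI_QUERY_URL="${FUSEKI_QUERY_URL:-http://localhost:9030/skosmos/sparql}"

# count_query GRAPH_URI   (hypothetical helper)
# Prints a SPARQL query that counts all triples in GRAPH_URI.
count_query() {
    printf 'SELECT (COUNT(*) AS ?triples) WHERE { GRAPH <%s> { ?s ?p ?o } }\n' "$1"
}

# count_triples GRAPH_URI   (hypothetical helper)
# Sends the query with HTTP GET and asks for CSV results.
count_triples() {
    curl -s -G "$FUSEKI_QUERY_URL" \
        -H 'Accept: text/csv' \
        --data-urlencode "query=$(count_query "$1")"
}

# Example, after loading the STW vocabulary:
#   count_triples http://zbw.eu/stw/
```

A count of zero would suggest that the graph URI in the upload and in the Skosmos configuration do not match.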

@kinow
Collaborator Author

kinow commented May 8, 2020

Got it working with most of the new requirements. Will look at varnish this weekend, give it some more testing, check what docs need to be updated, then it should be ready for review again 👍

@osma
Member

osma commented May 8, 2020

Great! Thanks for the update @kinow, I was just wondering about this PR

@kinow
Collaborator Author

kinow commented Mar 11, 2021

(rebased, and included a new commit to document the --net=host and port 80 use)

@sonarqubecloud

SonarCloud Quality Gate failed.

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: no information
Duplication: 3.9%

@osma osma merged commit f7987d3 into NatLibFi:master Mar 15, 2021
@osma
Member

osma commented Mar 15, 2021

Tested this once more, removed lines referring to Finto SPARQL endpoint from the config files (they were already commented out), and merged. 🎉 Great work @kinow!

Now the wiki page for Docker use still needs to be updated. The README.md file included in this PR is already pretty extensive, so maybe it can mostly just be replaced with a link to that? @kinow do you want to do this or should I?

@kinow kinow deleted the fix-910 branch March 15, 2021 18:26
@kinow
Collaborator Author

kinow commented Mar 15, 2021

Tested this once more, removed lines referring to Finto SPARQL endpoint from the config files (they were already commented out), and merged. 🎉 Great work @kinow!

🎉 thanks for the patience reviewing and testing it, and for the help along the way @osma, it was fun :-)

Now the wiki page for Docker use still needs to be updated. The README.md file included in this PR is already pretty extensive, so maybe it can mostly just be replaced with a link to that? @kinow do you want to do this or should I?

I did a quick change, but feel free to amend/edit it as necessary @osma : https://github.com/NatLibFi/Skosmos/wiki/Install-Skosmos-with-Fuseki-in-Docker

@osma
Member

osma commented Mar 16, 2021

I did a quick change, but feel free to amend/edit it as necessary @osma : https://github.com/NatLibFi/Skosmos/wiki/Install-Skosmos-with-Fuseki-in-Docker

Thanks! I edited it slightly. All good now!

@osma osma modified the milestones: Next Tasks, 2.10 Mar 16, 2021
Successfully merging this pull request may close these issues.

Self-contained Docker image for trying out Skosmos
4 participants