This repo contains Blacklight 7.31.0 customized for Digital Scriptorium
- /images contains custom images for the Digital Scriptorium UI
- /stylesheets contains custom CSS for the Digital Scriptorium UI
- ds_document_metadata_component (.rb + .html.erb)
- ds_metadata_field_component (.rb + .html.erb)
- ds_search_bar_component (.rb + .html.erb)
- catalog_controller.rb contains configuration for
  - advanced_search
  - document title (index and show)
  - show_tools
  - nav_action
  - facet_field (all metadata displayed in the facet sidebar)
  - index_field (title, author, place, date)
  - show_field (all metadata displayed in the full record view)
  - search_field (all fields displayed in the simple search drop-down + advanced search page)
- application_helper.rb contains custom UI functions
- /components contains custom JS functions for
  - the advanced search form
  - the pop-up alert bar
  - copy to clipboard in the full record view
- application.js contains Mirador integration
- /concerns/solr_document.rb contains custom accessor methods used in full record view
- /presenters contains customized Ruby functions to modify default document title display
- /advanced contains customized/overridden views for advanced search
- /blacklight/nav contains customized/overridden views for navigation
- /catalog contains customized/overridden views for
  - citation
  - homepage
  - search form
  - "show" page (full record view)
  - Links sidebar widget (full record view)
  - Contact Institution sidebar widget
  - sidebar widget config
- /shared contains customized/overridden views for
  - pop-up alert (beta notice)
  - header navbar
  - footer
  - copy to clipboard icon
- solr-schema.yml contains custom dynamic fields (below)
- solr-seed.json contains Wikibase data from 2023-03-17
<dynamicField name="*_display" type="string" multiValued="true" indexed="true" stored="true"/>
<dynamicField name="*_search" type="text" multiValued="true" indexed="true" stored="true"/>
<dynamicField name="*_facet" type="string" docValues="true" multiValued="true" indexed="true" stored="true"/>
<dynamicField name="*_meta" type="string" multiValued="true" indexed="true" stored="true"/>
<dynamicField name="*_link" type="string" multiValued="true" indexed="true" stored="true"/>
<dynamicField name="*_int" type="int" multiValued="true" indexed="true" stored="true"/>
- name: "*_display"
type: "string"
multiValued: true
indexed: true
stored: true
- name: "*_search"
type: "text"
multiValued: true
indexed: true
stored: true
- name: "*_facet"
type: "string"
docValues: true
multiValued: true
indexed: true
stored: true
- name: "*_meta"
type: "string"
multiValued: true
indexed: true
stored: true
- name: "*_link"
type: "string"
multiValued: true
indexed: true
stored: true
- name: "*_int"
type: "int"
multiValued: true
indexed: true
stored: true
wikibase-to-solr.rb
The field names are used as the root of the Solr field name, combined with the appropriate dynamic-field suffix as outlined above.
In Wikibase:
- P1 = "DS ID"
In property-names.csv:
- P1 = "id"
After running wikibase-to-solr.rb, in import.json:
- id, id_display, id_search
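
This root-plus-suffix expansion can be sketched in a few lines of Ruby. The helper below and the value shapes it uses are illustrative assumptions, not the actual script internals:

```ruby
# Purely illustrative: expand a property-names.csv root ("id") into
# suffixed Solr dynamic fields. Method name and value shapes are assumptions.
def solr_fields_for(root, raw_value, display_value)
  {
    "#{root}_display" => [display_value], # structured value parsed by Blacklight
    "#{root}_search"  => [raw_value]      # tokenized text for keyword search
  }
end

p solr_fields_for('id', 'DS1', '{"PV":"DS1"}')
# => {"id_display"=>["{\"PV\":\"DS1\"}"], "id_search"=>["DS1"]}
```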
The full property-names.csv mapping:

```csv
P1,id
P2,manuscript_holding
P3,described_manuscript
P4,institution_authority
P5,institution
P6,holding_status
P7,institutional_id
P8,shelfmark
P9,institutional_record
P10,title
P11,standard title
P12,uniform_title
P13,original_script
P14,associated_name
P15,role_authority
P16,instance_of
P17,name_authority
P18,term
P19,subject
P20,term_authority
P21,language
P22,language_authority
P23,date
P24,century_authority
P25,century
P26,dated
P27,place
P28,place_authority
P29,physical_description
P30,material
P31,material
P32,note
P33,acknowledgements
P34,date_added
P35,date_updated
P36,latest
P37,earliest
P38,start_time
P39,end_time
P40,external_identifier
P41,iiif_manifest
P42,wikidata_qid
P43,viaf_id
P44,external_uri
P45,equivalent_property
P46,formatter_url
P47,subclass_of
```
- _display (has a linked-data (LD) syntax structure; needs to be parsed by Blacklight)
- _search (for text search, tokenized)
- _facet (for displaying in sidebar facets, not tokenized)
- _link (for displaying as a hyperlink)
- _int (for dates)
- _meta (for plain text data)
- require Ruby libraries
- configure field output arrays (by P-id)
- configure general settings
- define custom functions
- load JSON
- load property-names.csv into a lookup array
- first each/do loop: populate lookup arrays (labels, uris, p2records, p3records)
- second each/do loop: the main loop
  - fetch the Wikibase item id
  - merge ids (3 Wikibase records become 1 merged Solr record)
  - load the claims
  - when the item matches instance_of = 1, 2, or 3, parse it
  - evaluate all properties in the claims array
  - if a property contains qualifiers, evaluate all qualifiers inside that property
  - apply data transformation rules and logic for special cases (P14, P23, P25, P36, P37, P30, P31)
- output the $solrObjects array as JSON to a file
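
A condensed, hypothetical sketch of that flow. Paths, variable names, and the merge logic here are assumptions based on the outline above, not the real script:

```ruby
require 'json'
require 'csv'

input_path  = ARGV[0] || 'ds-latest.json'  # assumed defaults
output_path = ARGV[1] || 'import.json'

property_names = CSV.read('property-names.csv').to_h # "P1" => "id", ...
items = JSON.parse(File.read(input_path))

solr_objects = {}

items.each do |item|
  claims = item.fetch('claims', {})
  # Merge step (simplified): linked Wikibase records share one Solr document.
  record = solr_objects[item['id']] ||= { 'id' => item['id'] }

  claims.each do |property, statements|
    root = property_names[property] or next
    statements.each do |statement|
      value = statement.dig('mainsnak', 'datavalue', 'value')
      (record["#{root}_search"] ||= []) << value.to_s
      # Qualifiers, when present, refine the value (dates, roles, etc.);
      # the special cases (P14, P23, P25, ...) would hook in here.
      statement.fetch('qualifiers', {}).each_value do |_qualifier_snaks|
      end
    end
  end
end

File.write(output_path, JSON.pretty_generate(solr_objects.values))
```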
`rake data:ingest`: Pulls the latest changes from the Wikibase export Git repository.
Parameters:

- `force` [Boolean] (optional) - Continue even if there are no changes to the Wikibase JSON export file.
Environment variables:

- `WIKIBASE_REPOSITORY_PATH` (default: '../ds_exports') - The path, relative to the `Rails.root` directory, to the local Wikibase Git clone.
- `WIKIBASE_REPOSITORY_URL` (default: 'https://github.com/DigitalScriptorium/ds-exports') - The remote location of the Wikibase export Git repository.
- `WIKIBASE_EXPORT_JSON_FILE` (default: 'json/ds-latest.json') - The location of the JSON export file, relative to the Git repository.
Exit codes:

- 0 - Success
- 1 - No changes
Steps:

- Check whether the local Git directory exists; clone it if it does not.
- Execute a `git pull --ff-only`.
- Get the export JSON file's SHA1 hash from Git using `git hash-object [file]`.
- Check whether the hash exists in the `wikibase_export_versions` table:
  - Hash exists: exit with code 1 if the `force` parameter is false; exit 0 if it is true.
  - Hash does not exist: add a new record to the `wikibase_export_versions` table and exit with code 0.
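
Put together, the task might look roughly like the following. The `WikibaseExportVersion` model name and the helper details are assumptions based on the description above, not the actual task body:

```ruby
# lib/tasks/data.rake (hypothetical sketch)
namespace :data do
  desc 'Pull the latest changes from the Wikibase export Git repository'
  task :ingest, [:force] => :environment do |_t, args|
    repo_path = Rails.root.join(ENV.fetch('WIKIBASE_REPOSITORY_PATH', '../ds_exports'))
    repo_url  = ENV.fetch('WIKIBASE_REPOSITORY_URL', 'https://github.com/DigitalScriptorium/ds-exports')
    json_file = ENV.fetch('WIKIBASE_EXPORT_JSON_FILE', 'json/ds-latest.json')

    system("git clone #{repo_url} #{repo_path}") unless Dir.exist?(repo_path)
    system('git pull --ff-only', chdir: repo_path.to_s) or abort 'git pull failed'

    sha1 = `git -C #{repo_path} hash-object #{json_file}`.strip

    if WikibaseExportVersion.exists?(sha1: sha1) # hypothetical ActiveRecord model
      exit 1 unless args[:force] == 'true'       # no changes
    else
      WikibaseExportVersion.create!(sha1: sha1)
    end
  end
end
```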
`rake data:convert`: Converts the Wikibase JSON export file using the Wikibase-to-Solr script.
Parameters:

- `output` [String] (optional, default: 'tmp/solr_data.json') - The location, relative to `Rails.root`, to write the Solr document JSON file.
- `input` [String] (optional, default: '../ds_exports/json/ds-latest.json') - The location, relative to `Rails.root`, of the Wikibase export JSON file.
- `verbose` [Boolean] (optional) - Write debug output to STDOUT.
Environment variables:

- `WIKIBASE_REPOSITORY_PATH` (default: '../ds_exports') - The path, relative to the `Rails.root` directory, to the local Wikibase Git clone.
- `WIKIBASE_EXPORT_JSON_FILE` (default: 'json/ds-latest.json') - The location of the JSON export file, relative to the Git repository.
Exit codes:

- 0 - Success
- 1 - Output file does not exist
Steps:

- Delete the `output` file if it exists.
- Execute `ruby wikibase-to-solr.rb -o [output] -i [input]`.
- Exit 1 if the `output` file does not exist; exit 0 if it does.
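
In sketch form (an assumed task body, not the actual code):

```ruby
# Hypothetical sketch of data:convert
namespace :data do
  task :convert, [:output, :input] => :environment do |_t, args|
    output = Rails.root.join(args[:output] || 'tmp/solr_data.json')
    input  = Rails.root.join(args[:input]  || '../ds_exports/json/ds-latest.json')

    File.delete(output) if File.exist?(output)
    system("ruby wikibase-to-solr.rb -o #{output} -i #{input}")
    exit 1 unless File.exist?(output) # exit 0 otherwise
  end
end
```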
`rake data:seed`: Safely seeds the Solr collection using the `file` JSON documents, making a backup of the collection before deleting all the records and uploading the new documents.

Parameters:

- `file` [String] (optional, default: 'tmp/solr_data.json') - The location, relative to `Rails.root`, of the Solr document JSON file.
Environment variables:

- `SOLR_URL` (required) - The Solr URI of the application's collection.
- `SOLR_BACKUP_TIMEOUT` (default: 5 minutes) - The number of seconds to wait for the Solr collection backup & restore commands to finish.
- `SOLR_BACKUP_WAIT_INTERVAL` (default: 1 minute) - The number of seconds to wait between checks on the backup & restore command status.
- `SOLR_BACKUP_LOCATION` (required) - The shared drive location to store the Solr backups. (Note: all instances of Solr & ZooKeeper need read & write access to the shared drive; see https://solr.apache.org/guide/6_6/collections-api.html#CollectionsAPI-backup)
Exit codes:

- 0 - Success
Steps:

- Parse the collection name from `ENV['SOLR_URL']`.
- Validate and load the Solr documents in `file` into memory.
- Create the Solr collection backup and wait for the command to finish.
  - Raise an error if the command does not finish in the allotted time.
- In a block:
  - Delete all the Solr documents in the collection.
  - Upload the new Solr documents.
- Rescue from an exception:
  - Restore the Solr collection and wait for the command to finish.
    - Raise an error if the command does not finish in the allotted time.
  - Re-raise the exception.
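
The core of that flow might look like the following. The RSolr calls (`delete_by_query`, `add`, `commit`) are the real client API; the Collections API helper, its exact parameters, and the omission of timeout/status polling are simplifying assumptions:

```ruby
require 'json'
require 'net/http'
require 'rsolr'

# Hypothetical helper: SOLR_URL points at a collection
# (e.g. http://host:8983/solr/blacklight-core); the Collections API
# lives at /solr/admin/collections on the same host.
def collections_api(solr_url, params)
  base = URI(solr_url)
  uri  = URI("#{base.scheme}://#{base.host}:#{base.port}/solr/admin/collections")
  uri.query = URI.encode_www_form(params)
  JSON.parse(Net::HTTP.get(uri))
end

solr_url   = ENV.fetch('SOLR_URL')
collection = URI(solr_url).path.split('/').last
location   = ENV.fetch('SOLR_BACKUP_LOCATION')
docs       = JSON.parse(File.read('tmp/solr_data.json'))

# Back up the collection first (async status polling omitted here).
collections_api(solr_url, action: 'BACKUP', name: "#{collection}-backup",
                collection: collection, location: location)

solr = RSolr.connect(url: solr_url)
begin
  solr.delete_by_query '*:*' # wipe the collection
  solr.add docs              # upload the new documents
  solr.commit
rescue StandardError
  collections_api(solr_url, action: 'RESTORE', name: "#{collection}-backup",
                  collection: collection, location: location)
  raise
end
```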
`rake data:migrate`: Executes the data pipeline from start to finish. See the individual commands above for more details.
Parameters:

- `force` [Boolean] (optional) - Execute even if there are no changes to the Wikibase export JSON file.
Steps:

- Execute `rake data:ingest[force]`
- Execute `rake data:convert`
- Execute `rake data:seed`

Note: If any command fails (non-zero exit code), the process stops and exits with the same code.
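
As a sketch, the composite task simply chains the three tasks above (the body is assumed, not the actual code):

```ruby
# Hypothetical sketch of data:migrate
namespace :data do
  task :migrate, [:force] => :environment do |_t, args|
    # Each step calls `exit` on failure, which stops the whole
    # pipeline with the same exit code.
    Rake::Task['data:ingest'].invoke(args[:force])
    Rake::Task['data:convert'].invoke
    Rake::Task['data:seed'].invoke
  end
end
```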
The `whenever` gem is used to schedule the cron job on the worker container in Docker. The cron job executes the `rake data:migrate` task and is configured in `config/schedule.rb`:

```ruby
every '0 0 * * *', mailto: '[email protected]' do
  rake "data:migrate"
end
```
The `whenever` gem has built-in functionality to email the cron job output. Change the `mailto` parameter in the cron job configuration to the desired email recipient:

```ruby
every '0 0 * * *', mailto: '[email protected]' do
  rake "data:migrate"
end
```
See https://github.com/javan/whenever#customize-email-recipient-with-the-mailto-environment-variable for more details.
- Clone this repo: `gh repo clone jefawks3/hxs-blacklight`
- Open the terminal and cd into the repo: `cd hxs-blacklight`
- Run `docker-compose build`
- Run `docker-compose run --rm app bundle install -j8`
- Run `docker-compose run --rm app bundle exec rake db:migrate`
- Run `docker-compose up`
- Log into Solr at http://localhost:8983/solr
- Create the `blacklight-core` collection (not core)
- Update the schema of the `blacklight-core` collection if needed
- Run `docker-compose run --rm app bundle exec rake data:migrate[true]`
- Open http://localhost:3000 in the browser
- Code - No action needed; code changes should be detected by Rails
- Migrations - Run `docker-compose run --rm app bundle exec rake db:migrate`
- Solr Schema Changes - Run `docker-compose run --rm app bundle exec rake solr1:schema:update`
- Gemfile - Rerun `docker-compose run --rm app bundle install` and restart the `app` container
- Web - http://localhost:3000
- Solr - http://localhost:8983/solr
- Postgres - localhost:5432
Notes:
- You only need to rebuild any images that have changes.
- ECS is configured to run on Linux x86_64 architecture; make sure to specify the platform when building.
- Once you have pushed your images to ECR, you will need to deploy the images via ECS.
- Log in to AWS ECR: `aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 214159447841.dkr.ecr.us-east-2.amazonaws.com`
- Build the image: `RAILS_MASTER_KEY=[MASTER_KEY] docker buildx build --platform linux/x86_64 -t hxs-blacklight-app --file .docker/rails.prod.Dockerfile --secret id=master_key,env=RAILS_MASTER_KEY --build-arg RAILS_PORT=80 --no-cache .`
- Tag the image: `docker tag hxs-blacklight-app:latest 214159447841.dkr.ecr.us-east-2.amazonaws.com/hxs-blacklight-app:latest`
- Push the image: `docker push 214159447841.dkr.ecr.us-east-2.amazonaws.com/hxs-blacklight-app:latest`
- Log in to AWS ECR: `aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 214159447841.dkr.ecr.us-east-2.amazonaws.com`
- Build the image: `docker buildx build --platform linux/x86_64 -t hxs-blacklight-solr --file .docker/solr.prod.Dockerfile --no-cache .`
- Tag the image: `docker tag hxs-blacklight-solr:latest 214159447841.dkr.ecr.us-east-2.amazonaws.com/hxs-blacklight-solr:latest`
- Push the image: `docker push 214159447841.dkr.ecr.us-east-2.amazonaws.com/hxs-blacklight-solr:latest`
NOTE: Due to the way ECS handles deployments, unless you increment the task definition version, you will need to use the Force Deployment command in Tips & Tricks.
- Log in to ECS
- Select Clusters
- Select hxs-blacklight-[environment]
- Select hxs-blacklight under Services
- Click on Update Service
- Make the necessary changes (you may need to select Force new deployment under Deployment Options)
- Click on Update
- Migrate Database: `aws ecs execute-command --region us-east-2 --cluster hxs-blacklight-staging --task [TASK_ID] --container app --command "bundle exec rake db:migrate" --interactive`
- Sync Solr Schema: `aws ecs execute-command --region us-east-2 --cluster hxs-blacklight-staging --task [TASK_ID] --container app --command "bundle exec rake solr:schema:update" --interactive`
- Force a deployment: `aws ecs update-service --force-new-deployment --region us-east-2 --cluster hxs-blacklight-staging --service hxs-blacklight`
If you see:

```
Error saving credentials: error storing credentials - err: exit status 1, out: 'Post "http://ipc/registry/credstore-updated": dial unix backend.sock: connect: no such file or directory'
```

make sure that the Docker daemon is running.
- Log into AWS
- Navigate to the ECS Console
- Select Clusters
- Select hxs-blacklight-staging
- Under Services, select hxs-blacklight
- Click on Configuration and Tasks
- Under the Tasks panel, select the topmost task
- Under Configuration, look for the public IP
- Copy the public IP and paste it into the browser
- Change the port to 8983