Valkyrie Indexer Setup and Specs #4221

Closed
wants to merge 59 commits into from
59 commits
6868bb4
Add indexer to work_resource generator
Jan 25, 2020
69aba72
rename index_hash to index_document for consistency
Jan 25, 2020
961aeb8
Use the configured storage adapter
Jan 25, 2020
0b21d22
indexer setup, including fixes / tests for activefedoraconverter and …
Jan 25, 2020
e4acc51
active_fedora_converter etc. changes
Jan 25, 2020
b9d3689
add registration of indexers into an initializer
Jan 26, 2020
fc31e18
remove additional test from valkyrie_indexer_spec.rb - this has moved…
Jan 26, 2020
ed0ab01
subscribe the MetadataIndexListener in config/initializers/listeners.…
Jan 24, 2020
48e7525
Refactor `CollectionMemberSearchBuilder` to better use `FilterByType`
Jan 27, 2020
f11054b
Test `CollectionMemberSearchBuilder` with a Valkyrie collection
Jan 27, 2020
bccd86e
Merge pull request #4217 from samvera/storage-adapter-conf
jeremyf Jan 27, 2020
fc1f7dc
handle custom use types for files
elrayle Jan 24, 2020
6c51756
add tests for custom query methods; refactor file use
elrayle Jan 24, 2020
fd2d89d
refactor FileMetadata#use to #type
elrayle Jan 24, 2020
ba4d8f7
Still spicy?
Jan 25, 2020
6e3d83a
Add a vegan option
cjcolvar Jan 27, 2020
930b5fe
use guard clause; bump regen
Jan 27, 2020
3cbcc99
mark versioning pending; return first for original_file, extracted_te…
elrayle Jan 27, 2020
c7e8353
Merge pull request #4219 from samvera/generate-indexer
jeremyf Jan 27, 2020
4252be9
Merge pull request #4224 from samvera/cjcolvar-patch-1
straleyb Jan 27, 2020
bdd75c1
bump regen
Jan 27, 2020
408ae2e
prefer use of each_with_object
elrayle Jan 27, 2020
c8e91a5
use constants for primary file use/type
elrayle Jan 28, 2020
fc402da
prefer .map over .each_with_object
elrayle Jan 28, 2020
093ac4a
Merge pull request #4225 from samvera/wings/custom_file_assoc
straleyb Jan 28, 2020
eb5004b
Setup RSpec metadata for selecting a Valkyrie adapter
Jan 25, 2020
f93029a
Create `FileSetDescription` to find characterization for FileSets
Jan 25, 2020
6edec00
Move PCDM Use URI constants to `FileMetadata::Use`
Jan 28, 2020
9989e38
Set a default `FileMetadata#type`
Jan 28, 2020
a65e59a
Register the new `:find_many_file_metadata_by_use` query
Jan 28, 2020
bb79e44
Drop `Hyrax::FileMetadata#used_for?`
Jan 28, 2020
52d8338
Remove `FileSet` type convienence methods
Jan 28, 2020
e8e1069
Implement Valkyrie indexing for Collections
Jan 27, 2020
e50f1de
Implement indexing for collection thumbnails
Jan 27, 2020
e588855
Update app/services/hyrax/thumbnail_path_service.rb
Jan 28, 2020
2a9270a
Refactor FileActor handling of PCDM Use vocabulary
Jan 28, 2020
98298fe
Correct the ThumbnailImage URI in the PCDM Use vocab
Jan 28, 2020
a749f23
Introduce `Hyrax.custom_queries`
Jan 28, 2020
23e3a25
Merge pull request #4223 from samvera/valkyrie_collections_controller
straleyb Jan 29, 2020
8c7cb5d
Removes configurability for resource indexer in solr indexing adapter
straleyb Jan 27, 2020
2d97b68
updated lint regarding no params on method call
straleyb Jan 27, 2020
2214537
update solr indexing adapter to use valkyrie indexer
straleyb Jan 27, 2020
74428d6
Remove attribute reader for resource_indexer
straleyb Jan 28, 2020
ab432d4
Use resource_indexer instance variable
straleyb Jan 28, 2020
fe7bd63
remove unnecessary instance variable
straleyb Jan 29, 2020
fa429e7
Removing Rails 6.0 deprecation warning
jeremyf Jan 30, 2020
226e010
Merge pull request #4231 from samvera/removing-constraints-deprecatio…
straleyb Jan 30, 2020
1d5c2f2
Merge pull request #4215 from samvera/valk-char
elrayle Jan 30, 2020
c90215c
Valkyrize inherit_permissions_job
bkeese Jan 28, 2020
8a597a2
Support custom URI use/type in `FileSetDescription`
Jan 30, 2020
abe5383
Merge pull request #4235 from samvera/use-uri-use
jeremyf Jan 31, 2020
279ccb8
Ensuring CSV usage has same namespace consideration
jeremyf Jan 31, 2020
da3b463
Merge pull request #4238 from samvera/using-same-namespace-convention…
jeremyf Jan 31, 2020
6fc0b59
Merge pull request #4227 from samvera/inherit_permissions_job
jeremyf Jan 31, 2020
0d098b6
indexer setup, including fixes / tests for activefedoraconverter and …
Jan 25, 2020
e122dd0
active_fedora_converter etc. changes
Jan 25, 2020
6565c88
remove additional test from valkyrie_indexer_spec.rb - this has moved…
Jan 26, 2020
12fa4d0
Set up ValkyrieWorkIndexer in initializer to register indexer properly
straleyb Jan 31, 2020
09fe003
Merge branch 'valkyrie-indexer' of github.com:samvera/hyrax into valk…
straleyb Jan 31, 2020
2 changes: 1 addition & 1 deletion .regen
@@ -1 +1 @@
4
5
32 changes: 18 additions & 14 deletions app/actors/hyrax/actors/file_actor.rb
@@ -77,7 +77,7 @@ def perform_ingest_file_through_active_fedora(io)
def perform_ingest_file_through_valkyrie(io)
# Skip versioning because versions will be minted by VersionCommitter as necessary during save_characterize_and_record_committer.
unsaved_file_metadata = io.to_file_metadata
unsaved_file_metadata.use = relation
unsaved_file_metadata.type = [relation]
begin
saved_file_metadata = file_metadata_builder.create(io_wrapper: io, file_metadata: unsaved_file_metadata, file_set: file_set)
rescue StandardError => e # Handle error persisting file metadata
@@ -100,24 +100,28 @@ def normalize_relation(relation)
end

def normalize_relation_for_active_fedora(relation)
return relation if relation.is_a? Symbol
return relation.to_sym if relation.respond_to? :to_sym

# TODO: whereever these are set, they should use Valkyrie::Vocab::PCDMUse... making the casecmp unnecessary
return :original_file if relation.to_s.casecmp(Valkyrie::Vocab::PCDMUse.original_file.to_s)
return :extracted_file if relation.to_s.casecmp(Valkyrie::Vocab::PCDMUse.extracted_file.to_s)
return :thumbnail_file if relation.to_s.casecmp(Valkyrie::Vocab::PCDMUse.thumbnail_file.to_s)
:original_file
case relation
when Hyrax::FileMetadata::Use::ORIGINAL_FILE
:original_file
when Hyrax::FileMetadata::Use::EXTRACTED_TEXT
:extracted_file
when Hyrax::FileMetadata::Use::THUMBNAIL
:thumbnail_file
else
:original_file
end
end

##
# @return [RDF::URI]
def normalize_relation_for_valkyrie(relation)
# TODO: When this is fully switched to valkyrie, this should probably be removed and relation should always be passed
# in as a valid URI already set to the file's use
relation = relation.to_s.to_sym
return Valkyrie::Vocab::PCDMUse.original_file if relation == :original_file
return Valkyrie::Vocab::PCDMUse.extracted_file if relation == :extracted_file
return Valkyrie::Vocab::PCDMUse.thumbnail_file if relation == :thumbnail_file
Valkyrie::Vocab::PCDMUse.original_file
return relation if relation.is_a?(RDF::URI)

Hyrax::FileMetadata::Use.uri_for(use: relation.to_sym)
rescue ArgumentError
Hyrax::FileMetadata::Use::ORIGINAL_FILE
end
end
end
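The two normalizers in this file translate between ActiveFedora association names and PCDM use URIs. A minimal sketch of that round trip follows. It assumes a stand-in `PcdmUri` struct in place of `RDF::URI` and a local `Use` module in place of `Hyrax::FileMetadata::Use`; both are hypothetical, chosen so the snippet runs without Hyrax or Valkyrie loaded.

```ruby
# Hypothetical stand-in for RDF::URI; like RDF::URI it does not respond to #to_sym.
PcdmUri = Struct.new(:value)

# Hypothetical stand-in for Hyrax::FileMetadata::Use.
module Use
  ORIGINAL_FILE  = PcdmUri.new('http://pcdm.org/use#OriginalFile')
  EXTRACTED_TEXT = PcdmUri.new('http://pcdm.org/use#ExtractedText')
  THUMBNAIL      = PcdmUri.new('http://pcdm.org/use#ThumbnailImage')

  # Maps an association name to its URI; raises ArgumentError for unknown names.
  def self.uri_for(use:)
    case use
    when PcdmUri         then use
    when :original_file  then ORIGINAL_FILE
    when :extracted_file then EXTRACTED_TEXT
    when :thumbnail_file then THUMBNAIL
    else raise ArgumentError, "No PCDM use is recognized for #{use}"
    end
  end
end

# Symbol or String in, URI out; unknown names fall back to the original file use.
def normalize_relation_for_valkyrie(relation)
  return relation if relation.is_a?(PcdmUri)

  Use.uri_for(use: relation.to_sym)
rescue ArgumentError
  Use::ORIGINAL_FILE
end

# URI in, association name out; anything symbol-like is converted directly.
def normalize_relation_for_active_fedora(relation)
  return relation.to_sym if relation.respond_to?(:to_sym)

  case relation
  when Use::EXTRACTED_TEXT then :extracted_file
  when Use::THUMBNAIL      then :thumbnail_file
  else :original_file
  end
end
```

In this sketch a custom URI passes through the Valkyrie normalizer untouched, while the ActiveFedora normalizer collapses it to `:original_file` because it matches none of the three known uses.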
@@ -27,7 +27,7 @@ def update
filter_docs_with_edit_access!
copy_visibility = []
copy_visibility = params[:embargoes].values.map { |h| h[:copy_visibility] } if params[:embargoes]
af_objects = Hyrax.query_service.custom_queries.find_many_by_alternate_ids(alternate_ids: batch, use_valkyrie: false)
af_objects = Hyrax.custom_queries.find_many_by_alternate_ids(alternate_ids: batch, use_valkyrie: false)
af_objects.each do |curation_concern|
Hyrax::Actors::EmbargoActor.new(curation_concern).destroy
# if the concern is a FileSet, set its visibility and visibility propagation
@@ -26,7 +26,7 @@ def update
filter_docs_with_edit_access!
copy_visibility = []
copy_visibility = params[:leases].values.map { |h| h[:copy_visibility] } if params[:leases]
af_objects = Hyrax.query_service.custom_queries.find_many_by_alternate_ids(alternate_ids: batch, use_valkyrie: false)
af_objects = Hyrax.custom_queries.find_many_by_alternate_ids(alternate_ids: batch, use_valkyrie: false)
af_objects.each do |curation_concern|
Hyrax::Actors::LeaseActor.new(curation_concern).destroy
Hyrax::VisibilityPropagator.for(source: curation_concern).propagate if
2 changes: 1 addition & 1 deletion app/controllers/hyrax/batch_edits_controller.rb
@@ -50,7 +50,7 @@ def update_document(obj)
obj.attributes = work_params(admin_set_id: obj.admin_set_id).except(*visibility_params)
obj.date_modified = Time.current.ctime

InheritPermissionsJob.perform_now(obj)
InheritPermissionsJob.perform_now(obj, use_valkyrie: false)
VisibilityCopyJob.perform_now(obj)

obj.save
20 changes: 20 additions & 0 deletions app/indexers/hyrax/valkyrie_collection_indexer.rb
@@ -0,0 +1,20 @@
# frozen_string_literal: true

module Hyrax
##
# Indexes properties common to PCDM Collections
class ValkyrieCollectionIndexer < Hyrax::ValkyrieIndexer
Hyrax::ValkyrieIndexer.register self, as_indexer_for: Hyrax::PcdmCollection

include Hyrax::ResourceIndexer
include Hyrax::PermissionIndexer
include Hyrax::VisibilityIndexer

def to_solr
super.tap do |index_document|
index_document[:generic_type_sim] = ['Collection']
index_document[:thumbnail_path_ss] = Hyrax::CollectionThumbnailPathService.call(resource)
end
end
end
end
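The `super.tap` pattern in `to_solr` lets each layer add fields to the document built by the layers beneath it. A minimal sketch of how the calls chain through a mixin and a subclass is given here. `SketchBaseIndexer`, `SketchVisibilityFields`, `SketchBookIndexer` and `SketchBook` are hypothetical names invented for illustration; none of them is part of Hyrax.

```ruby
# Hypothetical base class: builds the starting index document.
class SketchBaseIndexer
  attr_reader :resource

  def initialize(resource:)
    @resource = resource
  end

  def to_solr
    { id: resource.id.to_s }
  end
end

# Hypothetical mixin: adds one field on top of whatever `super` returns.
module SketchVisibilityFields
  def to_solr
    super.tap { |index_document| index_document[:visibility_ssi] = resource.visibility }
  end
end

# Subclass: runs last, so it sees every field added below it.
class SketchBookIndexer < SketchBaseIndexer
  include SketchVisibilityFields

  def to_solr
    super.tap do |index_document|
      index_document[:author_si] = resource.author
    end
  end
end

SketchBook = Struct.new(:id, :visibility, :author, keyword_init: true)
book = SketchBook.new(id: 'b1', visibility: 'open', author: 'Octavia Butler')
doc = SketchBookIndexer.new(resource: book).to_solr
```

Ruby resolves `super` along the ancestor chain (subclass, then mixin, then base), so the base hash is created once and each layer mutates and returns that same object.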
12 changes: 6 additions & 6 deletions app/indexers/hyrax/valkyrie_indexer.rb
@@ -14,7 +14,7 @@ module Hyrax
# Custom indexers inheriting from others are responsible for providing a full
# index hash. A common pattern for doing this is to employ method composition
# to retrieve the parent's data, then modify it:
# `def to_solr; super.tap { |index_hash| transform(index_hash) }; end`.
# `def to_solr; super.tap { |index_document| transform(index_document) }; end`.
# This technique creates infinitely composable index building behavior, with
# indexers that can always see the state of the resource and the full current
# index document.
@@ -24,9 +24,9 @@ module Hyrax
# @example defining a custom indexer with composition
# class MyIndexer < ValkyrieIndexer
# def to_solr
# super.tap do |index_hash|
# index_hash[:my_field_tesim] = resource.my_field.map(&:to_s)
# index_hash[:other_field_ssim] = resource.other_field
# super.tap do |index_document|
# index_document[:my_field_tesim] = resource.my_field.map(&:to_s)
# index_document[:other_field_ssim] = resource.other_field
# end
# end
# end
@@ -40,8 +40,8 @@ module Hyrax
# Hyrax::ValkyrieIndexer.register self, as_indexer_for: Book
#
# def to_solr
# super.tap do |index_hash|
# index_hash[:author_si] = resource.author
# super.tap do |index_document|
# index_document[:author_si] = resource.author
# end
# end
# end
2 changes: 2 additions & 0 deletions app/indexers/hyrax/valkyrie_work_indexer.rb
@@ -4,11 +4,13 @@ module Hyrax
##
# Indexes Hyrax::Work objects
class ValkyrieWorkIndexer < Hyrax::ValkyrieIndexer
# Registration needs to happen for each object that is generated through the system
Hyrax::ValkyrieIndexer.register self, as_indexer_for: Hyrax::Work

include Hyrax::ResourceIndexer
include Hyrax::PermissionIndexer
include Hyrax::VisibilityIndexer
include Hyrax::Indexer(:core_metadata)
include Hyrax::Indexer(:basic_metadata)
end
end
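Registration ties a resource class to the indexer that should handle it. One way such a registry could work is sketched here, assuming an exact-class lookup with a fallback to the base indexer. The lookup strategy and every name (`SketchIndexer`, `SketchWork`, `SketchPage`, `SketchWorkIndexer`) are assumptions for illustration, not a description of how `Hyrax::ValkyrieIndexer` is implemented.

```ruby
# Hypothetical base indexer holding a class-to-indexer registry.
class SketchIndexer
  REGISTRY = {}

  # Records which indexer class handles a given resource class.
  def self.register(indexer_class, as_indexer_for:)
    REGISTRY[as_indexer_for] = indexer_class
  end

  # Builds the registered indexer for the resource, or the base indexer if none.
  def self.for(resource:)
    REGISTRY.fetch(resource.class, SketchIndexer).new(resource: resource)
  end

  attr_reader :resource

  def initialize(resource:)
    @resource = resource
  end

  def to_solr
    { id: resource.id }
  end
end

SketchWork = Struct.new(:id, :title)
SketchPage = Struct.new(:id)

class SketchWorkIndexer < SketchIndexer
  # Registration runs when the class body is evaluated, i.e. when the file loads.
  SketchIndexer.register self, as_indexer_for: SketchWork

  def to_solr
    super.tap { |index_document| index_document[:title_tesim] = Array(resource.title) }
  end
end

work_doc = SketchIndexer.for(resource: SketchWork.new('w1', 'A Title')).to_solr
page_doc = SketchIndexer.for(resource: SketchPage.new('p1')).to_solr
```

Because the `register` call sits in the class body, nothing is registered until that file has been loaded, which is one reason to reference the indexer from an initializer.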
2 changes: 1 addition & 1 deletion app/jobs/concerns/hyrax/members_permission_job_behavior.rb
@@ -16,7 +16,7 @@ def file_set_ids(work)
when ActiveFedora::Base
::FileSet.search_with_conditions(id: work.member_ids).map(&:id)
when Valkyrie::Resource
Hyrax.query_service.custom_queries.find_child_fileset_ids(resource: work)
Hyrax.custom_queries.find_child_fileset_ids(resource: work)
end
end
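Several call sites in this PR move from `Hyrax.query_service.custom_queries` to the shorter `Hyrax.custom_queries`. A minimal sketch of that kind of module-level shortcut is shown here, on the assumption that it simply delegates to the query service. `SketchApp`, `SketchQueryService` and `SketchCustomQueries` are hypothetical stand-ins, and the hash-based resource is only for illustration.

```ruby
# Hypothetical container for custom query methods.
class SketchCustomQueries
  def find_child_fileset_ids(resource:)
    resource.fetch(:member_ids, [])
  end
end

# Hypothetical query service exposing its custom queries.
class SketchQueryService
  def custom_queries
    @custom_queries ||= SketchCustomQueries.new
  end
end

# Hypothetical stand-in for the top-level Hyrax module.
module SketchApp
  def self.query_service
    @query_service ||= SketchQueryService.new
  end

  # Shortcut: callers no longer need to reach through the query service.
  def self.custom_queries
    query_service.custom_queries
  end
end

ids = SketchApp.custom_queries.find_child_fileset_ids(resource: { member_ids: %w[fs1 fs2] })
```

With the shortcut in place, a later change to where custom queries live touches one method instead of every caller.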

74 changes: 62 additions & 12 deletions app/jobs/inherit_permissions_job.rb
@@ -4,21 +4,71 @@ class InheritPermissionsJob < Hyrax::ApplicationJob
# Perform the copy from the work to the contained filesets
#
# @param work containing access level and filesets
def perform(work)
work.file_sets.each do |file|
attribute_map = work.permissions.map(&:to_hash)
# @param use_valkyrie [Boolean] whether to use valkyrie support
def perform(work, use_valkyrie: Hyrax.config.use_valkyrie?)
if use_valkyrie
valkyrie_perform(work)
else
af_perform(work)
end
end

private

# Return array of hashes representing permissions without their :access_to objects
# @param permissions [Enumerable<Permission>]
# @return [Array<Hash>]
def permissions_map(permissions)
permissions.collect { |p| { agent: agent_object(p.agent), mode: p.mode } }
end

# Returns a list of member file_sets for a work
# @param work [Resource]
# @return [Array<Hyrax::FileSet>]
def file_sets_for(work)
Hyrax.query_service.custom_queries.find_child_filesets(resource: work)
end

# Converts string representation of Permission.agent to either User or Hyrax::Group
# @param agent [String]
# @return [User, Hyrax::Group]
def agent_object(agent)
return Hyrax::Group.new(agent.sub(Hyrax::Group.name_prefix, '')) if agent.starts_with?(Hyrax::Group.name_prefix)
User.find_by_user_key(agent)
end

# copy and removed access to the new access with the delete flag
file.permissions.map(&:to_hash).each do |perm|
unless attribute_map.include?(perm)
perm[:_destroy] = true
attribute_map << perm
# Perform the copy from the work to the contained filesets
#
# @param work containing access level and filesets
def af_perform(work)
attribute_map = work.permissions.map(&:to_hash)
work.file_sets.each do |file|
# flag file set permissions that are absent from the work for deletion
file.permissions.map(&:to_hash).each do |perm|
unless attribute_map.include?(perm)
perm[:_destroy] = true
attribute_map << perm
end
end
# apply the new and deleted attributes
file.permissions_attributes = attribute_map
file.save!
end
end

# apply the new and deleted attributes
file.permissions_attributes = attribute_map
file.save!
# Perform the copy from the work to the contained filesets
#
# @param work containing access level and filesets
def valkyrie_perform(work)
work_permissions = permissions_map(work.permission_manager.acl.permissions)
file_sets_for(work).each do |file|
file_acl = Hyrax::AccessControlList.new(resource: file)
file_permissions = permissions_map(file_acl.permissions)
# grant new work permissions to member file_sets
(work_permissions - file_permissions).each { |perm| file_acl.grant(perm[:mode]).to(perm[:agent]).save }
# remove permissions that are not on work from member file_sets
(file_permissions - work_permissions).each { |perm| file_acl.revoke(perm[:mode]).from(perm[:agent]).save }
file_acl.save
end
end
end
end
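The Valkyrie branch of the job synchronizes permissions with two set differences: grant what the work has and the file set lacks, then revoke what the file set has and the work lacks. A minimal sketch of that logic follows, assuming permissions are plain `{ agent:, mode: }` hashes and using a hypothetical in-memory `SketchAcl` in place of `Hyrax::AccessControlList`; persistence is left out entirely.

```ruby
# Hypothetical in-memory ACL; stores permissions as { agent:, mode: } hashes.
class SketchAcl
  attr_reader :permissions

  def initialize(permissions)
    @permissions = permissions.dup
  end

  def grant(perm)
    @permissions << perm unless @permissions.include?(perm)
    self
  end

  def revoke(perm)
    @permissions.delete(perm)
    self
  end
end

# Makes the file set's permissions match the work's permissions.
def sync_permissions(work_acl:, file_acl:)
  work_permissions = work_acl.permissions.dup
  file_permissions = file_acl.permissions.dup

  # Grant permissions present on the work but missing from the file set.
  (work_permissions - file_permissions).each { |perm| file_acl.grant(perm) }
  # Revoke permissions present on the file set but missing from the work.
  (file_permissions - work_permissions).each { |perm| file_acl.revoke(perm) }
  file_acl
end

work_acl = SketchAcl.new([{ agent: 'group/public', mode: :read }, { agent: 'alice', mode: :edit }])
file_acl = SketchAcl.new([{ agent: 'alice', mode: :edit }, { agent: 'bob', mode: :edit }])
sync_permissions(work_acl: work_acl, file_acl: file_acl)
```

Both differences are computed from snapshots taken before any change, so the grants cannot affect which permissions are revoked.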
2 changes: 1 addition & 1 deletion app/models/concerns/hyrax/collection_behavior.rb
@@ -55,7 +55,7 @@ def collection_type=(new_collection_type)
# add_member_objects using the member_of_collections relationship. Deprecate?
def add_members(new_member_ids)
return if new_member_ids.blank?
members << Hyrax.query_service.custom_queries.find_many_by_alternate_ids(alternate_ids: new_member_ids, use_valkyrie: false)
members << Hyrax.custom_queries.find_many_by_alternate_ids(alternate_ids: new_member_ids, use_valkyrie: false)
end

# Add member objects by adding this collection to the objects' member_of_collection association.
55 changes: 46 additions & 9 deletions app/models/hyrax/file_metadata.rb
@@ -2,15 +2,47 @@

module Hyrax
class FileMetadata < Valkyrie::Resource
GENERIC_MIME_TYPE = 'application/octet-stream'

##
# Constants for PCDM Use URIs; use these constants in place of hard-coded
# URIs in the `::Valkyrie::Vocab::PCDMUse` vocabulary.
module Use
ORIGINAL_FILE = ::Valkyrie::Vocab::PCDMUse.OriginalFile
EXTRACTED_TEXT = ::Valkyrie::Vocab::PCDMUse.ExtractedText
THUMBNAIL = ::Valkyrie::Vocab::PCDMUse.ThumbnailImage

##
# @param use [RDF::URI, Symbol]
#
# @return [RDF::URI]
# @raise [ArgumentError] if no use is known for the argument
def uri_for(use:)
case use
when RDF::URI
use
when :original_file
ORIGINAL_FILE
when :extracted_file
EXTRACTED_TEXT
when :thumbnail_file
THUMBNAIL
else
raise ArgumentError, "No PCDM use is recognized for #{use}"
end
end
module_function :uri_for
end

attribute :file_identifiers, ::Valkyrie::Types::Set # id of the file stored by the storage adapter
attribute :alternate_ids, Valkyrie::Types::Set.of(Valkyrie::Types::ID) # id of the Hydra::PCDM::File which holds metadata and the file in ActiveFedora
attribute :file_set_id, ::Valkyrie::Types::ID # id of parent file set resource

# all remaining attributes are on AF::File metadata_node unless otherwise noted
attribute :label, ::Valkyrie::Types::Set
attribute :original_filename, ::Valkyrie::Types::Set
attribute :mime_type, ::Valkyrie::Types::Set
attribute :use, ::Valkyrie::Types::Set # AF::File type
attribute :mime_type, ::Valkyrie::Types::String.default(GENERIC_MIME_TYPE)
attribute :type, ::Valkyrie::Types::Set.default([Use::ORIGINAL_FILE])
attribute :content, ::Valkyrie::Types::Set

# attributes set by fits
@@ -75,20 +107,25 @@ class FileMetadata < Valkyrie::Resource
def self.for(file:)
new(label: file.original_filename,
original_filename: file.original_filename,
mime_type: file.content_type,
use: file.try(:use) || [::Valkyrie::Vocab::PCDMUse.OriginalFile])
mime_type: file.content_type)
end

##
# @return [Boolean]
def original_file?
use.include?(::Valkyrie::Vocab::PCDMUse.OriginalFile)
type.include?(Use::ORIGINAL_FILE)
end

##
# @return [Boolean]
def thumbnail_file?
use.include?(::Valkyrie::Vocab::PCDMUse.ThumbnailImage)
type.include?(Use::THUMBNAIL)
end

##
# @return [Boolean]
def extracted_file?
use.include?(::Valkyrie::Vocab::PCDMUse.ExtractedImage)
type.include?(Use::EXTRACTED_TEXT)
end

def title
@@ -100,11 +137,11 @@ def download_id
end

def valid?
file.valid?(size: size.first, digests: { sha256: checksum.first.sha256 })
file.valid?(size: size.first, digests: { sha256: checksum&.first&.sha256 })
end

def file
::Valkyrie::StorageAdapter.find_by(id: file_identifiers.first)
Hyrax.storage_adapter.find_by(id: file_identifiers.first)
end
end
end
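The change to `valid?` swaps `checksum.first.sha256` for `checksum&.first&.sha256`. A small sketch of what the safe navigation operator does in each case is given here; `SketchChecksum` and `sha256_digest` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a checksum value exposing a sha256 digest.
SketchChecksum = Struct.new(:sha256)

# Returns the first checksum's sha256, or nil when the checksum is nil or empty.
def sha256_digest(checksum)
  checksum&.first&.sha256
end

no_checksum    = sha256_digest(nil)
empty_checksum = sha256_digest([])
real_checksum  = sha256_digest([SketchChecksum.new('abc123')])
```

This only shows that the lookup no longer raises `NoMethodError` on a missing checksum. How a storage adapter treats a `nil` digest is outside the sketch.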
2 changes: 1 addition & 1 deletion app/models/job_io_wrapper.rb
@@ -80,7 +80,7 @@ def to_file_metadata
Hyrax::FileMetadata.new(label: original_name,
original_filename: original_name,
mime_type: mime_type,
use: [Valkyrie::Vocab::PCDMUse.OriginalFile])
use: [Hyrax::FileMetadata::Use::ORIGINAL_FILE])
end

# The magic that switches *once* between local filepath and CarrierWave file
17 changes: 8 additions & 9 deletions app/search_builders/hyrax/collection_member_search_builder.rb
@@ -26,15 +26,14 @@ def member_of_collection(solr_parameters)
solr_parameters[:fq] << "#{collection_membership_field}:#{collection.id}"
end

# This overrides the models in FilterByType
def models
case search_includes_models
when :collections
collection_classes
when :works
work_classes
else super # super includes both works and collections
private

def only_works?
search_includes_models == :works
end

def only_collections?
search_includes_models == :collections
end
end
end
end
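The refactor replaces an overridden `models` method with two predicates. A minimal sketch of how a shared type filter could consult such predicates is shown here. It assumes the shared module builds `models` from `only_works?` and `only_collections?`; `SketchFilterByType`, `SketchMemberSearchBuilder` and the model names are hypothetical and do not reproduce `Hyrax::FilterByType`.

```ruby
# Hypothetical stand-in for the shared type filter.
module SketchFilterByType
  # Works plus collections, unless a predicate narrows the search.
  def models
    work_classes + collection_classes
  end

  def work_classes
    only_collections? ? [] : %w[GenericWork]
  end

  def collection_classes
    only_works? ? [] : %w[Collection]
  end

  # Defaults: no narrowing.
  def only_works?
    false
  end

  def only_collections?
    false
  end
end

# Hypothetical builder: overrides only the predicates, never `models`.
class SketchMemberSearchBuilder
  include SketchFilterByType

  def initialize(search_includes_models: :both)
    @search_includes_models = search_includes_models
  end

  private

  attr_reader :search_includes_models

  def only_works?
    search_includes_models == :works
  end

  def only_collections?
    search_includes_models == :collections
  end
end
```

The builder states what it wants to include and leaves the construction of the model list to the shared module, so the two cannot drift apart.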