[DOCS] Establish an ontology for maintaining URIs used in core Islandora vocabularies #2279

Open
mjordan opened this issue Jan 11, 2024 · 5 comments
Labels
Type: discussion Identifies a topic for conversation - may be similar to a question. Type: documentation provides documentation or asks for documentation.

Comments

@mjordan
Contributor

mjordan commented Jan 11, 2024

Sorry if this issue is not strictly "Documentation", but I couldn't think of a better issue category to choose.

In Islandora's core vocabularies, namely Islandora Media Use, Islandora Models, and Islandora Display, every term is assigned a unique URI in its "External URI" (field_external_uri) field. These URIs are used both within Islandora's code and by external tools and systems such as Islandora Workbench. Workbench can use a term's URI to look up the term's internal term ID, which varies across Drupal instances and cannot be queried consistently by external tools by any other means.
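As an illustration of the URI-to-term-ID lookup described above, here is a minimal sketch. It is not Workbench's actual implementation: the term lists, the term IDs, and the function name `term_id_for_uri` are hypothetical, and a real tool would fetch the terms from the Drupal instance rather than from in-memory lists.

```python
# Hypothetical term data for two Drupal instances. The internal term IDs
# ("tid") differ between sites, but the External URI values are shared.
PRESERVATION_URI = "http://pcdm.org/use#PreservationMasterFile"
ORIGINAL_URI = "http://pcdm.org/use#OriginalFile"

SITE_A_TERMS = [
    {"tid": 17, "name": "Original File", "uri": ORIGINAL_URI},
    {"tid": 18, "name": "Preservation Master File", "uri": PRESERVATION_URI},
]
SITE_B_TERMS = [
    {"tid": 203, "name": "Original File", "uri": ORIGINAL_URI},
    {"tid": 204, "name": "Preservation Master File", "uri": PRESERVATION_URI},
]


def term_id_for_uri(terms, uri):
    """Return the instance-specific term ID for a shared External URI."""
    index = {term["uri"]: term["tid"] for term in terms}
    if uri not in index:
        raise KeyError(f"No term with External URI {uri!r} on this instance")
    return index[uri]


# The same URI resolves to a different term ID on each instance.
site_a_tid = term_id_for_uri(SITE_A_TERMS, PRESERVATION_URI)
site_b_tid = term_id_for_uri(SITE_B_TERMS, PRESERVATION_URI)
```

The point of the sketch is that the caller only needs to know the shared URI; the instance-specific term ID stays an internal detail of each site.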

I would like to propose discussion around a community process for maintaining these URIs, for example for adding new terms and their accompanying URIs as new use cases and capabilities arise, and deprecating URIs that are no longer useful to us or that lose traction within the larger GLAM community. For example, this thread in Slack indicates that we do not have a universally accepted term URI for hOCR media in the Islandora Media Use vocabulary.

If we allow URIs for terms that are actionable (which includes all of the terms in the three vocabularies I name above) to diverge across Islandora instances, we will lose the value of shared Drupal code, shared external tools such as Workbench (and others that do not yet exist, such as reporting tools), cross-repository aggregation, and leveraging Linked Data.

Finally, I will point out that in Islandora 7, we maintained https://github.com/Islandora/islandora_ontology. In that context, most of the standardization in URIs (or lack of standardization) was hidden in RELS-* datastreams and defined by core solution packs. Because Islandora 2 replaces RELS-* with standard Drupal taxonomies, we need to mitigate drift in the low-level plumbing at the core of Islandora by ensuring that we use shared URIs for actionable vocabularies.

@mjordan mjordan added Type: documentation provides documentation or asks for documentation. Type: discussion Identifies a topic for conversation - may be similar to a question. labels Jan 11, 2024
@rosiel
Member

rosiel commented Jan 11, 2024

We've done a lot of work making Islandora work with whatever URIs you want. That is, the URIs are not in code, they are in config, and the resulting behaviour is also determined by config.

Are you proposing that we start to hardcode, in workbench or other places, behaviours that are currently configurable so that they are no longer configurable?

@mjordan
Contributor Author

mjordan commented Jan 11, 2024

No, I am not proposing hardcoding anything. I am proposing that we have an open process for adopting (and documenting) new default URIs when needed so we can choose URIs that will work in a wide variety of contexts.

Within a specific Drupal instance, configuring a local or preferred URI is harmless. If a site admin wants to deviate from the default URI for some reason when the term means exactly the same thing to site A as it does to site B, that's on them (although I'd be interested in hearing the justification for someone changing a default URI for a term provided that the URI wasn't overtly offensive in some way).

Workbench can be configured (in most cases; I'd have to check to see if/where any have been literally hardcoded) to use non-default URIs for terms, but Workbench is a tool we have complete control over. The real benefit of standardized URIs comes into play with external tools that we have little or no control over, and in repository aggregation, where the aggregators can assume that all (non-locally modified) Islandora instances use a given URI for a given term. I don't understand IIIF well enough to provide a specific example, but I can imagine a researcher wanting to use a standard IIIF viewer to compare two versions of the same page, one of which is hosted at an Islandora repository. I doubt the researcher is going to bother to ask the Islandora repo's manager what the URI for the Media Use "transcript" term is.

Related to aggregation is Islandora repositories participating in the broader Linked Data world. But, it's totally possible that literally nobody really cares about that.

Edit: added "and documenting" to my second sentence.

@ruebot
Member

ruebot commented Jan 11, 2024

There is this related issue too #1318

@whikloj
Member

whikloj commented Jan 22, 2024

IMHO.

  1. Maintaining the ontology for any URIs generated by the Islandora community (specifically any with an islandora.ca domain) is a must, as most of them are (as Mark mentions) in the "External URI" field. So they should be resolvable URIs.

  2. The idea of adding and deprecating URIs seems a little odd to me. Mostly I'm unclear on how "we" (being the community) decide which URIs "...lose traction within the larger GLAM community." But as long as we are just deprecating but continuing to make the URI resolvable, then this is more of a "why" than a "how" question. So it can be an evolving process.

  3. However, all code can only use these as default values and (I would suggest) in most cases (i.e. for a new module installed in an existing setup) we should look at existing configurations to find the appropriate values.

    In other words, no one should be expecting a URI defined in the Islandora ontology to exist anywhere in a repository but instead should be querying the appropriate configuration entities and/or requiring the setting of configuration for individual tools.

    This means you might have duplicate configuration settings but also allows for that separation and flexibility already built in.
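The "defaults only, configuration first" approach in point 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not existing Islandora or Workbench code: the setting key `preservation_master`, the `DEFAULT_URIS` table, and the function `resolve_term_uri` are all hypothetical, and the `example.org` URI is a placeholder for a locally chosen value.

```python
# Hypothetical table of community-maintained default URIs.
DEFAULT_URIS = {
    "preservation_master": "http://pcdm.org/use#PreservationMasterFile",
}


def resolve_term_uri(key, site_config, defaults=DEFAULT_URIS):
    """Prefer the URI set in the site's configuration; fall back to the default.

    `site_config` stands in for whatever configuration a given tool or
    module reads (for example, values exported from Drupal config entities).
    """
    configured = site_config.get(key)
    if configured:
        return configured
    if key in defaults:
        return defaults[key]
    raise KeyError(f"No configured or default URI for {key!r}")


# A site that keeps the default, and a site that overrides it locally.
stock_site = {}
custom_site = {"preservation_master": "https://example.org/use#PreservationFile"}

stock_uri = resolve_term_uri("preservation_master", stock_site)
custom_uri = resolve_term_uri("preservation_master", custom_site)
```

Here the shared default is only ever a fallback, so a site's existing configuration always wins, which matches the separation and flexibility described above.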

These opinions are my own and only worth the paper they were written on. What's paper you ask? Oh children gather around and let me tell you of the joys of tractor-feed perforations.

@mjordan
Contributor Author

mjordan commented Jan 24, 2024

@whikloj a plausible example of a term that is likely to "lose traction" for EDI reasons (and that we currently use as a default in the Media Use vocabulary) is http://pcdm.org/use#PreservationMasterFile.
