Content model objects in Islandora/F4 #31

Closed
mjordan opened this issue Apr 2, 2015 · 19 comments

Comments

@mjordan
Contributor

mjordan commented Apr 2, 2015

This is a followup to a discussion started during the March 27 meeting of the Fedora 4 Interest Group.

During the call, we discussed the impact of not using content model objects (as Islandora currently uses) but instead expressing collections as RDF properties on member objects. Some of the implications we identified are:

  • There would be no more islandora:root collection
  • Some people are using these to create inherited properties in new content-models
  • Content-models act as useful examples of the content type
  • How do we express collection policies (e.g., allowed MIME types, or sub-content models such as the current Paged Content within Books)?

Topic for discussion: is it preferable to model content models as full Fedora 4 objects, or to express content models as simple RDF string properties? Examples of implementations to illustrate your arguments would be useful.

@DiegoPino
Contributor

Sorry, I could not attend that meeting, but I'm not sure that using arbitrary values in, e.g., rdf:type to denote membership in a cmodel would be the best idea in terms of how we traverse our relations tree (graph) in the future and how we make our objects and relations discoverable for linked-data-compliant applications.
Some quick questions first:

"but instead expressing collections as RDF properties on member objects." Do you mean expressing cmodel membership/classifications as RDF properties? I see rdf:type used for this in Nick's diagram.

Looking at the PCDM, there are some basic definitions we could use and extend.

  • There is a definition for a collection object (pcdm:Collection) and a basic object (pcdm:Object), both subclasses of ore:Aggregation. Using the same ontology, cmodels could be described by subclassing pcdm:Collection or pcdm:Object (depending on how we want to model our repo).
    • If subclassing pcdm:Collection: let's name this subclass isla:CModel. We give this new class (a sub-subclass) some new RDF properties, including some enumerations, domains, or even some owl:Restriction to limit what this class can handle or relate to, in quantity and quality. Then we make every new object that should be part of this cmodel a member of it (by inventing a new property or reusing an existing one from pcdm:Collection, which is the nice thing about subclassing).
    • Even better (in my limited perspective as I write): subclass pcdm:Object, and again name it isla:CModel. Our CMODEL instances are then just ordinary objects with some additional properties. We can even add files to them, along with properties specific to our CMODEL definition (enumerations), restrictions, etc. New objects could be related to an instance (object) of isla:CModel or just be instances of it, depending on how complex we want this to be. We also maintain the basic Fedora 4 hierarchy. This would also allow us to make generic CMODELs and subclass again when needed. (Let's say an Image cmodel class covers 90% of a Large Image CMODEL; to define the latter we subclass the former and add only the new properties, without redefining everything.)

In both cases, rdf:type still applies, but to a real, ontology-defined class. I thought that was the idea behind PCDM: extend it to suit our needs. I think CMODELs play an important role for developers (users don't even know what's behind them) and also for making objects portable. If they are not present, everything is tied to our frontend (with its benefits and drawbacks).
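
To make the subclassing idea concrete, here is a minimal sketch, assuming content models are expressed as rdfs:subClassOf chains under pcdm:Object. The names isla:LargeImageObject and ex:photo1 are hypothetical, the isla: namespace is not a published ontology, and a real implementation would query a triple store or use an RDF library rather than a Python set.

```python
# Hypothetical triples for illustration only; the isla: and ex: names are assumptions.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

TRIPLES = {
    ("pcdm:Object", SUBCLASS_OF, "ore:Aggregation"),
    ("isla:CModel", SUBCLASS_OF, "pcdm:Object"),
    ("isla:ImageObject", SUBCLASS_OF, "isla:CModel"),
    ("isla:LargeImageObject", SUBCLASS_OF, "isla:ImageObject"),
    ("ex:photo1", RDF_TYPE, "isla:LargeImageObject"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(o for s, p, o in triples if s == current and p == SUBCLASS_OF)
    return seen


def inferred_types(resource, triples):
    """Return the asserted rdf:type values of resource plus all their superclasses."""
    types = set()
    for s, p, o in triples:
        if s == resource and p == RDF_TYPE:
            types |= superclasses(o, triples)
    return types
```

Under these assumptions, an object typed only as the most specific class is still discoverable as a pcdm:Object, which is the portability argument made above.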

@mjordan
Contributor Author

mjordan commented Apr 2, 2015

@DiegoPino, in response to your question

"but instead expressing collections as RDF properties on member objects." Do you mean expressing cmodel membership/classifications as RDF properties? I see rdf:type used for this in Nick's diagram.

Yes, content models, not collections, sorry about that. I've edited the original post to reduce further confusion.

If I recall the discussion at the meeting, the proposal was to not represent content models as Fedora objects, but to define a content model as a simple string literal property of an Islandora object. If a content model were only identified by a literal string, it could not have properties of its own. @ruebot's example diagram uses the 'hasParent' predicate, but now I see at https://github.com/Islandora-Labs/islandora/blob/7.x-2.x/docs/technical-documentation/migration.md that the proposed replacement is now 'rdf:type'. Perhaps he can elaborate on that choice.

You make a strong argument for retaining content models as objects, especially as sub-classable objects. But subclassing content models begs the question of modeling them as objects in the first place, e.g., do we have an implementation mechanism for extending or refining parent model classes, as in your image/large image example?

@ruebot
Member

ruebot commented Apr 2, 2015

Tagging @rosiel since she had some good thoughts and opinions on the call too.

islandora:root -- It looks like we should preserve this as Danny, Jared, and I continue working through the migration-utils. Rationale is the fcrepo4 tree structure. It is more efficient as a deep tree instead of wide tree. So, we're thinking of proposing an "islandora:root" for each Islandora (thinking of the multi-site use case).

As for content models, "proposal was to not represent content models as Fedora objects, but to define a content model as a simple string literal property of an Islandora object": that was just my initial inclination. But, after our discussion, I'm not sure this is the best way forward, since we need to clear up the confusion around content models. That said, would one of y'all be willing to lay all this out in a use case? Then, I'd be more than happy to work on updating the basic model I have started.

@mjordan
Contributor Author

mjordan commented Apr 2, 2015

@ruebot Does retaining islandora:root allow for content in the repo that is not a descendant of that collection (i.e., not a member of any Islandora collection)? That was a use case I brought up during the call.

@DiegoPino
Contributor

I would love to give this a try next week, but I still have a question about where and when we are enforcing/checking/applying our ontology (the base one or an extended one); maybe inside the triple store? This is a main topic in my opinion because, as I wrote, it's possible to define every CMODEL as just a subclass definition (with its own particularities) inside an ontology (described in OWL). This way an Image object could be an instance of a specific subclass of pcdm:Object, e.g. isla:ImageObject. But if we leave the ontology just as a reference, then we can't enforce these properties and must hardcode the structure/properties/restrictions of an isla:ImageObject on whichever side we choose, or do as we do now and create an object of type cmodel that glues everything together.

Basically, my idea is to extend the PCDM ontology to something more particular. Every "solution pack" could then add a chunk of ontology to this base Islandora-extended one, describing what this new subclass of objects should be like.

About islandora:root: I agree. Fedora 4 allows multiple types of relations, so it has its base "tree" but also the multidimensional, flexible graph formed by relations between objects (I hope so, or I'm doomed!).

@ruebot
Member

ruebot commented Apr 2, 2015

@mjordan I'm not sure what you mean. Do you mean in the Islandora context? Or, just throwing whatever you want to throw in fcrepo4?

@mjordan
Contributor Author

mjordan commented Apr 2, 2015

@ruebot Throwing whatever I want into the same instance of fcrepo as the one that powers my Islandora sites.

@ruebot
Member

ruebot commented Apr 2, 2015

@mjordan I don't see why not. That is at the fcrepo4 level, not at the Islandora level. You can do that now with fcrepo3 and Islandora.

@mjordan
Contributor Author

mjordan commented Apr 2, 2015

@ruebot thought so but we did discuss this during the call, just confirming.

@DiegoPino
Contributor

@mjordan, what @ruebot says is true. Even when islandora:root is in place, how your objects interrelate outside this functional definition is up to you.

@mjordan
Contributor Author

mjordan commented Apr 2, 2015

@DiegoPino WRT your main topic, that's what I was wondering about when I said "implementation mechanism for extending or refining..." Inheritance, subtyping, and object chaining would be very useful. Just thinking out loud, but since Islandora is implemented in PHP (because of Drupal), if we defined Islandora objects as PHP classes, could we map PHP's OO implementation to Islandora objects' implementation? Content-model-oriented solution packs would then not use XML to define content models; they'd use PHP classes. Kinda getting off topic here...

@DiegoPino
Contributor

@mjordan Not sure how to implement/describe this in code, but it sounds nice! The logic behind ontologies (OWL/RDF) is 100% class oriented, pure sweet object-class theory, but ontologies describe some hard-to-process definitions (they need a rules system plus a reasoner implementation), like restrictions, domains, etc. But I'm still thinking about this implementation: describing CMODELs as OWL/RDF and leaving the logic of interpreting this definition to PHP (or Camel? Ha!). Think of a CMODEL as a traversable OWL/RDF structure, a graph with some conditions that can be fetched and applied when building a new object or an interface in Drupal. I'm already doing this on Fedora 3 for some pretty complex stuff, but it's a lot of code that I'm sure nobody (except me) will be happy to maintain. But speaking for this model/approach: our structure wouldn't be obscure to other systems anymore, and subclassing is simple: just add new OWL files somewhere (still thinking about where). So we could define (just) a way of understanding OWL in PHP (object oriented) and then build our local magic using Drupal.
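
As a rough illustration of "a graph with some conditions that can be fetched and applied when building a new object", here is a minimal sketch in Python (the thread proposes PHP; Python is used only for brevity). It assumes the restrictions have already been extracted from OWL/RDF into plain dictionaries. The dictionary layout, the key names, the MIME types, and isla:LargeImageObject are all hypothetical.

```python
# Hypothetical, pre-extracted cmodel definitions; a real system would load
# these from OWL/RDF rather than hardcode them.
CMODELS = {
    "isla:ImageObject": {
        "parent": None,
        "allowed_mime_types": {"image/jpeg", "image/png"},
        "required_properties": {"dc:title"},
    },
    "isla:LargeImageObject": {
        "parent": "isla:ImageObject",
        "allowed_mime_types": {"image/tiff"},
        "required_properties": set(),
    },
}


def resolve(cmodel, definitions):
    """Merge a cmodel's restrictions with those inherited from its ancestors."""
    merged = {"allowed_mime_types": set(), "required_properties": set()}
    current = cmodel
    while current is not None:
        definition = definitions[current]
        for key in merged:
            merged[key] |= definition[key]
        current = definition["parent"]
    return merged


def validate(obj, cmodel, definitions):
    """Return a list of problems; an empty list means the object conforms."""
    rules = resolve(cmodel, definitions)
    problems = []
    if obj["mime_type"] not in rules["allowed_mime_types"]:
        problems.append("mime type not allowed: " + obj["mime_type"])
    for prop in sorted(rules["required_properties"] - set(obj["properties"])):
        problems.append("missing property: " + prop)
    return problems
```

In this sketch the subclass only declares what it adds, and everything else is inherited at resolve time, which matches the "subclass and don't redefine everything" idea from earlier in the thread.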

@rosiel
Member

rosiel commented Apr 2, 2015

@DiegoPino That's exactly what I have in mind, though I haven't implemented it yet. I love the idea of OWL ontologies for relationships between CMODELs, because they're so powerful and can define possible relationships (with those domains/ranges you mentioned). However, in my use case the ontology I'm interested in is a generic ontology (i.e. it defines concepts like "Image", "Performance", and "Expression"), so it doesn't fully describe the kinds of data objects that I'll be using. So I'll need custom classes/cmodels that inherit from these generic classes (maybe defined in a custom OWL ontology, put... somewhere?). I want my "content classes" to have their own metadata requirements/interfaces and display code. Is that still the plan for F4?

@DiegoPino
Contributor

@rosiel, your needs are, in my opinion, a perfect use case for overlapping and/or subclassed ontologies. The big question is how we extend our base to have a base definition for our data, and where/how we allow such alternative ways of visualising/relating the same data to happen. If we use PCDM as the base, as we are planning to do, with every object being a subclass of ore:Aggregation, then this is a must-read: http://www.openarchives.org/ore/1.0/datamodel#ReM-to-aggr.

I'm still trying to fully understand how the "resource map" comes into play in F4: is it hardcoded or just a theoretical concept? (Help!) I can also fully imagine a case where we allow multiple ontologies to exist. So, on the official Islandora side, we start by extending our official PCDM by deriving classes and adding simple ontologies that extend the CMODEL concept, but we also allow objects to be "classified" inside a different semantic world with a whole different definition that at some point (like a directed property) converges on the base one. The "put... somewhere" question is my concern too, as is how we process/query these. Your use case could also be modelled using PCDM, derived from the all-permitting and flexible pcdm:Object definition by describing a very generic cmodel and then subclassing again. Having RDF and OWL statements gives us that flexibility. Mmm, maybe we should start drawing some diagrams...

@whikloj
Member

whikloj commented Apr 3, 2015

I may have missed the point (it happens sometimes), but my understanding was that we would not be getting rid of the core content models, but perhaps we wouldn't have to install them into every single Islandora repository.

This would require some different thinking to maintain these objects in a central place, but for those of us that don't alter the core content models it makes perfect sense (especially if they are not referenced continuously). It would also not stop anyone from extending an existing Islandora content model for their own needs.

@DiegoPino
Contributor

I'm a bit confused and still thinking a lot about what would be the best way to implement this, in terms of:

  • Less code
  • Flexibility
  • Usability
  • Optimal runtime

The problem (in my humble opinion) with having the CMODEL definitions in some central place, @whikloj, is that we will somehow have to ask for a definition whenever someone needs to ingest an object, assuming we are enforcing that definition. We could also preprocess that info once and store it in some kind of Drupal structure (cache, DB, whatever), but we would still need "someone" to serve the data for us at least once. That may mean many requests, which means resources, uptime, etc., which means money.
What I would like to investigate is how we can store that info once (in the triplestore) in every repo on "solution pack install", if we still have that. OWL files are RDF and can be queried via SPARQL or directly using a nice PHP API like Graphite. Protege has a SPARQL for OWL interface that can be used to play a bit and run some tests. This could allow an intermediate solution: no need for "objects" that define CMODELs, but the definitions are still locally available. How is Hydra managing this?

Back to basics: I think we can discuss this and other options at the next meeting, even the possibility of going back to a simple enumerated data property.

@daniel-dgi
Contributor

We can always have centrally hosted versions but cache them locally and have cron check for changes once a day. That way your app doesn't break if the central server goes down. We should do the same for LOC stuff. Remember when the US Gov't shut down?
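
A minimal sketch of that cache-and-fall-back pattern, assuming the definitions are JSON-serialisable and that the caller supplies a `fetch` callable that raises `OSError` when the central server is unreachable. The function name, the cache file, and the JSON format are hypothetical. The sketch refreshes lazily on read; a daily cron job could call the same function to get the once-a-day check described above.

```python
import json
import time
from pathlib import Path

ONE_DAY = 24 * 60 * 60  # seconds


def load_definitions(cache_path, fetch, max_age=ONE_DAY, now=time.time):
    """Return cmodel definitions, refreshing the local copy at most once per max_age.

    fetch is any zero-argument callable that returns the definitions as a
    JSON-serialisable object and raises OSError when the central server is down.
    """
    cache_path = Path(cache_path)
    fresh = cache_path.exists() and now() - cache_path.stat().st_mtime < max_age
    if not fresh:
        try:
            definitions = fetch()
            cache_path.write_text(json.dumps(definitions))
            return definitions
        except OSError:
            if not cache_path.exists():
                raise  # nothing cached yet, so there is no fallback
    # Either the cache is still fresh, or the refresh failed and we fall back to it.
    return json.loads(cache_path.read_text())
```

With this shape, an outage of the central host only matters on the very first fetch; after that the site keeps working from its last good copy.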

@ruebot added the modeling label Apr 7, 2016
@ruebot
Member

ruebot commented Apr 7, 2016

See also #179

@dannylamb
Contributor

Closing old issues. Will be bringing up OWL ontologies and our usage of them in a newer ticket with an MVP context.


7 participants