draft a GenreFacetType ontology collecting most of PRONOM, NFO, UDFR #2
We should establish a set of initial "committers" for this PCDM Git-repository. Otherwise, what is the vetting approach for updates/additions such as this?
@awoods regarding committers for this, I'm inclined to cast a pretty wide net, at least initially. As for this particular contribution, I find it to be remarkably similar to an ontology that I maintain for our local repository https://github.com/AmherstCollege/acdc-ontology/blob/master/rdf/objectTypes.rdf (we use skos:mappingRelation instead of owl:sameAs, and we're more focused on Getty AAT, LOC, schema.org and opengraph). Either way, I'd be supportive of something like this, especially since I'd much rather rely on a community supported mapping than maintain our own. /cc @escowles
@acoburn interesting to see some little differences that I am guessing emerge from arriving from a PRONOM/digipres vector vs Getty/cataloguing one. Do you agree that there's value in preserving the narrower scope of this document? If skos relations are a better description of the lineage here, do you want to add a commit to that branch with the change? I'd like the commit history to give appropriate credit for authorship!
@barmintor I prefer skos, because it doesn't imply the strong assertions of identity that accompany owl:sameAs.
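To make the distinction concrete, here is a minimal RDF/XML sketch of the SKOS style for a single type. It is illustrative only: the `udfrs` namespace IRI and the `xml:base` value are assumptions and are not taken from the draft; the match target is one that appears in the draft. `skos:exactMatch` records a mapping between concepts in different schemes, while `owl:sameAs` would declare the two IRIs to be one and the same resource.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. The udfrs namespace IRI and xml:base are assumed values. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:udfrs="http://www.udfr.org/onto#"
         xml:base="http://pcdm.org/file-format-types">
  <udfrs:GenreFacetType rdf:ID="Dataset">
    <rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
    <!-- SKOS mapping: the two concepts can be used interchangeably,
         but no identity between the resources is asserted. -->
    <skos:exactMatch rdf:resource="http://www.udfr.org/onto#DatasetGenre"/>
    <!-- The owl:sameAs alternative would state that both IRIs name one
         resource, so every statement about either applies to both. -->
  </udfrs:GenreFacetType>
</rdf:RDF>
```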
On a side note, what do you think about a "call for committers" message to the lists? Maybe taking nominations for 5 people?
That sounds like a great idea!
This draft should also include:
|
<rdfs:subClassOf rdf:resource="Document" />
<skos:exactMatch rdf:resource="http://reference.data.gov.uk/technical-registry/Dataset" />
<skos:exactMatch rdf:resource="http://www.udfr.org/onto#DatasetGenre" />
<skos:exactMatch rdf:resource="http://purl.org/dc/elements/1.1/Dataset" />
Should be http://purl.org/dc/dcmitype/Dataset, right? Same for other DCMI Type terms.
Correct -- those should be DCMI Types
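For reference, a sketch of the corrected mapping, using the DCMI Type Vocabulary namespace (`http://purl.org/dc/dcmitype/`) in place of the DC Elements namespace. The `rdf:ID` value, the `udfrs` namespace IRI, and the `xml:base` are assumptions; the other match targets are copied from the hunk under review.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. rdf:ID, the udfrs namespace IRI, and xml:base are assumed. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:udfrs="http://www.udfr.org/onto#"
         xml:base="http://pcdm.org/file-format-types">
  <udfrs:GenreFacetType rdf:ID="Dataset">
    <skos:exactMatch rdf:resource="http://reference.data.gov.uk/technical-registry/Dataset"/>
    <skos:exactMatch rdf:resource="http://www.udfr.org/onto#DatasetGenre"/>
    <!-- DCMI Type term, replacing http://purl.org/dc/elements/1.1/Dataset -->
    <skos:exactMatch rdf:resource="http://purl.org/dc/dcmitype/Dataset"/>
  </udfrs:GenreFacetType>
</rdf:RDF>
```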
Can we keep contributions separate from the core in a different directory structure somehow?
such as a
That sounds good to me. My concern is confusing adopters with what counts as PCDM (which at the moment is not very much) and what are suggested extensions that could be adopted or ignored without peril.
👍 to a pcdm-ext directory to hold non-core stuff. I agree it makes sense to make it clear what's core PCDM and what's more peripheral. The file use vocab could potentially go in the ext directory, too.
👍 and I've also seen
👍
Putting
There is an interesting analogy with what is being discussed on the Fedora side (ontology), questioning where the
changed remaining owl:sameAs to skos:*Match
added loc and getty concepts
changed dc to dcmi for formatTypes
Should we come back to this one, and vote/merge it?
Yes, I think we should resolve this -- I've created a PR to add a new OfficeDocument supertype of Presentation, Spreadsheet and a newly-added WordProcessingDocument: barmintor#6
Can someone champion this PR to bring it to resolution?
</udfrs:GenreFacetType>
<udfrs:GenreFacetType rdf:ID="Spreadsheet">
<rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
<rdfs:subClassOf rdf:resource="Document" />
Would it be more precise for this to be a subclass of Dataset or Database?
I think Spreadsheet could be a subclass of Dataset, as well as a subclass of an OfficeDocument type I introduced in barmintor#6
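A rough sketch of what that could look like, since RDFS allows a class to have more than one superclass. This is not part of the draft: OfficeDocument is only proposed in barmintor#6, and the `udfrs` namespace IRI and `xml:base` are assumptions. Superclasses are given as absolute IRIs here to keep the sketch independent of how relative references resolve.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. OfficeDocument is a proposed type (barmintor#6);
     the udfrs namespace IRI and xml:base are assumed values. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:udfrs="http://www.udfr.org/onto#"
         xml:base="http://pcdm.org/file-format-types">
  <udfrs:GenreFacetType rdf:ID="Spreadsheet">
    <rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
    <!-- Two superclasses: one by content (tabular data), one by function. -->
    <rdfs:subClassOf rdf:resource="http://pcdm.org/file-format-types#Dataset"/>
    <rdfs:subClassOf rdf:resource="http://pcdm.org/file-format-types#OfficeDocument"/>
  </udfrs:GenreFacetType>
</rdf:RDF>
```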
I'm still a little suspicious of the OfficeDocument superclass: It seems
like it reifies a marketing strategy, doesn't it?
On Fri, Jun 5, 2015 at 11:56 AM, Esmé Cowles wrote:
In file-format-types.rdf #2 (comment):
<udfrs:GenreFacetType rdf:ID="HTML">
<rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
<rdfs:subClassOf rdf:resource="Markup" />
<skos:exactMatch rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#HTMLDocument" />
<skos:exactMatch rdf:resource="http://vocab.getty.edu/aat/300266021" />
</udfrs:GenreFacetType>
<udfrs:GenreFacetType rdf:ID="Presentation">
<rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
<rdfs:subClassOf rdf:resource="Document" />
<skos:exactMatch rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Presentation" />
<skos:exactMatch rdf:resource="http://reference.data.gov.uk/technical-registry/Presentation" />
<skos:exactMatch rdf:resource="http://www.udfr.org/onto#PresentationGenre" />
</udfrs:GenreFacetType>
<udfrs:GenreFacetType rdf:ID="Spreadsheet">
<rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
<rdfs:subClassOf rdf:resource="Document" />
I think Spreadsheet could be a subclass of Dataset, as well as a subclass of an OfficeDocument type I introduced in barmintor#6
https://github.com/duraspace/pcdm/pull/2/files#r31825169
@escowles According to your comment: https://github.com/duraspace/pcdm/pull/2/files#r31825042 Database seems a more appropriate superclass, no?
I don't think OfficeDocument is just a marketing strategy -- it's also a functional category of documents: the kind of documents that typical users are likely to produce. But I'm fine with deferring that for now, since I haven't heard anyone besides me support that.
Though I do think that, separate from the OfficeDocument issue, the WordProcessingDocument is a real missing type: https://github.com/barmintor/pcdm/pull/6/files#diff-c8440ea0bf1d35562d43e1323c36f30dR157
I think it is a mistake to drift from the goal of making a
union/reconciliation set of these vocabularies to trying to sort out an
ontological hierarchy. We are going to spend a lot of time arguing about,
eg, how a MS-Word document is really different from a PDF, and in the end
it is going to chafe against cataloging concerns.
On Fri, Jun 5, 2015 at 12:06 PM, Stefano Cossu wrote:
@escowles According to your comment: https://github.com/duraspace/pcdm/pull/2/files#r31825042 Database seems a more appropriate superclass, no?
https://github.com/duraspace/pcdm/pull/2/files#r31826088
I think the intent of PageDescription's source classes is to encompass
texts produced by layout/word-processing programs. I wasn't able to
satisfactorily defend a distinction to our metadata librarian, anyway.
On Fri, Jun 5, 2015 at 12:12 PM, Esmé Cowles wrote:
I don't think OfficeDocument is just a marketing strategy -- it's also a functional category of documents: the kind of documents that typical users are likely to produce. But I'm fine with deferring that for now, since I haven't heard anyone besides me support that.
Though I do think that, separate from the OfficeDocument issue, the WordProcessingDocument is a real missing type: https://github.com/barmintor/pcdm/pull/6/files#diff-c8440ea0bf1d35562d43e1323c36f30dR157
https://github.com/duraspace/pcdm/pull/2/files#r31826593
I think that's definitely true of nfo:PaginatedTextDocument -- but I don't agree about pronom:PageDescription or udfr:PageDescriptionGenre. Pronom has a separate WordprocessedText class, which seems to pretty clearly indicate that it's separate. There's not a lot to go on for UDFR since there really aren't any descriptions, examples or relations defined, but my reading of udfr:DocumentGenre is that it's more like pronom:WordprocessedText, too (distinct from the other genres, not a superclass of them).
So, we just need to resolve this before we can merge this? Am I understanding that correctly?
This has dragged on too long, so I've closed barmintor#6 and I'm 👍 on merging this as-is. Moving the file to pcdm-ext/file-format-types.rdf would be a good improvement, though.
Esme how would you feel about breaking the file refactoring and office type
|
Ben, I've opened barmintor#7 for just the move, since that's the thing that complicates discussing other changes. Once that's merged, I can open a couple of other PRs for the other changes.
Moving file-format-types.rdf to pcdm-ext directory
What is the status of this PR?
I think we're good to vote on it now since we have @escowles PR merged into @barmintor's PR.
I think a thumbs-up from @escowles and @barmintor would seal the deal.
Agree, I think all other objections were resolved with pcdm-ext.
...having yet to vote on this. 👍
👍
+1
This is ready to go. I will squash and commit it unless someone else is already on it.
@awoods all you 😄
Resolved with: c6b4cd7
Sounds like a great idea, @ruebot. If you create the xsl, I will push it to pcdm.org... along with
@awoods cool. i'll see if i can get that done by end of day.
...and I just noticed something as I was working on the stylesheet. This is owl; all the other ones are rdfs. Should we make this rdfs?