draft a GenreFacetType ontology collecting most of PRONOM, NFO, UDFR #2

Closed
wants to merge 10 commits into from

Conversation

barmintor
Contributor

No description provided.

@awoods
Member

awoods commented Apr 10, 2015

We should establish a set of initial "committers" for this PCDM Git-repository. Otherwise, what is the vetting approach for updates/additions such as this?

@acoburn
Contributor

acoburn commented Apr 12, 2015

@awoods regarding committers for this, I'm inclined to cast a pretty wide net, at least initially.

As for this particular contribution, I find it to be remarkably similar to an ontology that I maintain for our local repository https://github.com/AmherstCollege/acdc-ontology/blob/master/rdf/objectTypes.rdf (we use skos:mappingRelation instead of owl:sameAs, and we're more focused on Getty AAT, LOC, schema.org and opengraph). Either way, I'd be supportive of something like this, especially since I'd much rather rely on a community supported mapping than maintain our own.

/cc @escowles

@barmintor
Contributor Author

@acoburn interesting to see some little differences that I am guessing emerge from arriving from a PRONOM/digipres vector vs a Getty/cataloguing one. Do you agree that there's value in preserving the narrower scope of this document? If skos relations are a better description of the lineage here, do you want to add a commit to that branch with the change? I'd like the commit history to give appropriate credit for authorship!

@acoburn
Contributor

acoburn commented Apr 13, 2015

@barmintor I prefer skos, because it doesn't imply the strong assertions of identity that accompany owl:sameAs.
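
To make the distinction concrete, here is a minimal RDF/XML sketch of the two options for a single term. The `Dataset` subject and the `xml:base` value are illustrative assumptions (the base is inferred from the `rdfs:isDefinedBy` value used in the draft); the mapping target is the UDFR URI that already appears in the draft.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xml:base="http://pcdm.org/file-format-types">

  <!-- Option A (owl:sameAs): states that both URIs name the same resource,
       so an OWL reasoner may merge every statement made about either one. -->
  <rdf:Description rdf:about="#Dataset">
    <owl:sameAs rdf:resource="http://www.udfr.org/onto#DatasetGenre"/>
  </rdf:Description>

  <!-- Option B (SKOS mapping): records that the two concepts correspond
       without asserting that they are identical. -->
  <rdf:Description rdf:about="#Dataset">
    <skos:exactMatch rdf:resource="http://www.udfr.org/onto#DatasetGenre"/>
  </rdf:Description>

</rdf:RDF>
```

Where a correspondence is only approximate, `skos:closeMatch` is the weaker SKOS mapping property that could be used instead of `skos:exactMatch`.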

@awoods
Member

awoods commented Apr 13, 2015

On a side note, what do you think about a "call for committers" message to the lists?
@escowles, @acoburn, @barmintor ??

Maybe taking nominations for 5 people?

@acoburn
Contributor

acoburn commented Apr 13, 2015

That sounds like a great idea!

@barmintor
Contributor Author

This draft should also include:

  • mappings to LoC and Getty concepts where relevant
  • guidance on usage of the terms

<rdfs:subClassOf rdf:resource="Document" />
<skos:exactMatch rdf:resource="http://reference.data.gov.uk/technical-registry/Dataset" />
<skos:exactMatch rdf:resource="http://www.udfr.org/onto#DatasetGenre" />
<skos:exactMatch rdf:resource="http://purl.org/dc/elements/1.1/Dataset" />
Contributor

Should be http://purl.org/dc/dcmitype/Dataset, right? Same for other DCMI Type terms.

Contributor

Correct -- those should be DCMI Types
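
With that correction applied, the line in question would presumably become the following. This is a sketch of the fix as described in the review comments, not a copy of the merged file. `http://purl.org/dc/dcmitype/` is the namespace of the DCMI Type Vocabulary, whereas `http://purl.org/dc/elements/1.1/` holds the fifteen Dublin Core elements (such as `title` and `creator`) and defines no `Dataset` term.

```xml
<!-- DCMI Type term, replacing the dc/elements/1.1/ URI used in the draft -->
<skos:exactMatch rdf:resource="http://purl.org/dc/dcmitype/Dataset" />
```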

@azaroth42
Contributor

Can we keep contributions separate from the core in a different directory structure somehow?

@acoburn
Contributor

acoburn commented Apr 17, 2015

such as a ./pcdm-core directory and a ./pcdm-ext directory?

@azaroth42
Contributor

That sounds good to me. My concern is confusing adopters about what counts as PCDM (which at the moment is not very much) and what are suggested extensions that could be adopted or ignored without peril.

@escowles
Contributor

👍 to a pcdm-ext directory to hold non-core stuff. I agree it makes sense to make it clear what's core PCDM and what's more peripheral. The file use vocab could potentially go in the ext directory, too.

@mjgiarlo

👍 and I've also seen contrib used similarly but w/e.

@awoods
Member

awoods commented Apr 17, 2015

👍 ./pcdm-ext for the extras. Should the core be at the top-level or in its own directory?

@acoburn
Contributor

acoburn commented Apr 17, 2015

Putting models.rdf into a ./pcdm-core directory would give (possibly) a stronger indication to someone browsing github that the core models are located there. Keeping it at the root level does so implicitly, and I'm fine with that if you think that's sufficiently clear.

@awoods
Member

awoods commented Apr 17, 2015

There is an interesting analogy with what is being discussed on the Fedora side (ontology), questioning where the core ontology goes and where the extra ontologies go. So far, the discussion has been leaning towards a separate Git repository for the extras. Maybe it is simpler, however, to just split them into directories.
In any case, consistency across these efforts would be a treat.

@ruebot
Contributor

ruebot commented Jun 4, 2015

Should we come back to this one, and vote/merge it?

@escowles
Contributor

escowles commented Jun 4, 2015

Yes, I think we should resolve this -- I've created a PR to add a new OfficeDocument supertype of Presentation, Spreadsheet and a newly-added WordProcessingDocument: barmintor#6

@awoods
Member

awoods commented Jun 5, 2015

Can someone champion this PR to bring it to resolution?

</udfrs:GenreFacetType>
<udfrs:GenreFacetType rdf:ID="Spreadsheet">
<rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
<rdfs:subClassOf rdf:resource="Document" />

Would it be more precise for this to be a subclass of Dataset or Database?

Contributor

I think Spreadsheet could be a subclass of Dataset, as well as a subclass of an OfficeDocument type I introduced in barmintor#6
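
A class can carry more than one `rdfs:subClassOf` statement, so this suggestion could be sketched as below. The snippet mirrors the element style of the draft and assumes the prefixes are declared in the enclosing file. `OfficeDocument` is the type proposed in barmintor#6 and is hypothetical as far as this PR is concerned, and the relative `rdf:resource` values follow the draft's convention rather than being verified against its base URI.

```xml
<udfrs:GenreFacetType rdf:ID="Spreadsheet">
  <rdfs:isDefinedBy rdf:resource="http://pcdm.org/file-format-types#"/>
  <!-- Two superclasses: every Spreadsheet is both a Dataset and an OfficeDocument -->
  <rdfs:subClassOf rdf:resource="Dataset" />
  <rdfs:subClassOf rdf:resource="OfficeDocument" />
</udfrs:GenreFacetType>
```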

Contributor Author

I'm still a little suspicious of the OfficeDocument superclass: It seems
like it reifies a marketing strategy, doesn't it?

On Fri, Jun 5, 2015 at 11:56 AM, Esmé Cowles wrote:

> In file-format-types.rdf:
>
> I think Spreadsheet could be a subclass of Dataset, as well as a subclass of an OfficeDocument type I introduced in barmintor#6

@escowles According to your comment: https://github.com/duraspace/pcdm/pull/2/files#r31825042 Database seems a more appropriate superclass, no?

Contributor

I don't think OfficeDocument is just a marketing strategy -- it's also a functional category of documents: the kind of documents that typical users are likely to produce. But I'm fine with deferring that for now, since I haven't heard anyone besides me support that.

Though I do think that, separate from the OfficeDocument issue, the WordProcessingDocument is a real missing type: https://github.com/barmintor/pcdm/pull/6/files#diff-c8440ea0bf1d35562d43e1323c36f30dR157

Contributor Author

I think it is a mistake to drift from the goal of making a
union/reconciliation set of these vocabularies to trying to sort out an
ontological hierarchy. We are going to spend a lot of time arguing about,
eg, how a MS-Word document is really different from a PDF, and in the end
it is going to chafe against cataloging concerns.

On Fri, Jun 5, 2015 at 12:06 PM, Stefano Cossu wrote:

> In file-format-types.rdf:
>
> @escowles According to your comment: https://github.com/duraspace/pcdm/pull/2/files#r31825042 Database seems a more appropriate superclass, no?

Contributor Author

I think the intent of PageDescription's source classes is to encompass
texts produced by layout/word-processing programs. I wasn't able to
satisfactorily defend a distinction to our metadata librarian, anyway.

On Fri, Jun 5, 2015 at 12:12 PM, Esmé Cowles wrote:

> In file-format-types.rdf:
>
> I don't think OfficeDocument is just a marketing strategy -- it's also a functional category of documents: the kind of documents that typical users are likely to produce. But I'm fine with deferring that for now, since I haven't heard anyone besides me support that.
>
> Though I do think that, separate from the OfficeDocument issue, the WordProcessingDocument is a real missing type: https://github.com/barmintor/pcdm/pull/6/files#diff-c8440ea0bf1d35562d43e1323c36f30dR157

Contributor

I think that's definitely true of nfo:PaginatedTextDocument -- but I don't agree about pronom:PageDescription or udfr:PageDescriptionGenre. Pronom has a separate WordprocessedText class, which seems to pretty clearly indicate that it's separate. There's not a lot to go on for UDFR since there really aren't any descriptions, examples or relations defined, but my reading of udfr:DocumentGenre is that it's more like pronom:WordprocessedText, too (distinct from the other genres, not a superclass of them).

@ruebot
Contributor

ruebot commented Jul 17, 2015

So, we just need to resolve this before we can merge this? Am I understanding that correctly?

@escowles
Contributor

This has dragged on too long, so I've closed barmintor#6 and I'm 👍 on merging this as-is. Moving the file to pcdm-ext/file-format-types.rdf would be a good improvement, though.

@barmintor
Contributor Author

Esmé, how would you feel about splitting the file refactoring and office type
into one PR, and the inheritance and word-processing type changes into another
(or two)? I feel like we're throwing out two uncontested improvements with
the bathwater that wants more discussion.
On Jul 24, 2015 8:21 AM, "Esmé Cowles" wrote:

> This has dragged on too long, so I've closed barmintor#6 and I'm 👍 on merging this as-is. Moving the file to pcdm-ext/file-format-types.rdf would be a good improvement, though.

@escowles
Contributor

Ben, I've opened barmintor#7 for just the move, since that's the thing that complicates discussing other changes. Once that's merged, I can open a couple of other PRs for the other changes.

Moving file-format-types.rdf to pcdm-ext directory
@acoburn
Contributor

acoburn commented Sep 9, 2015

What is the status of this PR?

@ruebot
Contributor

ruebot commented Sep 9, 2015

I think we're good to vote on it now, since @escowles's PR has been merged into @barmintor's PR.

@awoods
Member

awoods commented Sep 9, 2015

I think a thumbs-up from @escowles and @barmintor would seal the deal.

@azaroth42
Contributor

Agree, I think all other objections were resolved with pcdm-ext.

@ruebot
Contributor

ruebot commented Sep 9, 2015

...having yet to vote on this. 👍

@escowles
Contributor

escowles commented Sep 9, 2015

👍

@barmintor
Contributor Author

+1

@awoods
Member

awoods commented Sep 9, 2015

This is ready to go. I will squash and commit it unless someone else is already on it.

@ruebot
Contributor

ruebot commented Sep 9, 2015

@awoods all you 😄

@awoods
Member

awoods commented Sep 9, 2015

Resolved with: c6b4cd7

@awoods awoods closed this Sep 9, 2015
@ruebot
Contributor

ruebot commented Sep 9, 2015

@awoods shall we update #24?

@awoods
Member

awoods commented Sep 9, 2015

Sounds like a great idea, @ruebot. If you create the xsl, I will push it to pcdm.org... along with file-format-types.rdf.

@ruebot
Contributor

ruebot commented Sep 9, 2015

@awoods cool. i'll see if i can get that done by end of day.

@ruebot
Contributor

ruebot commented Sep 9, 2015

...and I just noticed something as I was working on the stylesheet. This one is OWL, while all the other ones are RDFS. Should we make this one RDFS too?


8 participants