Initial drafting of Storage Hierarchies section #180

Merged (3 commits) on Oct 2, 2018
42 changes: 41 additions & 1 deletion draft/spec/index.html
@@ -127,6 +127,14 @@
"OCFL-Implementation-Notes": {
title: "OCFL Implementation Notes",
href: "../implementation-notes"
},
"PairTree": {
title: "Pairtrees for Object Storage",
href: "https://confluence.ucop.edu/display/Curation/PairTree",
authors: [
"J. Kunze", "M. Haye", "E. Hetzner", "M. Reyes", "C. Snavely"
],
date: "12 August 2008"
}
}
};
@@ -718,7 +726,7 @@ <h2>Root Structure</h2>
</p>
<p>
<a>OCFL Object</a>s within a given OCFL Storage Root MUST be the same as the OCFL specification
- version as declared in the <a href="#root-conformance-declaration">Root Conformance Declaration</a>.
+ version as declared in the Root Conformance Declaration.
</p>
<p>
An OCFL Storage Root SHOULD also contain the OCFL specification in human-readable plain-text format
@@ -746,6 +754,38 @@ <h2>Root Structure</h2>

<section id="root-hierarchies">
<h2>Storage Hierarchies</h2>
<p>
<a>OCFL Object Root</a>s MUST be stored as direct children of a containing <a>OCFL Storage Root</a> or as the
Contributor commented:

I suggest "stored either as" to make it clearer that the MUST applies to either case. I'd be tempted to reverse the order as the directory hierarchy seems by far the more common use case

terminal resource at the end of a directory storage hierarchy.
</p>
<p>
A common practice is to use a unique identifier scheme to compose this storage hierarchy, typically arranged
according to some form of the [[PairTree]] specification. Irrespective of the pattern chosen for the storage
hierarchies, the following restrictions apply:
</p>
<ol>
<li>Storage hierarchies MUST NOT include files within interim directories</li>
Contributor commented:

I don't know what this bullet means

Contributor commented:

Perhaps interim was intended to be intermediate but even then I'm not sure

Member Author commented:

This is intended to mean that if you have a directory hierarchy, there cannot be files (or OCFL Objects, for that matter) interspersed throughout that hierarchy.

<li>Storage hierarchies MUST be terminated by OCFL Object Roots</li>
Contributor @zimeon commented on Oct 2, 2018:

So an empty leaf directory in a storage hierarchy, or an empty storage root is illegal?

Member Author commented:

empty leaf directory -> invalid
empty storage root -> potentially valid (Is it a "Storage hierarchy" with no OCFL Objects?)

<li>Storage hierarchies within the same OCFL Storage Root SHOULD share the same layout pattern</li>
<li>Storage hierarchies within the same OCFL Storage Root SHOULD NOT include mixed dispositions (zip,
Contributor commented:

not quite sure what the difference between "SHOULD share the same layout pattern" and "SHOULD NOT include mixed dispositions" is. The only addition in the second seems to be ZIP

Member Author commented:

The difference in these bullets may be subtle, but testable.

Storage hierarchies within the same OCFL Storage Root SHOULD share the same layout pattern

This is intended to mean that if a directory hierarchy is used, the pattern is consistent.

Storage hierarchies within the same OCFL Storage Root SHOULD NOT include mixed dispositions

This is intended to mean that it is invalid to intermix directory hierarchies, top-level Objects and zips.

Contributor commented:

OK, I see the point but did not understand that from the text. We should not use different terms "directory hierarchy" and "disposition" to mean pretty much the same thing. I propose:

  • The storage hierarchy under an OCFL Storage Root SHOULD use just one layout pattern.
  • The storage hierarchy under an OCFL Storage Root SHOULD consistently use either a directory hierarchy of OCFL objects, top-level OCFL objects, or OCFL Object ZIPs.

(and I think we also need a section describing in more detail what we mean by an OCFL Object ZIP... currently mentioned only in definition of OCFL Object Root... #181)

PairTree, top-level objects, etc)
</li>
</ol>
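
The restrictions in the list above, as clarified in the review thread (no files in intermediate directories, no empty leaf directories), could be checked with a short walk of the storage root. The following is a minimal Python sketch and not part of this PR. It assumes, purely for illustration, that an OCFL Object Root can be recognized by a marker file whose name starts with `0=ocfl_object_`; the names `check_hierarchy`, `is_object_root`, and `OBJECT_MARKER_PREFIX` are hypothetical. Files directly inside the storage root are not flagged, on the assumption that the root holds its own declaration files.

```python
from __future__ import annotations

from pathlib import Path

# Assumption for this sketch: an object root is identified by a marker file
# whose name starts with this prefix. The draft text may define this differently.
OBJECT_MARKER_PREFIX = "0=ocfl_object_"


def is_object_root(directory: Path) -> bool:
    """Return True if the directory contains the assumed object marker file."""
    return any(
        p.is_file() and p.name.startswith(OBJECT_MARKER_PREFIX)
        for p in directory.iterdir()
    )


def check_hierarchy(storage_root: Path) -> list[str]:
    """Report files in intermediate directories and empty leaf directories."""
    problems: list[str] = []

    def visit(directory: Path) -> None:
        for child in sorted(directory.iterdir()):
            rel = child.relative_to(storage_root).as_posix()
            if child.is_file():
                # Files in the storage root itself are allowed in this sketch.
                if directory != storage_root:
                    problems.append(f"file in intermediate directory: {rel}")
                continue
            if is_object_root(child):
                continue  # the hierarchy terminates here; do not descend
            if not any(child.iterdir()):
                problems.append(f"empty leaf directory: {rel}")
                continue
            visit(child)

    visit(storage_root)
    return problems
```

An empty storage root yields no problems under this sketch, which matches the "potentially valid" reading given in the thread rather than settling the question.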
<p>
Additionally, the directory name (or filename minus the extension for zip files) of an OCFL Object Root MUST
Contributor commented:

I do not think this should be a MUST; I would be OK with SHOULD. It only makes sense when combined with specification of a character encoding scheme. It might preclude long identifiers, and is certainly not necessary with various regular layouts. Even the PairTree spec admits both encapsulated and non-encapsulated forms.

correspond to the OCFL Object's full and complete identifier. To illustrate, for an OCFL Object with the
identifier <code>abcd123g</code>:
</p>
<ul>
<li>
Invalid:
<pre>/ab/cd/12/3g/... object files here</pre>
</li>
<li>
Valid:
<pre>/ab/cd/12/3g/abcd123g/... object files here</pre>
</li>
</ul>
</section>
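
To make the valid/invalid example above concrete, the sketch below composes a pair-style path that ends in a directory named after the full identifier, and checks whether a given object root name corresponds to the identifier. It reflects the draft wording proposed in this PR (which a reviewer suggests relaxing from MUST to SHOULD), and it is an illustration only: the function names and the fixed two-character segment length are assumptions, and no character encoding of identifiers is applied, which a real PairTree-style layout would need.

```python
from pathlib import PurePosixPath


def object_root_path(identifier: str, segment_length: int = 2) -> PurePosixPath:
    """Split the identifier into fixed-size segments, then append the full
    identifier as the terminal (encapsulating) directory."""
    if not identifier:
        raise ValueError("identifier must not be empty")
    segments = [
        identifier[i:i + segment_length]
        for i in range(0, len(identifier), segment_length)
    ]
    return PurePosixPath(*segments, identifier)


def names_full_identifier(object_root: PurePosixPath, identifier: str) -> bool:
    """Check that the last path component (minus a .zip extension, if any)
    equals the full identifier."""
    name = object_root.stem if object_root.suffix == ".zip" else object_root.name
    return name == identifier


print(object_root_path("abcd123g"))  # ab/cd/12/3g/abcd123g
print(names_full_identifier(PurePosixPath("/ab/cd/12/3g"), "abcd123g"))  # False
```

With the identifier `abcd123g`, the first call reproduces the path shown as valid above, and the second call rejects the path shown as invalid because its terminal directory is only the last segment.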

<section id="root-extensions">