Feature Request: enable static OAI-PMH sets for the collection tree #9344

Open
poikilotherm opened this issue Jan 30, 2023 · 1 comment

@poikilotherm
Contributor

Overview of the Feature Request

When harvesting is enabled, all collections should be available for harvesting without first having to define a set through a search query. This was discussed on the Google Group with @philippconzett, @jggautier, and others: https://groups.google.com/d/msgid/dataverse-community/e9ff1e51-ea37-4c8f-8fcc-3fb22b4dc886n%40googlegroups.com

The idea is to use the OAI-PMH set spec to express what collection is requested.

a setSpec -- a colon [:] separated list indicating the path from the root of the set hierarchy to the respective node. Each element in the list is a string consisting of any valid URI unreserved characters, which must not contain any colons [:]. Since a setSpec forms a unique identifier for the set within the repository, it must be unique for each set. Flat set organizations have only sets with setSpec that do not contain any colons [:].
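
To make the quoted rule concrete, here is a minimal sketch of a syntax check for a setSpec. It assumes "URI unreserved characters" means the RFC 2396 set (letters, digits, and the marks `- _ . ! ~ * ' ( )`); the function names are illustrative only and are not part of Dataverse.

```python
import re

# Assumption: "unreserved" follows RFC 2396: letters, digits and - _ . ! ~ * ' ( )
_ELEMENT = r"[A-Za-z0-9\-_.!~*'()]+"

# One or more non-empty elements, separated by single colons.
_SET_SPEC = re.compile(rf"{_ELEMENT}(?::{_ELEMENT})*")


def is_valid_set_spec(spec: str) -> bool:
    """Return True if spec is a syntactically valid colon-separated setSpec."""
    return _SET_SPEC.fullmatch(spec) is not None


def is_flat(spec: str) -> bool:
    """A flat set organization uses setSpecs without any colon."""
    return ":" not in spec


if __name__ == "__main__":
    print(is_valid_set_spec("collections:topcoll:subcoll"))
    print(is_valid_set_spec("collections::subcoll"))
```

An empty element, as in `collections::subcoll`, is rejected because every element between colons has to contain at least one character.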

First inspiration: create a set hierarchy whose first element, "collections", stands for the root Dataverse collection, and walk down the tree by separating each level with ":". Example: "collections:topcoll:subcoll:subsubcoll". A set expressed with such a spec would comprise only the datasets of the specified node (collection), not those of its children. (Maybe include linked datasets.)
If you want a full subtree in a set, create a set as usual using the search function.
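
As a rough illustration of this first idea, the sketch below walks a toy collection tree and derives one hierarchical setSpec per node, each mapped only to the datasets held directly by that node. The `Collection` class, the dataset labels, and `hierarchical_set_specs` are hypothetical stand-ins, not Dataverse code.

```python
from dataclasses import dataclass, field

ROOT_ELEMENT = "collections"  # first setSpec element, standing for the root collection


@dataclass
class Collection:
    """Hypothetical in-memory stand-in for a Dataverse collection."""
    alias: str
    datasets: list = field(default_factory=list)
    children: list = field(default_factory=list)


def hierarchical_set_specs(root: Collection) -> dict:
    """Map each hierarchical setSpec to the datasets held directly by that node."""
    specs = {}

    def walk(node: Collection, spec: str) -> None:
        specs[spec] = list(node.datasets)  # direct datasets only, no children
        for child in node.children:
            walk(child, f"{spec}:{child.alias}")

    walk(root, ROOT_ELEMENT)
    return specs


if __name__ == "__main__":
    subsub = Collection("subsubcoll", datasets=["dataset-C"])
    sub = Collection("subcoll", datasets=["dataset-B"], children=[subsub])
    top = Collection("topcoll", datasets=["dataset-A"], children=[sub])
    root = Collection("root", children=[top])
    for spec, members in hierarchical_set_specs(root).items():
        print(spec, members)
```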

Second thoughts: since collection names in Dataverse are unique and the "hierarchy" is just UI eye candy, one might as well make every collection available under the set spec "collections:xxx". One could then play around with something like "collections:xxx:all" to express the intent to include all subcollections in the response, too.
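
The flat variant could be resolved roughly as follows. This is only a sketch: the `children` and `datasets` dictionaries are hypothetical lookups keyed by collection name, and treating a trailing "all" element as "include subcollections" is just the convention floated above, not existing Dataverse behavior.

```python
def parse_flat_set_spec(spec: str) -> tuple:
    """Split "collections:xxx" or "collections:xxx:all" into (name, recursive)."""
    parts = spec.split(":")
    if len(parts) in (2, 3) and parts[0] == "collections" and parts[1]:
        if len(parts) == 2:
            return parts[1], False
        if parts[2] == "all":
            return parts[1], True
    raise ValueError(f"unsupported setSpec: {spec!r}")


def datasets_for(spec: str, children: dict, datasets: dict) -> list:
    """Collect the datasets a flat setSpec refers to.

    children maps a collection name to the names of its direct subcollections;
    datasets maps a collection name to its own datasets (both hypothetical).
    """
    name, recursive = parse_flat_set_spec(spec)
    result = list(datasets.get(name, []))
    if recursive:
        for child in children.get(name, []):
            result.extend(datasets_for(f"collections:{child}:all", children, datasets))
    return result


if __name__ == "__main__":
    children = {"topcoll": ["subcoll"], "subcoll": []}
    datasets = {"topcoll": ["dataset-A"], "subcoll": ["dataset-B"]}
    print(datasets_for("collections:topcoll", children, datasets))
    print(datasets_for("collections:topcoll:all", children, datasets))
```

Because the collection name alone identifies the node, no ancestor path is needed in the spec; the cost is that "all" becomes a reserved third element.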

What kind of user is the feature intended for?
API User, Curator, Depositor

What inspired the request?
https://groups.google.com/d/msgid/dataverse-community/e9ff1e51-ea37-4c8f-8fcc-3fb22b4dc886n%40googlegroups.com

What existing behavior do you want changed?
There are no standard, default sets, but they might be useful.

Any brand new behavior do you want to add to Dataverse?
No, this is just an extension of what's already there.

Any related open or closed issues to this feature request?
This might be a part of IQSS/dataverse-pm#25

@pdurbin
Member

pdurbin commented Jan 30, 2023

In the past I've suggested the collections could have hierarchy in the path: #235 (comment)

For example:

However, we seem pretty committed to having the collection alias be unique. "bar" and "baz" above are not unique.

So I assume we'd go with "collections:xxx" rather than "collections:topcoll:subcoll:subsubcoll".

Projects
Status: ⚠️ Needed/Important