As a dataverse user, I would like to get info about access restrictions on file level through the API #4645

Closed
LauraHuisintveld opened this issue May 3, 2018 · 8 comments

Comments

@LauraHuisintveld

I would like to retrieve information about access restrictions on file level with the API.
I do see the Terms of Use and the licence information in the JSON output on the dataset level, but I do not see any access information on file level. I would like to see if a specific file has restricted access or not.

See also this post on the Google Forum: https://groups.google.com/forum/#!msg/dataverse-community/QZdN4Uql06A/aDlj741NBAAJ

@pdurbin
Member

pdurbin commented May 3, 2018

@LauraHuisintveld thanks for opening this issue and for linking back to the thread on the dataverse-community list. I added this issue to the list at #3440, but the more I think about it, the more I think I need some clarification on which information you want access to. An example might help. Let's look at https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/00875&version=2.0 and a file with an id of "2841767" within it.

Here's the file. Note that you see "Terms of Use" if you try to download it:

screen shot 2018-05-03 at 7 36 45 am

I was confused about this at first, but "Terms of Use" are defined once at the dataset level:

screen shot 2018-05-03 at 7 37 22 am

That is to say, all files have the same "Terms of Use" because they're defined in the parent, the dataset.

As you observed (and as I think I got wrong in my reply on the forum), you can retrieve "Terms of Use" via the API like this (truncated example, reformatted a bit):

curl -s 'https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=hdl:1902.1/00875' | jq '.data.latestVersion.termsOfUse' | cut -c-300

"<div style=\"padding-left: 30px;\"> <ul style=\"list-style-type: decimal;\" ><li> The Murray Archive (the Distributor) has granted me a revocable license to use this dataset solely for the purposes of conducting research, and the Distributor may terminate this license at any time and for any reason...

You can also retrieve the license ("NONE") in this case:

curl -s 'https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=hdl:1902.1/00875' | jq '.data.latestVersion.license'

"NONE"

And for an individual file (file "0" below, the first in the output), you can see if it's restricted or not:

curl -s 'https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=hdl:1902.1/00875' | jq '.data.latestVersion.files[0].restricted'

true
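
Since the dataset response already lists every file, the per-file check above can be repeated for all files without further API calls. The following is a minimal Python sketch, not part of the original reply. It works on a hypothetical, trimmed sample response; the `label` key, the sample file names, and the helper name `restricted_by_file` are assumptions for illustration. Only the `data.latestVersion.files[].restricted` path comes from the commands above.

```python
import json

# Hypothetical, trimmed response shaped like the output of
# /api/datasets/:persistentId on an installation that reports "restricted".
SAMPLE_RESPONSE = """
{
  "data": {
    "latestVersion": {
      "files": [
        {"label": "survey.tab", "restricted": true},
        {"label": "codebook.pdf", "restricted": false}
      ]
    }
  }
}
"""


def restricted_by_file(response):
    """Map each file label to its "restricted" flag from one dataset response."""
    files = response["data"]["latestVersion"]["files"]
    # Fall back to the list position when a file entry has no "label" key.
    return {
        entry.get("label", str(index)): entry["restricted"]
        for index, entry in enumerate(files)
    }


if __name__ == "__main__":
    flags = restricted_by_file(json.loads(SAMPLE_RESPONSE))
    for label, restricted in flags.items():
        print(label, restricted)
```

In practice the parsed JSON from the `curl` call above would replace `SAMPLE_RESPONSE`, so a harvesting script needs one request per dataset rather than one per file.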

Can you please add a screenshot showing which information from the GUI you aren't able to retrieve via API? Thanks!

@raprasad
Contributor

raprasad commented May 3, 2018

fyi: this can be combined with #4616, which is also about getting file-level information via the API

@scolapasta
Contributor

Also, please note, this particular area (terms, access, restricted) is being revamped with the support for confidential, sensitive data. While it may make sense to address the particular request (whether a file is restricted or not), it will definitely be handled as part of 5.0.

@LauraHuisintveld
Author

@scolapasta Good to hear this gets some attention in version 5.0.

@pdurbin: I will give you some more context about my question. One of our institutions that uses DataverseNL is using the API to get metadata from their datasets into their own institutional repository. I am not sure how they do it, but I guess they do not want to make an API call for every single file to see if it is restricted. If you look at this output:

https://dataverse.nl/api/datasets/:persistentId/?persistentId=hdl:10411/IW15PI

you will get all the metadata you might need, including info at the file level. What is missing here is information on whether a file is restricted or not.

So yes, you can see if a file is restricted or not, but only with a specific call.

(Maybe this is related to a somewhat bigger issue. Within the same dataset, you can have Open Access files and restricted files, but you can only set Terms of Use/licence at the dataset level. So in the Terms of Use/licence, the user will probably describe the conditions for access to the restricted files. But if there are also unrestricted files in the dataset, you can't tell that from the Terms of Use.)

@LauraHuisintveld
Author

Wait, I now see a difference between our outputs.
https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=doi:10.7910/DVN/TJCLKP

This does show whether a file is restricted or not, whereas that specific line is missing for DataverseNL datasets. Your example:
screen shot 2018-05-04 at 10 17 01

My example:
screen shot 2018-05-04 at 10 18 19

We are currently on version 4.6.1. Is this something that has changed in later versions?
(We are preparing to upgrade to 4.8.6 soon!)

@pdurbin
Member

pdurbin commented May 4, 2018

@LauraHuisintveld ah, you're on 4.6.1. @ferrys added the ability to restrict files via API (and see if they are restricted) in fa3ca3f and perhaps other commits, as part of pull request #3967, which landed in 4.8. See also #3873.

The bottom line is that if you upgrade to 4.8 or higher, your JSON output should be similar and should show that "restricted" boolean. Thanks for providing all this detail.
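
For a script that has to read output from installations on both sides of the 4.8 boundary, the missing key can be treated as "unknown" rather than as "unrestricted". This is a minimal Python sketch under that assumption; the helper name `restriction_status` and the sample entries are hypothetical, and only the presence or absence of the `restricted` boolean is taken from the discussion above.

```python
def restriction_status(file_entry):
    """Return "restricted", "unrestricted", or "unknown" for one file entry."""
    flag = file_entry.get("restricted")
    if flag is None:
        # Key absent: the installation may predate 4.8, so do not guess.
        return "unknown"
    return "restricted" if flag else "unrestricted"


if __name__ == "__main__":
    # Hypothetical entries: one as reported by 4.8 or later, one without the key.
    print(restriction_status({"label": "a.tab", "restricted": True}))
    print(restriction_status({"label": "b.tab"}))
```

Reporting "unknown" avoids presenting a restricted file as open just because the installation does not yet expose the flag.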

@LauraHuisintveld
Author

I should have mentioned earlier which version we are using. Thanks for the replies!
This issue will be solved for us when we upgrade. Should I close it?

@pdurbin
Member

pdurbin commented May 4, 2018

@LauraHuisintveld yes, I'll go ahead and close it. I linked to it from #3440 (with a strikethrough) in case other people are wondering about functionality that was formerly GUI only. And of course if you have any trouble after upgrading, please feel free to open a new issue. Thanks and please keep the feedback coming!

4 participants