Encoding license information (or lack thereof) #336
Comments
I agree that standardising the values should not happen on standard (SSSOM) level, but on organisation level. Ideally, there would be a database of licenses with standard identifiers to use, but in the absence of that, a URL is the best thing we can do. They are mostly for human consumption, so I am happy to follow your way of thinking, although I know that folks like @cthoyt would disagree and would want to force a more rigorous standard for recording licenses (which we will implement on organisation level, rather than standard level). I am certainly not opposed to changing the default license to We can make the new default license resolvable quickly.
SPDX (https://bioregistry.io/spdx) is such a semantic space of standard licenses. You are right that I would like people who are using SSSOM to have to make a decision. SSSOM doesn't come from nowhere, so whoever makes it, either by manual or automated means, should be required to make a choice. I like it being in the standard, but people are used to not having to actually put a license (when using sssom-py) so there's also a social cost to enforcing better standards to consider. All that being said, the SSSOM spec says license is required, so having SSSOM-py stop skirting the spec also seems reasonable.
Well, there is. The SPDX license list and its short identifiers are precisely that. But I’d be opposed to using such identifiers in the At best, the SSSOM spec can suggest or even recommend that people use SPDX URLs (e.g.
I have no objection to that slot being required. The spec can mandate that any SSSOM writer MUST include a license (defaulting to a special value like `https://w3id.org/sssom/license/all-rights-reserved` if the user behind the software did not explicitly say which license he or she wanted to use); it can mandate that any SSSOM reader MUST complain upon parsing a set that does not have such a slot. My question is about forcefully injecting a default value when reading a set that does not have one (which is the current behaviour of the
You do not need the parser to auto-fill the slot with a default value for that.
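The writer-side rule discussed in the comment above could be sketched as follows. This is only an illustration under stated assumptions: metadata is modelled as a plain `dict`, `metadata_for_writing` and `DEFAULT_LICENSE` are hypothetical names that do not exist in sssom-py, and the default URL is merely the value suggested in this thread, not something the spec defines.

```python
# Hypothetical default, taken from the suggestion in this thread.
# It is NOT defined by the SSSOM spec at the time of this discussion.
DEFAULT_LICENSE = "https://w3id.org/sssom/license/all-rights-reserved"


def metadata_for_writing(metadata: dict) -> dict:
    """Return a copy of the set-level metadata that carries a license.

    The default is applied only at write time, when the user did not
    pick a license; an explicit choice is always preserved.
    """
    out = dict(metadata)  # do not mutate the caller's metadata
    out.setdefault("license", DEFAULT_LICENSE)
    return out


explicit = {"license": "https://creativecommons.org/licenses/by/4.0/"}
print(metadata_for_writing(explicit)["license"])
print(metadata_for_writing({})["license"])
```

The point of the sketch is that the default lives in the writer only: nothing is injected when a set is read.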
Hello. Just to let you know, the most complete list of license URIs I have found is https://rdflicense.linkeddata.es/
The license under which a mapping set is published is indicated using a `license` slot. That’s one of the few set-level metadata slots that are REQUIRED, and it must be “a url to the license of the mapping”.
This raises a couple of questions.
A single license can be identified by many URLs
Consider the four following URLs:
https://creativecommons.org/licenses/by/4.0/legalcode.en
https://creativecommons.org/licenses/by/4.0/
https://spdx.org/licenses/CC-BY-4.0.html
https://github.com/FlyBase/drosophila-anatomy-developmental-ontology/blob/master/LICENSE
They all point to the Creative Commons’ Attribution 4.0 license (aka “CC-BY-4.0”), so they all equally meet the requirement to be “a URL to the license”. Yet the URLs themselves are completely different strings, which prevents any meaningful comparison (e.g., if we wanted to check that two mapping sets are published under the same license).
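To make the comparison problem concrete, here is a minimal Python sketch that compares the four URLs above as plain strings. It does not use SSSOM or sssom-py; the variable names are only for illustration.

```python
# The four URLs listed above, all designating the CC-BY-4.0 license.
cc_by_urls = [
    "https://creativecommons.org/licenses/by/4.0/legalcode.en",
    "https://creativecommons.org/licenses/by/4.0/",
    "https://spdx.org/licenses/CC-BY-4.0.html",
    "https://github.com/FlyBase/drosophila-anatomy-developmental-ontology/blob/master/LICENSE",
]

# A naive "same license?" check based on string equality.
distinct = set(cc_by_urls)
same_license = len(distinct) == 1

print(len(distinct))   # 4
print(same_license)    # False
```

A tool relying on string equality would conclude that these are four different licenses, even though a human reader would see one.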
There are legitimate reasons to prefer any of those four URLs: the first one because it points to the project that created the license in the first place; the second one (without the `legalcode.en`) because it indirectly points to the same license and is the most commonly used form of the link; the third one because it points to a well-known website whose purpose is precisely to catalog the available licenses; and the fourth one because it points to the actual license file of the project that is publishing a mapping set.
I don’t think there is much we can do about that.
So I think we should make clear to both implementers and users that the `license` slot is for human consumption only. That is, implementations SHOULD NOT try to automatically interpret the contents of that slot in any way, for example to decide whether it is OK to redistribute a given mapping set or whether two datasets are published under compatible licenses that allow them to be merged and to redistribute the merged set. Such questions should only ever be addressed by humans, after having read the license(s) pointed to by the `license` slots.
Recording the absence of license
As mentioned above, the `license` slot is required. Upon encountering a mapping set that does not have such a slot, `sssom-py` automatically injects a fabricated slot with the value `https://w3id.org/sssom/license/unspecified`. That value is not mentioned anywhere in the spec or the documentation, and the URL does not resolve to anything.
Do we want/need a unique special value to indicate “No license specified”?
I personally don’t see the need for such a value. The absence of the `license` slot should be enough. Upon encountering a mapping set without `license`, implementations can either reject the set outright or proceed anyway (maybe after emitting a warning), but I don’t see what value is added by injecting a pseudo-value.
If we do want a pseudo-value to indicate the absence of a license: Do we want it to be unique? That is, should all implementations inject the same pseudo-value (like `https://w3id.org/sssom/license/unspecified`)? Or should implementations be free to inject a value of their choosing, as long as it is a URL and it conveys the fact that the license is unknown?
If we do want such a unique value, then it should be specified somewhere in the spec, and ideally, it should be a URL that actually points to something. And instead of `https://w3id.org/sssom/license/unspecified`, I would like a URL that better highlights the fact that, in the absence of an explicit license, the mapping set must be assumed to fall under the normal copyright rules; so I’d suggest something like `https://w3id.org/sssom/license/all-rights-reserved` instead.
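The "reject outright or proceed with a warning, but inject nothing" option could look roughly like the sketch below. It is an assumption-laden illustration, not sssom-py's actual behaviour or API: the metadata is modelled as a plain `dict`, and `check_license`, `MissingLicenseError`, and the `strict` flag are hypothetical names.

```python
import warnings


class MissingLicenseError(ValueError):
    """Raised in strict mode when a mapping set has no license slot."""


def check_license(metadata: dict, strict: bool = False) -> dict:
    """Return the metadata unchanged, complaining if `license` is absent.

    No pseudo-value is injected: the absence of the slot is itself
    the signal that no license was specified.
    """
    if not metadata.get("license"):
        if strict:
            raise MissingLicenseError("mapping set has no license slot")
        warnings.warn("mapping set has no license slot", stacklevel=2)
    return metadata


# Lenient mode: a warning is emitted and the metadata stays as it was read.
parsed = check_license({"mapping_set_id": "https://example.org/my-set"})
print("license" in parsed)
```

Downstream code can then test for the presence of the slot directly, instead of comparing against a magic URL.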