-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New HealthDCAT-AP profile #326
base: master
Are you sure you want to change the base?
Conversation
Rdflib 7.1 introduced changes in [date parsing][1] that made some base profile tests fail. Basically the previous rdflib versions incomplete dates like <time:inXSDDateTimeStamp rdf:datatype="http://www.w3.org/2001/XMLSchema#date"> 1904 </time:inXSDDateTimeStamp> were expanded to `1904-01-01`. Of course this is not a valid date and should be expressed using `gYear`: <time:inXSDDateTimeStamp rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"> 1904 </time:inXSDDateTimeStamp> and we should be expecting `1904`. This should play nice with the time properties we are generating in CKAN as they already handle automatically `gYear`, `gYearMonth`, `date` and `dateTime`. Sites importing external DCAT representations that use the wrong encoding might need to check their parsers. [1] RDFLib/rdflib#2929
@amercader as promised. There is still very heavy development going on of the HealthDCAT-AP specs but I feel like this is a really good starting point. My last week at Health-RI is the week of 16-20 december, it'd be great if the major comments could come before that. Any comments after that my colleagues will pick up (@hcvdwerf and @kburger). Thanks again for your time! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reference = Graph() | ||
reference.parse(data=contents, format="turtle") | ||
|
||
print(s.g.serialize(format="turtle")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print(s.g.serialize(format="turtle")) |
@@ -0,0 +1,254 @@ | |||
"""Test document""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not any more I guess
except (ValueError, TypeError): | ||
self.g.add((dataset_ref, predicate, Literal(value))) | ||
|
||
def _add_timeframe_triple(self, dataset_dict, dataset_ref): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like the same logic as the standard temporal_coverage
handling. If there's no change is best to delete the method to avoid duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the same logic (copy-pasted the lines you're mentioning), moved to a separate function instead of inline. I could also split it off to a separate function in that file to avoid duplication?
if agents: | ||
dataset_dict["hdab"] = agents | ||
|
||
# Add any qualifiedRelations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the qualified relationships handling (parsing and serializing), and AFAICT there is nothing health specific to it and could be moved to one of the base DCAT AP profiles right? (euro_dcat_ap_scheming.py would be the right place I think as this was added in DCAT 2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing health specific indeed, it's DCAT 2+. I will move it to the generic scheming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved it to the generic class, but did not have time anymore to add it to the test cases, sorry! It's still covered by HealthDCAT-AP test cases.
ckanext/dcat/tests/profiles/health_dcat_ap/test_euro_health_dcat_ap_profile_parse.py
Outdated
Show resolved
Hide resolved
ckanext/dcat/tests/profiles/health_dcat_ap/test_euro_health_dcat_ap_profile_parse.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Adrià Mercader <[email protected]>
Co-authored-by: Adrià Mercader <[email protected]>
Thanks for your insightful comments, Adrià! I will let you know when they're all addressed and ready for re-review. |
This PR adds a new profile class for the proposed HealthDCAT application profile. Closes #315
All mandatory properties are supported. Most recommended and optional properties are supported, too, with a few caveats (mainly related to unclarities of the specification).
There's a few minor issues still to sort out. Mainly, in some example data it seemed that geographical coverage (aka
dcat:spatial
) was not properly harvested, but that could also be an upstream issue.Mandatory information regarding funding: This project is co-funded by the European Union under Grant Agreement 101100633 (EUCAIM). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the granting authority can be held responsible for them. Funding was also provided by the Dutch National Growth Fund.