Update SoDaNet endpoint details #277

Closed
cessda-bitbucket-importer opened this issue Feb 9, 2021 · 24 comments

Comments

@cessda-bitbucket-importer

Original report on BitBucket by John Shepherdson (GitHub: john-shepherdson).


From: Apostolos Linardis
Date: Fri, 5 Feb 2021 at 18:19
Subject: Sodanet OAI Dataverse Endpoint
To: John Shepherdson
Cc: konstantinos alexandris, Nikos Klironomos

Dear John,

I hope this email finds you well.
At SoDaNet, we are in the process of moving our OAI-PMH server from Nesstar to Dataverse. We are not ready yet, but we need to run some tests against our new OAI endpoint to identify any problems. For this reason, we need our new pilot records to be harvested by the CESSDA development harvester (https://datacatalogue-dev.cessda.eu/).
Our new OAI endpoint is https://datacatalogue.sodanet.gr/oai
and our new catalogue is available here: https://datacatalogue.sodanet.gr/

The "list all records" request is available here: https://datacatalogue.sodanet.gr/oai?verb=ListRecords&metadataPrefix=oai_ddi
We have already validated our DDI (ddi:codebook:2_5) with the CESSDA validation tool and have no validation errors.
I include my colleagues Kostis and Nikos in the email chain, since they are key contributors to the aforementioned customisations and implementations.

Please inform us how we proceed. 

Best regards.

Dr. Apostolos (Tolis)  Linardis
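
For readers unfamiliar with OAI-PMH, the request in the email above is an ordinary HTTP GET whose query string carries a `verb` and its arguments. The following is a minimal sketch only, not the CESSDA harvester's code: the helper names `oai_url` and `parse_list_records` are hypothetical, and it assumes the endpoint returns standard OAI-PMH 2.0 responses that may include a `resumptionToken` for paging.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def oai_url(base_url, verb, **params):
    """Build an OAI-PMH request URL (hypothetical helper)."""
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty token means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Reproduces the "list all records" URL quoted in the email.
print(oai_url("https://datacatalogue.sodanet.gr/oai", "ListRecords", metadataPrefix="oai_ddi"))
```

A harvester would typically repeat the request with `resumptionToken=<token>` until no token is returned.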

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Tolis,

After the release of CDC v2.3 this week (which will include your Nesstar endpoint), we will harvest your Dataverse endpoint via dev and staging.

Regards,

John

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Now that v2.3.0 has been released, these changes can be made

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


See commit bc56e16

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Need to remove records from NESSTAR endpoint and re-harvest

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Indices wiped for both DEV and Staging

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Tolis,

We've updated the SoDaNet endpoint details in our test instance, cleared the Elasticsearch indices and re-harvested, so now we only have your Dataverse records and not your Nesstar ones (as agreed, the production instance is harvesting your Nesstar endpoint).

An incremental harvest takes place in the early hours of every morning except Sunday (full harvest), so any published changes that you make to your records should be visible in the catalogue at the start of the next working day.

See https://datacatalogue-staging.cessda.eu/

Username:

Password:

Regards,

John
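
The harvest schedule described in the message above (incremental every morning, full on Sunday) can be illustrated with a few lines of Python. This is a sketch under that stated schedule only; the function name `harvest_type` is hypothetical and the real scheduling mechanism is not shown in this thread.

```python
from datetime import date


def harvest_type(day: date) -> str:
    """Return which kind of harvest runs on the given day (hypothetical helper)."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    return "full" if day.weekday() == 6 else "incremental"


print(harvest_type(date(2021, 2, 5)))  # a Friday
print(harvest_type(date(2021, 2, 7)))  # a Sunday
```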

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Not ready for production yet

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Kostis,

My apologies, we needed to patch the production instance recently, so had to revert to your Nesstar endpoint temporarily. I missed the pull request to put your Dataverse endpoint back into the test and staging environments (I have mostly been on leave since then). This has now been remedied and you should be able to see the records from your Dataverse endpoint in staging (https://datacatalogue-staging.cessda.eu) from tomorrow morning.

Let me know if there are any further problems.

Regards,

John

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


@‌TainaFSD Please check SoDaNet records in Staging, and sign-off (or otherwise) inclusion of new endpoint in a future release.

@cessda-bitbucket-importer

Original comment by Taina Jääskeläinen.


This metadata record in Greek still appears in the English catalogue:

‘Access data’ buttons do not seem to go anywhere. The study URL (i.e. the study description on the SP website) is probably missing; CDC harvests it from the element:

  • /codeBook/stdyDscr/citation/holdings/@URI

@‌cfredzu1

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Dear John, 

First of all, thank you for your support. Everything works fine with our DOIs from our pilot Dataverse instance. I have one more question concerning the harvesting process.

Concerning our new records (in Dataverse), although they have the "Terms of data access" field filled in, this is not harvested by your platform.

For example, in the record https://datacatalogue.sodanet.gr/oai?verb=GetRecord&identifier=doi:10.17903/FK2/U7FEOQ&metadataPrefix=oai_ddi you will observe that there is a section with the "terms" fields (see image below)

Although the field dataAccs above is filled in for a few records, this is not harvested by your platform ("Field is not available" message appears instead in your UI for this specific study).

Is there something that we should do? 

Thank you in advance,

Kostis

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).



@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


@matthew-morris-cessda Please have a look re missing field

@cessda-bitbucket-importer

Original comment by Matthew Morris (GitHub: matthew-morris-cessda).


The XPath used for the terms of data access is `//ddi:codeBook//ddi:stdyDscr/ddi:dataAccs/ddi:useStmt/ddi:restrctn`, whereas the XPath being used here is `//ddi:codeBook//ddi:stdyDscr/ddi:dataAccs`.
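
To illustrate why the field comes back empty, here is a minimal sketch using Python's standard library. It is not the Data Catalogue's code. Both sample documents are hypothetical, and in particular the `notes` child in the second one is only an assumed stand-in for "content somewhere under `dataAccs` but not at `useStmt/restrctn`"; the actual SoDaNet export may look different.

```python
import xml.etree.ElementTree as ET

NS = {"ddi": "ddi:codebook:2_5"}

# Relative form of the catalogue's XPath; it matches whether codeBook is the
# document root or nested inside an OAI-PMH <metadata> element.
CATALOGUE_PATH = ".//ddi:stdyDscr/ddi:dataAccs/ddi:useStmt/ddi:restrctn"


def terms_of_access(xml_text):
    """Return the non-empty text values found at the catalogue's XPath."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.findall(CATALOGUE_PATH, NS)
        if el.text and el.text.strip()
    ]


# Hypothetical record with the terms where the catalogue looks for them.
EXPECTED_SHAPE = (
    '<codeBook xmlns="ddi:codebook:2_5"><stdyDscr><dataAccs><useStmt>'
    "<restrctn>Registered users only</restrctn>"
    "</useStmt></dataAccs></stdyDscr></codeBook>"
)

# Hypothetical record with the terms elsewhere under dataAccs.
OTHER_SHAPE = (
    '<codeBook xmlns="ddi:codebook:2_5"><stdyDscr><dataAccs>'
    "<notes>Registered users only</notes>"
    "</dataAccs></stdyDscr></codeBook>"
)

print(terms_of_access(EXPECTED_SHAPE))
print(terms_of_access(OTHER_SHAPE))
```

The second call finds no matching element, which would correspond to the "Field is not available" message reported for these records.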

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Kostis,

The XPath used by the Data Catalogue (as specified by the CESSDA Metadata Model) for the terms of data access is 
//ddi:codeBook//ddi:stdyDscr/ddi:dataAccs/ddi:useStmt/ddi:restrctn

whereas the XPath used in your records is 
//ddi:codeBook//ddi:stdyDscr/ddi:dataAccs.

Regards,

John

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


@matthew-morris-cessda Please check the XPath used by AUSSDA and PROGEDO for data access. There may be an issue with Dataverse community custom and practice here.

@cessda-bitbucket-importer

Original comment by Matthew Morris (GitHub: matthew-morris-cessda).


Good spot. It turns out that AUSSDA made the same mistake as SoDaNet and uses //ddi:codeBook//ddi:stdyDscr/ddi:dataAccs.


On the other hand, PROGEDO has no further tags beyond <setAvail/>, and there is no content present in these tags.


@cessda-bitbucket-importer

Original comment by Matthew Morris (GitHub: matthew-morris-cessda).


See https://cessda.atlassian.net/browse/CDCSUP-7

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


From: konstantinos alexandris 
Date: Tue, 13 Apr 2021 at 21:13
Subject: Re: Sodanet OAI Dataverse Endpoint
To: John Shepherdson
Cc: Nikos Klironomos, Apostolos Linardis

Hi John,

Thank you so much. We are already covered by your latest mail. We have only one pending issue concerning the "Terms of data access" field in the CESSDA Data Catalogue, which remains empty for our Dataverse records (due to a different XPath in the DDI XML).

Thank you once again,

Kostis

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


@‌TainaFSD As the empty “Terms of data access" field is common to all records from Dataverse endpoints (unless some post-processing has been applied) due to XPath differences, are you happy for us to include this endpoint in the release?

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Latest CMV tests (BASIC validation gate, DDI2.5 PROFILE 1.0.4)

http://harvester.sodanet.gr:8080/oai/request?verb=GetRecord&identifier=http://nesstar-server.sodanet.gr:80/obj/fStudy/ekke-1-0043-GR&metadataPrefix=oai_ddi

'/codeBook/@xsi:schemaLocation' is mandatory
'/codeBook/stdyDscr/citation/titlStmt/titl' is mandatory
'/codeBook/stdyDscr/citation/titlStmt/titl/@xml:lang' is mandatory
'/codeBook/stdyDscr/citation/titlStmt/IDNo' is mandatory
'/codeBook/stdyDscr/citation/titlStmt/IDNo/@agency' is mandatory
'/codeBook/stdyDscr/citation/holdings/@URI' is mandatory
'/codeBook/stdyDscr/citation/distStmt/distrbtr' is mandatory
'/codeBook/stdyDscr/citation/distStmt/distrbtr/@xml:lang' is mandatory
'/codeBook/stdyDscr/stdyInfo/abstract' is mandatory
'/codeBook/stdyDscr/stdyInfo/abstract/@xml:lang' is mandatory
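
A rough local pre-check for the mandatory paths listed above could look like the sketch below. It is not the CMV and makes several simplifying assumptions: it only tests that each element or attribute is present at least once (not that it is non-empty or present on every occurrence), the names `missing_paths` and `SAMPLE_RECORD` are hypothetical, and the sample record is invented for illustration.

```python
import xml.etree.ElementTree as ET

DDI = "ddi:codebook:2_5"
ATTR_PREFIXES = {
    "xml": "http://www.w3.org/XML/1998/namespace",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}

# The paths reported as mandatory in the CMV output above.
MANDATORY = [
    "/codeBook/@xsi:schemaLocation",
    "/codeBook/stdyDscr/citation/titlStmt/titl",
    "/codeBook/stdyDscr/citation/titlStmt/titl/@xml:lang",
    "/codeBook/stdyDscr/citation/titlStmt/IDNo",
    "/codeBook/stdyDscr/citation/titlStmt/IDNo/@agency",
    "/codeBook/stdyDscr/citation/holdings/@URI",
    "/codeBook/stdyDscr/citation/distStmt/distrbtr",
    "/codeBook/stdyDscr/citation/distStmt/distrbtr/@xml:lang",
    "/codeBook/stdyDscr/stdyInfo/abstract",
    "/codeBook/stdyDscr/stdyInfo/abstract/@xml:lang",
]


def _descend(root, names):
    """Follow a chain of DDI child element names from root; return the matches."""
    current = [root]
    for name in names:
        current = [c for el in current for c in el.findall(f"{{{DDI}}}{name}")]
    return current


def missing_paths(xml_text, paths=MANDATORY):
    """Return the paths with no matching element or attribute (presence only)."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in paths:
        steps = path.strip("/").split("/")
        attr = steps.pop()[1:] if steps[-1].startswith("@") else None
        if root.tag != f"{{{DDI}}}{steps[0]}":
            missing.append(path)
            continue
        elements = _descend(root, steps[1:])
        if attr is None:
            found = bool(elements)
        else:
            prefix, _, local = attr.rpartition(":")
            key = f"{{{ATTR_PREFIXES[prefix]}}}{local}" if prefix else local
            found = any(key in el.attrib for el in elements)
        if not found:
            missing.append(path)
    return missing


# Invented record that contains only a language-tagged title.
SAMPLE_RECORD = (
    '<codeBook xmlns="ddi:codebook:2_5"><stdyDscr><citation><titlStmt>'
    '<titl xml:lang="en">Example study</titl>'
    "</titlStmt></citation></stdyDscr></codeBook>"
)

for path in missing_paths(SAMPLE_RECORD):
    print(f"'{path}' is mandatory")
```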

@cessda-bitbucket-importer

Original comment by Taina Jääskeläinen.


Yes, please include the endpoint.

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


See also issue #345

@cessda-bitbucket-importer

Original comment by John Shepherdson (GitHub: john-shepherdson).


Included in release
