Resolve Wikidata URIs via LDF endpoint and LC URIs using HEAD requests #921

Merged
merged 10 commits into from
Feb 7, 2020

osma (Member) commented Feb 5, 2020

This PR refactors and modularizes the lookup/resolution mechanism for remote URIs, that is, URIs that Skosmos knows nothing about.

There is a new Resolver class which knows which resolution mechanism to use for a particular URI. Wikidata URIs are looked up via the LDF endpoint (fixes #912); HEAD requests are used for LCSH and other LC vocabularies (fixes #915); and for everything else, a generic Linked Data mechanism (just retrieve the URI and ask for RDF) is used, as before.

Unfortunately, it turns out that the Wikidata LDF endpoint cannot currently handle language tags. I've reported this to the Wikidata list. If the issue isn't fixed soon, that mechanism will have to be disabled for now; the Wikidata SPARQL endpoint could perhaps be used instead, although it may be less reliable than the LDF endpoint.

I'm keeping this just as a draft PR for now, until there is more clarity on the LDF endpoint issue. I just want to see the automated QA reports.
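
As a rough illustration of the dispatch described in this comment, here is a minimal sketch of how a resolver might choose a mechanism from the URI prefix. It is not the PR's actual `Resolver` class: the function names, the return labels and the `http://id.loc.gov/` prefix are assumptions made for the example; only the Wikidata prefix appears in the diff.

```php
<?php
declare(strict_types=1);

// Only the Wikidata prefix appears in the PR's diff; the LC prefix is assumed.
const WIKIDATA_PREFIX = 'http://www.wikidata.org/entity/';
const LOC_PREFIX = 'http://id.loc.gov/';

/** Return true if $string begins with $prefix. */
function startsWith(string $prefix, string $string): bool
{
    return strncmp($string, $prefix, strlen($prefix)) === 0;
}

/**
 * Pick a resolution mechanism for a remote URI.
 * The labels 'head', 'ldf' and 'linked-data' are invented for this sketch.
 */
function chooseMechanism(string $uri): string
{
    if (startsWith(LOC_PREFIX, $uri)) {
        return 'head';        // LC vocabularies: HEAD request
    }
    if (startsWith(WIKIDATA_PREFIX, $uri)) {
        return 'ldf';         // Wikidata: LDF endpoint
    }
    return 'linked-data';     // everything else: retrieve the URI and ask for RDF
}

echo chooseMechanism('http://www.wikidata.org/entity/Q42'), PHP_EOL;
```

Keeping the prefix checks in one place means a new source-specific mechanism only needs one more branch, with the generic Linked Data lookup as the fallback.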

@osma osma added this to the 2.3 milestone Feb 5, 2020
osma (Member, Author) commented Feb 5, 2020

Codacy: Here is an overview of what got changed by this pull request:

Issues
======
+ Solved 5
- Added 10
           

Complexity increasing per file
==============================
- model/ConceptPropertyValue.php  1
- model/resolver/Resolver.php  3
- model/resolver/LDFResource.php  2
- model/resolver/LOCResource.php  4
- model/resolver/LinkedDataResource.php  2
         

See the complete overview on Codacy

$res = new LOCResource($uri);
} elseif ($this->startsWith('http://www.wikidata.org/entity/', $uri)) {
$res = new LDFResource($uri, 'https://query.wikidata.org/bigdata/ldf');
} else {
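
The fragment above hands the Wikidata LDF endpoint URL to `LDFResource`. As a hedged sketch of what a request to such an endpoint could look like, the helper below builds a Triple Pattern Fragments style URL. The helper name is hypothetical, and it assumes the endpoint accepts `subject` and `predicate` query parameters; it does not reproduce the PR's implementation.

```php
<?php
declare(strict_types=1);

/**
 * Build a Triple Pattern Fragments request URL (hypothetical helper).
 * Assumes the endpoint accepts 'subject' and 'predicate' query parameters.
 */
function ldfRequestUrl(string $endpoint, string $subject, ?string $predicate = null): string
{
    $params = ['subject' => $subject];
    if ($predicate !== null) {
        $params['predicate'] = $predicate;
    }
    // RFC 3986 encoding percent-encodes ':', '/' and '#' inside the IRIs
    return $endpoint . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

echo ldfRequestUrl(
    'https://query.wikidata.org/bigdata/ldf',
    'http://www.wikidata.org/entity/Q42',
    'http://www.w3.org/2000/01/rdf-schema#label'
), PHP_EOL;
```

A pattern like this asks for every value of the predicate rather than a literal with a specific language tag, so any language selection would have to happen on the client side.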

// change the timeout setting for external requests
$httpclient = EasyRdf\Http::getDefaultHttpClient();
$httpclient->setConfig(array('timeout' => $timeout));
EasyRdf\Http::setDefaultHttpClient($httpclient);

// 1. Unregister the legacy RDF/JSON parser, we don't want to use it
EasyRdf\Format::unregister('json');
// 2. Add "application/json" as a possible MIME type for the JSON-LD format
$jsonld = EasyRdf\Format::getFormat('jsonld');



public function resolve(int $timeout) : ?EasyRdf\Resource {
try {
// change the timeout setting for external requests
$httpclient = EasyRdf\Http::getDefaultHttpClient();

$httpclient->setConfig(array('timeout' => $timeout));
EasyRdf\Http::setDefaultHttpClient($httpclient);

$graph = EasyRdf\Graph::newAndLoad(EasyRdf\Utils::removeFragmentFromUri($this->uri));

{
if ($this->isExternal()) {
return $this->vocab;
} else {

public function resolve(int $timeout) : ?EasyRdf\Resource {
// prevent parsing errors for sources which return invalid JSON (see #447)
// 1. Unregister the legacy RDF/JSON parser, we don't want to use it
EasyRdf\Format::unregister('json');

'user_agent' => 'Skosmos',
'timeout' => $timeout));
$context = stream_context_create($opts);
$fd = fopen($this->uri, 'rb', false, $context);
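
The fragment above opens the URI through a stream context. As a sketch of how a HEAD-based label lookup might be completed, the helpers below build the context options and pick one header out of raw response header lines. Both helper names are hypothetical, and `X-PrefLabel` is an assumed header name used only for illustration; the example runs on canned headers rather than a live request.

```php
<?php
declare(strict_types=1);

/** Stream context options for a HEAD request (mirrors the fragment's keys). */
function headRequestOptions(int $timeout): array
{
    return ['http' => [
        'method' => 'HEAD',
        'user_agent' => 'Skosmos',
        'timeout' => $timeout,
    ]];
}

/**
 * Find a header value in raw header lines such as those exposed by
 * stream_get_meta_data($fd)['wrapper_data']. Matching is case-insensitive;
 * the first matching line wins.
 */
function findHeader(array $headerLines, string $name): ?string
{
    foreach ($headerLines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// Canned response headers; 'X-PrefLabel' is an assumed header name.
$headers = ['HTTP/1.1 200 OK', 'X-PrefLabel: Cats', 'Content-Type: text/html'];
var_dump(findHeader($headers, 'x-preflabel'));
```

In a live lookup, the options would be passed to `stream_context_create()` and the resulting context to `fopen()`, as in the fragment above, so that only headers are transferred instead of a full RDF document.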

codecov-io commented Feb 7, 2020

Codecov Report

Merging #921 into master will decrease coverage by 6.07%.
The diff coverage is 34.4%.

@@             Coverage Diff              @@
##             master     #921      +/-   ##
============================================
- Coverage     65.76%   59.69%   -6.08%     
- Complexity     1469     1488      +19     
============================================
  Files            27       32       +5     
  Lines          3637     4037     +400     
============================================
+ Hits           2392     2410      +18     
- Misses         1245     1627     +382

| Impacted Files | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| model/resolver/WDQSResource.php | 0% <0%> (ø) | 3 <3> (?) | |
| model/Concept.php | 81.13% <0%> (ø) | 184 <0> (ø) | ⬇️ |
| model/resolver/LOCResource.php | 0% <0%> (ø) | 6 <6> (?) | |
| model/ConceptMappingPropertyValue.php | 72.8% <100%> (-8.57%) | 41 <0> (ø) | |
| model/Model.php | 83.33% <100%> (ø) | 105 <0> (-2) | ⬇️ |
| model/resolver/RemoteResource.php | 100% <100%> (ø) | 1 <1> (?) | |
| model/ConceptPropertyValue.php | 79.51% <36.36%> (-5.62%) | 46 <3> (+2) | |
| model/resolver/LinkedDataResource.php | 73.33% <73.33%> (ø) | 3 <3> (?) | |
| model/resolver/Resolver.php | 91.66% <91.66%> (ø) | 6 <6> (?) | |
| model/sparql/GenericSparql.php | 68.07% <0%> (-24.81%) | 302% <0%> (ø) | |

... and 10 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d81f9d8...d90d0d5.

@osma osma force-pushed the issue912-lookup-wikidata-ldf branch from 264ce87 to 63932b8 Compare February 7, 2020 07:44
@osma osma closed this Feb 7, 2020
@osma osma reopened this Feb 7, 2020
sonarqubecloud bot commented Feb 7, 2020

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A), with 2 Security Hotspots to review
Code Smells: 0 (rating A)
Coverage: no coverage information
Duplication: 0.0%

@osma osma marked this pull request as ready for review February 7, 2020 12:27
@osma osma merged commit c13dee7 into master Feb 7, 2020
@osma osma deleted the issue912-lookup-wikidata-ldf branch February 7, 2020 12:28
osma added a commit that referenced this pull request Feb 10, 2020
osma added a commit that referenced this pull request Feb 10, 2020
fix some test coverage annotations after PR #921
Successfully merging this pull request may close these issues:

Faster lookup of LCSH labels using HEAD requests
Faster lookup of Wikidata URIs using LDF endpoint