Allow blank nodes to be reused across expressions #34

Closed
RubenVerborgh opened this issue Aug 29, 2019 · 13 comments
@RubenVerborgh
Member

RubenVerborgh commented Aug 29, 2019

Currently, a blank node resulting from one expression cannot be reused in another. Following the example of https://github.com/solid/query-ldflex/issues/33, the following snippets return different results:

// Snippet 1: a single expression
const alice = "https://drive.verborgh.org/public/2019/blanks.ttl#Alice";
for await (const name of solid.data[alice].friends.name)
  console.log(`${name}`);

// Snippet 2: two expressions, reusing each friend
const alice = "https://drive.verborgh.org/public/2019/blanks.ttl#Alice";
for await (const friend of solid.data[alice].friends)
  console.log(`${await friend.name}`);

This happens because Alice's friends are identified by blank nodes, which lose their context across multiple expressions. If we retry both snippets with Alice2, whose friends are identified by IRIs, both return the same results.


We could strive to reuse blank nodes across expressions by skolemizing them internally. Here is a sketch of how that could work:

  • When outputting blank nodes, Comunica assigns an internal identifier to them. For instance, _:b1 is still output as a BlankNode, but has a special internal field .skolemized that contains urn:skolem:1234.
  • When a SPARQL query is generated from such a skolemized blank node, the skolemized IRI is used instead of a blank node.
  • When returning results, any skolemized NamedNode is turned into a skolemized BlankNode.
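The three steps above can be sketched roughly as follows. This is only an illustration on plain RDF/JS-style term objects: the helper names (skolemize, toSparqlTerm, deskolemize), the urn:skolem: prefix, and the regenerated blank node label are assumptions for this example, not existing Comunica or LDflex APIs.

```javascript
// Hypothetical helpers; names and the urn:skolem: prefix are illustrative only.
const SKOLEM_PREFIX = 'urn:skolem:';

// Step 1: on output, keep the BlankNode but attach a .skolemized NamedNode.
function skolemize(blankNode, id) {
  return {
    ...blankNode,
    skolemized: { termType: 'NamedNode', value: `${SKOLEM_PREFIX}${id}` },
  };
}

// Step 2: when generating a SPARQL query, use the skolem IRI if there is one.
// This sketch only handles named nodes and blank nodes.
function toSparqlTerm(term) {
  if (term.termType === 'BlankNode' && term.skolemized)
    return `<${term.skolemized.value}>`;
  if (term.termType === 'NamedNode')
    return `<${term.value}>`;
  return `_:${term.value}`;
}

// Step 3: when returning results, turn skolem IRIs back into blank nodes.
// The original label is not recoverable here, so a new one is derived (assumption).
function deskolemize(term) {
  if (term.termType === 'NamedNode' && term.value.startsWith(SKOLEM_PREFIX)) {
    const id = term.value.slice(SKOLEM_PREFIX.length);
    return skolemize({ termType: 'BlankNode', value: `b${id}` }, id);
  }
  return term;
}

const friend = skolemize({ termType: 'BlankNode', value: 'b1' }, '1234');
console.log(toSparqlTerm(friend)); // <urn:skolem:1234>
const back = deskolemize({ termType: 'NamedNode', value: 'urn:skolem:1234' });
console.log(back.termType); // BlankNode
```

The term stays a BlankNode for consumers, while the query generator sees a stable IRI it can send back to the engine.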

The key is inserting skolemization and deskolemization processing in the right place, for which I need to ask @rubensworks for help.

We could simply skolemize upon parsing, and then deskolemize right before results are returned. This works in all cases, except when Comunica directly operates on a store (the contents of which it did not parse, so it can contain actual blank nodes).

An alternative approach is a skolemizing store wrapper. It takes a store as an argument, and translates on the fly in its match etc. methods.

Perhaps both approaches can be used in conjunction: skolemization in parsers for all cases, except when a store is passed, in which case we wrap it.
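A minimal sketch of what such a store wrapper might look like. Everything here is hypothetical: the SkolemizingStore class, the urn:skolem: prefix, and the sample data are made up for illustration, and the wrapped store is assumed to return a plain array from match(), whereas real RDF/JS stores return streams or datasets.

```javascript
// Hypothetical wrapper that translates blank nodes <-> skolem IRIs on the fly.
class SkolemizingStore {
  constructor(store, prefix = 'urn:skolem:') {
    this.store = store;
    this.prefix = prefix;
  }

  // Outgoing terms: blank node -> skolem IRI.
  skolemize(term) {
    return term.termType === 'BlankNode'
      ? { termType: 'NamedNode', value: this.prefix + term.value }
      : term;
  }

  // Incoming pattern terms: skolem IRI -> blank node; null wildcards pass through.
  deskolemize(term) {
    return term && term.termType === 'NamedNode' && term.value.startsWith(this.prefix)
      ? { termType: 'BlankNode', value: term.value.slice(this.prefix.length) }
      : term;
  }

  match(subject, predicate, object, graph) {
    const pattern = [subject, predicate, object, graph].map(t => this.deskolemize(t));
    return this.store.match(...pattern).map(quad => ({
      subject: this.skolemize(quad.subject),
      predicate: quad.predicate,
      object: this.skolemize(quad.object),
      graph: quad.graph,
    }));
  }
}

// Tiny in-memory store with made-up data, for demonstration only.
const named = value => ({ termType: 'NamedNode', value });
const blank = value => ({ termType: 'BlankNode', value });
const sameTerm = (a, b) => !a || (a.termType === b.termType && a.value === b.value);
const quads = [
  { subject: named('#Alice'), predicate: named('foaf:knows'), object: blank('b1'), graph: null },
  { subject: blank('b1'), predicate: named('foaf:name'), object: { termType: 'Literal', value: 'Bob' }, graph: null },
];
const memoryStore = {
  match: (s, p, o) => quads.filter(q =>
    sameTerm(s, q.subject) && sameTerm(p, q.predicate) && sameTerm(o, q.object)),
};

const wrapped = new SkolemizingStore(memoryStore);
const [knows] = wrapped.match(named('#Alice'), null, null, null);
// The skolem IRI from the first lookup can be reused in a second lookup.
const [name] = wrapped.match(knows.object, null, null, null);
console.log(knows.object.value, name.object.value); // urn:skolem:b1 Bob
```

The wrapped store never changes; the translation happens only at the match() boundary, which is why this variant would also work for stores whose contents Comunica did not parse itself.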

@RubenVerborgh RubenVerborgh added the enhancement New feature or request label Aug 29, 2019
@RubenVerborgh RubenVerborgh changed the title from "Allow blank nodes do be reused across expressions" to "Allow blank nodes to be reused across expressions" Aug 29, 2019
@rubensworks
Contributor

Skolemization does indeed sound like the right way to go here.

This issue in Comunica would be a requirement for this: comunica/comunica#355
And (AFAICS) that's probably everything that needs to be done, as all requests go via the federated actor.

@justinwb

justinwb commented Mar 9, 2020

Checking to see if this issue has been slated for work in the near term? We're running into some use cases that are blocked by this.

@RubenVerborgh
Member Author

@justinwb Will schedule it in @rubensworks' agenda.

@RubenVerborgh
Member Author

Or maybe also @joachimvh, they can decide who is most appropriate.

@rubensworks
Contributor

I've started working on the solution described in comunica/comunica#355.

This will mean that Comunica will not output any blank nodes anymore originating from sources (unless enforced by SPARQL via BNODE()). (The solution you describe would solve the problem here, but not the problem described in comunica/comunica#355)

As far as I know, this shouldn't be a problem for users downstream, and it should be SPARQL spec-compliant.
@RubenVerborgh Do you agree?

@RubenVerborgh
Member Author

> This will mean that Comunica will not output any blank nodes anymore originating from sources

That by itself seems too strong? Should at least be a switch (off by default to have correct query semantics)?

> SPARQL spec-compliant

Do we have evidence? Perhaps something @Dexagod might want to dive into?

@rubensworks
Contributor

Hmm, I just realized there is an isBlank function. If we'd skolemize everything, this would definitely change the semantics of the query, which is not what we want.
So skolemizing everything may indeed be too radical. The internal .skolemized field may be a better solution.
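To illustrate the difference, here is a small sketch using a simplified stand-in for isBlank on RDF/JS-style term objects. The term values and the urn:skolem: IRI are made up for the example.

```javascript
// Simplified stand-in for SPARQL's isBlank, checking only the term type.
const isBlank = term => term.termType === 'BlankNode';

const original = { termType: 'BlankNode', value: 'b1' };

// Option A: replace the blank node by a skolem IRI everywhere.
const replaced = { termType: 'NamedNode', value: 'urn:skolem:1234' };

// Option B: keep the blank node and attach an internal .skolemized field.
const annotated = { ...original, skolemized: replaced };

console.log(isBlank(original), isBlank(replaced), isBlank(annotated)); // true false true
```

With option A, isBlank no longer holds for a term that was a blank node in the source; with option B it still does.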

rubensworks added a commit to comunica/comunica that referenced this issue Mar 16, 2020
When federating over multiple sources,
the outcoming blank nodes coming from each source
will receive a distinct blank node,
so that they can not be joined across different sources,
even if they have the same blank node label.

A reverse translation also takes place for incoming queries with blank nodes,
so that these blank nodes will only match if they come from that source.

Blank nodes coming from sources will receive a .skolemized field
containing a named node.
This named node can be queried again as an IRI, and this will
be interpreted by Comunica as a blank node corresponding to the proper
source, assuming that the array of sources remains the same.

Closes #355
Required for LDflex/Query-Solid#34
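The behavior described in this commit message could be sketched as below. This is not Comunica's actual code: the function names, the urn:example:skolem:source_ IRI format, and the per-source label scheme are assumptions chosen for illustration.

```javascript
// Hypothetical IRI format: urn:example:skolem:source_<index>:<label>
const FED_PREFIX = 'urn:example:skolem:source_';

// Outgoing: give the blank node a label that is distinct per source,
// and attach a .skolemized named node that remembers the source index.
function skolemizeForSource(blankNode, sourceIndex) {
  return {
    termType: 'BlankNode',
    value: `src${sourceIndex}_${blankNode.value}`,
    skolemized: {
      termType: 'NamedNode',
      value: `${FED_PREFIX}${sourceIndex}:${blankNode.value}`,
    },
  };
}

// Incoming: translate a skolem IRI back to the source's own blank node.
// Returns null when the IRI belongs to a different source (no match possible),
// and returns other terms unchanged.
function deskolemizeForSource(term, sourceIndex) {
  if (term.termType !== 'NamedNode' || !term.value.startsWith(FED_PREFIX))
    return term;
  const rest = term.value.slice(FED_PREFIX.length);
  const colon = rest.indexOf(':');
  const owner = Number(rest.slice(0, colon));
  if (owner !== sourceIndex)
    return null;
  return { termType: 'BlankNode', value: rest.slice(colon + 1) };
}

// The same label from two sources yields two distinct blank nodes.
const fromA = skolemizeForSource({ termType: 'BlankNode', value: 'b1' }, 0);
const fromB = skolemizeForSource({ termType: 'BlankNode', value: 'b1' }, 1);
console.log(fromA.value === fromB.value); // false

// The skolem IRI only resolves against the source it came from.
console.log(deskolemizeForSource(fromA.skolemized, 0)); // blank node b1
console.log(deskolemizeForSource(fromA.skolemized, 1)); // null
```

Because the source is identified by its position, this sketch shares the caveat from the commit message: the array of sources has to remain the same between queries.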
@justinwb

Saw a commit go in at comunica/comunica#624 - is the expectation that this will provide the full resolution or is there now additional work to do in ldflex to take advantage of the changes?

@RubenVerborgh
Member Author

@justinwb That should be it, mostly. We'd still need to add LDflex support for the .skolemized field for some cases, and test.

@rubensworks
Contributor

Comunica 1.11.0 has now been released with this new feature, so implementing .skolemized support into LDflex should be possible now.

@justinwb

justinwb commented Apr 6, 2020

> Comunica 1.11.0 has now been released with this new feature, so implementing .skolemized support into LDflex should be possible now.

Awesome, thanks for the update! Is anyone slotted to add .skolemized support to LDflex at this point, or is it still on the backlog?

@RubenVerborgh
Member Author

Still on backlog currently; will discuss with the team.

@RubenVerborgh
Member Author

@justinwb You can follow progress in #64
System test with the above example was added; presently hitting a skolemized issue.

4 participants