possible bug in SPARQL subqueries #607

Closed
pfps opened this issue Mar 15, 2016 · 8 comments
Labels: bug, SPARQL

@pfps

pfps commented Mar 15, 2016

I think that SPARQL subqueries are not being handled correctly. It appears that the wrong variables are projected from the subqueries, and that the wrong variables end up bound at the top level.

Here is a test file that shows the behaviour

# testgraph is an rdflib Graph holding the small test data (setup not shown)
query = """SELECT ?here ?there WHERE { ?here <http://www.w3.org/ns/sh#p> ?there }"""
print("\nDUMP property")
for row in testgraph.query(query):
    print(row)

print("\nTwo chained triples")
query = """SELECT ?here ?there ?elsewhere
WHERE { ?here  <http://www.w3.org/ns/sh#p> ?there .
        ?there <http://www.w3.org/ns/sh#p> ?elsewhere . }"""
for row in testgraph.query(query):
    print(row)

print("\nSubquery")
query = """SELECT *
WHERE { ?here  <http://www.w3.org/ns/sh#p> ?there .
        { SELECT ?there
          WHERE { ?there <http://www.w3.org/ns/sh#p> ?elsewhere . } } } """
for row in testgraph.query(query):
    print(row)

print("\nSubquery")
query = """SELECT ?here ?there
WHERE { ?here  <http://www.w3.org/ns/sh#p> ?there .
        { SELECT ?there
          WHERE { ?there <http://www.w3.org/ns/sh#p> ?elsewhere . } } } """
for row in testgraph.query(query):
    print(row)

print("\nSubquery with variable renaming and disconnected variables")
query = """SELECT ?here ?there
WHERE { ?here  <http://www.w3.org/ns/sh#p> ?there .
        { SELECT (?here AS ?there)
          WHERE { ?here <http://www.w3.org/ns/sh#p> ?there . } } } """
for row in testgraph.query(query):
    print(row)

and the output on a very small graph

DUMP property
(rdflib.term.URIRef(u'http://www.w3.org/ns/sh#a'), rdflib.term.URIRef(u'http://www.w3.org/ns/sh#b'))
(rdflib.term.URIRef(u'http://www.w3.org/ns/sh#x'), rdflib.term.URIRef(u'http://www.w3.org/ns/sh#y'))
(rdflib.term.URIRef(u'http://www.w3.org/ns/sh#b'), rdflib.term.URIRef(u'http://www.w3.org/ns/sh#c'))

Two chained triples
(rdflib.term.URIRef(u'http://www.w3.org/ns/sh#a'), rdflib.term.URIRef(u'http://www.w3.org/ns/sh#b'), rdflib.term.URIRef(u'http://www.w3.org/ns/sh#c'))

Subquery
(rdflib.term.URIRef(u'http://www.w3.org/ns/sh#b'), None)

Subquery
(None, rdflib.term.URIRef(u'http://www.w3.org/ns/sh#b'))

Subquery with variable renaming and disconnected variables
(None, rdflib.term.URIRef(u'http://www.w3.org/ns/sh#b'))
(None, rdflib.term.URIRef(u'http://www.w3.org/ns/sh#y'))
(None, rdflib.term.URIRef(u'http://www.w3.org/ns/sh#c'))

I believe that each of these should have a single solution, namely a projection of <a,b,c>.
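To make the expected behaviour concrete, here is a toy sketch of how a subquery is usually understood to evaluate: match the inner pattern, project only the selected variables, then join with the outer pattern on shared variables. This is an illustration in plain Python, not rdflib's implementation; the helper names `match`, `project` and `join` are made up for this example, and the terms are plain strings standing in for the URIs above.

```python
# Toy illustration of SPARQL subquery semantics; NOT rdflib's implementation.
# Triples mirror the small graph from the report; terms are plain strings.
TRIPLES = [("a", "p", "b"), ("x", "p", "y"), ("b", "p", "c")]


def match(subject_var, predicate, object_var):
    """Return one solution (a dict) per triple with the given predicate."""
    return [{subject_var: s, object_var: o}
            for (s, p, o) in TRIPLES if p == predicate]


def project(solutions, variables):
    """Keep only the listed variables, as SELECT does for a subquery."""
    return [{v: sol[v] for v in variables if v in sol} for sol in solutions]


def join(left, right):
    """Merge every pair of solutions that agree on their shared variables."""
    joined = []
    for lhs in left:
        for rhs in right:
            if all(lhs[v] == rhs[v] for v in lhs if v in rhs):
                merged = dict(lhs)
                merged.update(rhs)
                joined.append(merged)
    return joined


# { ?here sh:p ?there . { SELECT ?there WHERE { ?there sh:p ?elsewhere } } }
outer = match("here", "p", "there")
inner = project(match("there", "p", "elsewhere"), ["there"])
solutions = join(outer, inner)
print(solutions)  # [{'here': 'a', 'there': 'b'}]
```

Under this reading, only ?there escapes the subquery, so the join leaves the single solution ?here=a, ?there=b, and ?elsewhere is never bound at the top level.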

@joernhees
Member

hmm, i have trouble to reproduce and understand what is actually expected and what is failing...

would be cool if you could write a minimal test case e.g. like https://github.com/RDFLib/rdflib/blob/master/test/test_issue579.py

@pfps
Author

pfps commented Mar 15, 2016

Will do

On 03/15/2016 12:49 PM, Jörn Hees wrote:

hmm, i have trouble to reproduce and understand what is actually expected and
what is failing...

would be cool if you could write a minimal test case e.g. like
https://github.com/RDFLib/rdflib/blob/master/test/test_issue579.py



@pfps
Author

pfps commented Mar 15, 2016

OK, here is a file that shows an issue.

Hmm. I keep on getting "Something went really wrong, and we can't process that file", so I guess I'll just have to include it here.

from rdflib import Graph, Namespace

SH = Namespace("http://bug/")

# three triples: a -p-> b -p-> c, plus an unrelated x -p-> y
graph = Graph()
graph.bind('sh', SH)
graph.add((SH.a, SH.p, SH.b))
graph.add((SH.b, SH.p, SH.c))
graph.add((SH.x, SH.p, SH.y))

# finding a chain of length three

q1 = """SELECT ?here ?there ?elsewhere
WHERE { ?here  sh:p ?there .
        ?there sh:p ?elsewhere . }"""
r1 = graph.query(q1)
for row in r1:
    r1r = row  # remember the last row for the comparisons below
    print(row)

# also should find _and report_ the chain, but doesn't report correctly

q2 = """SELECT ?here ?there ?elsewhere
WHERE { ?here  sh:p ?there .
        { SELECT ?there ?elsewhere WHERE { ?there sh:p ?elsewhere . } } }"""
r2 = graph.query(q2)
for row in r2:
    r2r = row
    print(row)

# also should find _and report_ the chain, and does

q3 = """SELECT ?here ?there ?elsewhere
WHERE { { SELECT ?there ?elsewhere WHERE { ?there sh:p ?elsewhere . } }
    ?here  sh:p ?there . }"""
r3 = graph.query(q3)
for row in r3:
    r3r = row
    print(row)

# all three queries should bind the same values in the same positions
assert r1r[0] == r2r[0]
assert r1r[1] == r2r[1]
assert r1r[2] == r2r[2]

assert r3r[0] == r2r[0]
assert r3r[1] == r2r[1]
assert r3r[2] == r2r[2]
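The assertions above compare only the last row of each result. A possible alternative, sketched here under assumptions, is to compare the full results as multisets so that row order does not matter. The helper names `as_row_bag` and `same_solutions` are hypothetical, and plain tuples stand in for rdflib result rows (which are assumed to be tuple-like).

```python
from collections import Counter


def as_row_bag(result):
    """Collapse an iterable of result rows into a multiset of tuples."""
    return Counter(tuple(row) for row in result)


def same_solutions(*results):
    """True when every result holds the same rows, ignoring row order."""
    bags = [as_row_bag(r) for r in results]
    return all(bag == bags[0] for bag in bags[1:])


# Plain tuples stand in for rdflib result rows (assumption for illustration).
expected = [("a", "b", "c")]
reordered = [("a", "b", "c")]
misplaced = [("b", None, None)]  # hypothetical row with bindings in the wrong slots

print(same_solutions(expected, reordered))
print(same_solutions(expected, misplaced))
```

With real results one would pass `r1`, `r2` and `r3` directly, assuming each can be iterated at that point.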

@pfps
Author

pfps commented Mar 15, 2016

bug.txt

And now it works!

joernhees added a commit to joernhees/rdflib that referenced this issue Mar 18, 2016
joernhees added a commit to joernhees/rdflib that referenced this issue Mar 18, 2016
@joernhees
Member

i took the liberty to edit your comment ... have a look at markdown syntax please (it will take you 2 minutes and make your reports so much more readable)

i made a test to reproduce this in #610

@joernhees joernhees added the bug label Mar 18, 2016
@joernhees joernhees added this to the rdflib 4.2.2 milestone Mar 18, 2016
@pfps
Author

pfps commented Mar 22, 2016

Thanks for the editing. I tried to attach the bug code but it didn't work the first time, so I just shoved it in the message, which I guess triggered some markdown processing.

I have another report coming. I'll try to get it formatted better from the beginning.

@gromgull
Member

doh - I added the test twice. Well, if they now BOTH pass we know it worked! :D

@gromgull gromgull reopened this Jan 20, 2017
@gromgull
Member

total chaos. It worked locally - but I had some uncommitted changes. This will teach me not to do a PR.

joernhees added a commit that referenced this issue Jan 25, 2017
* master: (44 commits)
  quote cleanup OCD
  serializer/parser alias for 'ntriples'
  serializer/parser alias for 'ttl'
  cleanup
  remove outdated always skipped test
  a bit of changelog
  add a NTSerializer sub-class for nt11 (#700)
  Restrict normalization to unicode-compatible values (#674)
  fixes for turtle/trig namespace handling
  skip serialising empty default graph
  skip round-trip test, unfixable until 5.0
  prefix test for #428
  Added additional trig unit tests to highlight some currently occurring issues.
  remove ancient and broken 2.3 support code. (#681)
  updating deprecated testing syntax (#697)
  docs: clarify the use of an identifier when persisting a triplestore (#654)
  removing pyparsing version requirement (#696)
  made min/max aggregate functions support all literals (#694)
  actually fix projection from sub-queries
  added dawg tests for #607
  ...