-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor: Optimize the generated SPARQL queries #124
Refactor: Optimize the generated SPARQL queries #124
Conversation
include_bnodes can also be found in the solution mapper
Hi @mdorf, I think you are the one working on this part of the project. Could you please give me your opinion about this? |
@syphax-bouazzouni, the solution appears to be sound based on the description. But, honestly, it's a bit difficult for me to provide a constructive feedback based on the number of commits. I need to understand how the original issue affected our system and how the solution addresses it. I wonder if this is a good topic to review during our face-to-face workshop in Montpellier in September. |
Also, we need to make sure that ncbo/ontologies_linked_data and ncbo/ontologies_api tests pass with AllegroGraph with this change in place. |
I do agree in principle to this. However, I don't think this should be a stopper (or slower) for not taking the change for 4store. Basically we expect this to fix the call : |
Yeah, I understand. For this kind of matter, I am totally available to do specific meetings to speak about it.
For AllegroGraph, we have not yet set up an environment to test it, but naively I will say that it should also work for it because its basic SPARQL queries that didn't change a lot from before.
@jonquet for now the code for 4store and AllegroGraph are the same, so we can't isolate this improvement for only 4store |
Cc @vemonet with who we originally found the bug! And congrats to @syphax-bouazzouni for such an investigation and fix. |
Wow nice to see it finally got resolved for everyone! Good job @syphax-bouazzouni, that was a consequent task |
Preamble
This PR includes this
And require this
Starting issues
The generated SPARQL query to fetch data is not optimized because in the select clause we print all the attributes which causes the result to have a lot of rows (the number of rows is equal to the multiplication of the number of results of each attribute)
For example, if we have for a resource (e.g a Class) 3 attributes (e.g prefLabel, synonyms, definitions) with the following values:
prefLabel = prefLabel
synonyms = [syn_1, syn_2, syn_3]
definitions = [def_1, def_2]
We will get the following request results :
| prefLabel | syn_1 | def_1 |
| prefLabel | syn_2 | def_1 |
| prefLabel | syn_3 | def_1 |
| prefLabel | syn_1 | def_2 |
| prefLabel | syn_2 | def_2 |
| prefLabel | syn_3 | def_2 |
With this new implementation, we will have the following results
prefLabel
syn_1
syn_2
syn_3
def_1
def_2
Solution
We will transform all the requests to
And for inverse attributes, we will use the following request
Solution issue
Theoretically, it should work, but the goo test_inverse.rb doesn’t pass because of a bug of 4store, where the ?attributeproperty was always returning
resource not found
To reproduce the bug
Alternative and implemented solution for now
We will use UNION with binds for each attribute of a resource, as so
Include/general query
Aggregate query
unchanged
BNODE query
unchanged
FILTER query (
test/test_where#test_filter
)ORDER BY query (TODO)
unmapped query (
test/test_where#test_filter
)before
new it is the same as the general/includes a query
Equivalent predicates case
see *
def self*.expand_equivalent_predicates(query, eq_p)
Before the filter was in the OPTIONAL clause
Now we add a UNION clause for each equivalent predicate
Use case
With the use case of AGRVOC (2GB of triples) :