Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very suspicious condition evaluation and error 🪄 #1563

Closed
ExtReMLapin opened this issue Apr 23, 2024 · 9 comments
Closed

Very suspicious condition evaluation and error 🪄 #1563

ExtReMLapin opened this issue Apr 23, 2024 · 9 comments

Comments

@ExtReMLapin
Copy link
Contributor

ExtReMLapin commented Apr 23, 2024

Hello,

(TLDR at the bottom)

Following issue #1562 , same database, same environement.

The database is available in #1562

As a reminder here was the query ran by my coworker.

MATCH (n) WHERE ('NER' IN labels(n) OR 'THEME' IN labels(n)) AND (toLower(n.identity) CONTAINS 'technology' AND toLower(n.identity) CONTAINS 'science' AND toLower(n.identity) CONTAINS 'notable advancements' AND toLower(n.identity) CONTAINS '2020') RETURN n.identity as identity, ID(n) as id

This query can be simplified and fixed a little bit by using (n:NER OR n:THEME) instead of using labels()

Now this specific query errors:

MATCH (n) WHERE (n:NER OR n:THEME) AND
(
    toLower(n.identity) CONTAINS 'technology' AND
    toLower(n.identity) CONTAINS 'science' AND
    toLower(n.identity) CONTAINS 'notable advancements' AND
    toLower(n.identity) CONTAINS '2020'
)
RETURN n

What's really weird is that all NER and all THEME nodes should contain identity property, so it should not error, so maybe the issue reported in #1562 is another bug that the one we're metting here, as the two following querries works fine

MATCH (n:NER) WHERE
(
    toLower(n.identity) CONTAINS 'technology' AND
    toLower(n.identity) CONTAINS 'science' AND
    toLower(n.identity) CONTAINS 'notable advancements' AND
    toLower(n.identity) CONTAINS '2020'
)
RETURN n

(returns nothing but no error)

MATCH (n:THEME) WHERE
(
    toLower(n.identity) CONTAINS 'technology' AND
    toLower(n.identity) CONTAINS 'science' AND
    toLower(n.identity) CONTAINS 'notable advancements' AND
    toLower(n.identity) CONTAINS '2020'
)
RETURN n

(returns nothing but no error)

So the first conclusions we can have here is that individually running this query on NER and THEME nodes classes works but both at the same time isn't working ?

My conclusion here was that the Cypher engine is evaluating all the conditions even if the first one (the one that check if it's NER or a THEME) doesn't "pass".

Why ? because when I run this query :

MATCH (n) WHERE n.identity IS NULL return n

returns only IMAGE nodes where there is no identity field/property.

So from there, my conclusions was as stated before that it was evaluations all conditions ignoring the order.

TLDR START

Howhever, this is where is gets weird.

Remember this query ?

MATCH (n) WHERE (n:NER OR n:THEME) AND
(
    toLower(n.identity) CONTAINS 'technology' AND
    toLower(n.identity) CONTAINS 'science' AND
    toLower(n.identity) CONTAINS 'notable advancements' AND
    toLower(n.identity) CONTAINS '2020'
)
RETURN n

When, when you remove ANY of the 4 conditions between the parentheses, it doesn't error anymore.

@ExtReMLapin
Copy link
Contributor Author

ExtReMLapin commented Apr 23, 2024

firefox_GYdA4mwyaf.mp4
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]: com.arcadedb.exception.CommandParsingException: Error on executing Cypher query
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at com.arcadedb.cypher.query.CypherQueryEngine.command(CypherQueryEngine.java:80)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at com.arcadedb.database.LocalDatabase.command(LocalDatabase.java:1348)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at com.arcadedb.server.ServerDatabase.command(ServerDatabase.java:472)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at com.arcadedb.server.http.handler.PostCommandHandler.executeCommand(PostCommandHandler.java:131)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at com.arcadedb.server.http.handler.PostCommandHandler.execute(PostCommandHandler.java:110)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at com.arcadedb.server.http.handler.DatabaseAbstractHandler.execute(DatabaseAbstractHandler.java:100)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at com.arcadedb.server.http.handler.AbstractServerHttpHandler.handleRequest(AbstractServerHttpHandler.java:127)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at io.undertow.server.Connectors.executeRootHandler(Connectors.java:393)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:859)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2513)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1538)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at org.xnio.XnioWorker$WorkerThreadFactory$1$1.run(XnioWorker.java:1282)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]:         at java.base/java.lang.Thread.run(Thread.java:829)
avril 23 14:30:19 cfia-IDEXTEND server.sh[996]: Caused by: java.lang.NullPointerException

@ExtReMLapin ExtReMLapin changed the title Very suspicious condition evaluation and error Very suspicious condition evaluation and error 🪄 Apr 23, 2024
@ExtReMLapin
Copy link
Contributor Author

It really doesn't make any sense.

Execute the query 4 times, will error on the 4th time, not the 3 first times.

New query :

MATCH (n)
WHERE (n:NER OR n:THEME)
  AND (n.identity IS NOT NULL)
  AND (toLower(n.identity) CONTAINS 'science')
  AND (toLower(n.identity) CONTAINS 'notable advancements')
  AND (toLower(n.identity) CONTAINS '2020')
  AND (toLower(n.identity) CONTAINS '2021')
RETURN n

Video here where i'm just sending the same request OVER and OVER, sometime it fails, sometimes it doesn't.

putty_1eb0NBEAhI.mp4

@lvca
Copy link
Contributor

lvca commented Apr 23, 2024

I tried al your queries against the provided database but I don't see any error against the latest release. Have you tried it with the latest main branch with yesterday's fix? I'm assuming the same bug was responsible for this.

@lvca
Copy link
Contributor

lvca commented Apr 23, 2024

@ExtReMLapin
Copy link
Contributor Author

ExtReMLapin commented Apr 23, 2024

Thanks for the quick answer, I downloaded the latest snapshot arcadedb-package-24.5.1-20240423.041740-4.zip , didn't know there were automatic builds.

However the snapshot doesn't include cypher and studio, and merging with latest release doesn't help as it reports the release version, not the headless one, do you have a 90mb full release somewhere ?

Edit : i guess I can find them in the others folder, my bad, downloading them

@ExtReMLapin
Copy link
Contributor Author

As for yesterday's fix, I can confirm the issue is not happening anymore.
As for this issue and the very weird behavior cited above, I do not meet this issue anymore with the snapshots, weird.

I feel there was still an underlying bug, if the first condition was correctly evaluated (only filter NER and THEME) how could it error then ?

Don't you think there is something else going on ? The error in the first place should never have happened as all NER and THEME nodes have the "indentity" property.

@lvca
Copy link
Contributor

lvca commented Apr 23, 2024

The issue I fixed 2d ago was on a NPE in the cypher-gremlin project. It's possible the recent issue you experienced was the same. About the later question, the cypher module couldn't be that "smart" and execute all the conditions. Or maybe the conditions are not kept in a list but shuffled (in a set)? So the order can be pseudo-casual?

@ExtReMLapin
Copy link
Contributor Author

About the later question, the cypher module couldn't be that "smart" and execute all the conditions. Or maybe the conditions are not kept in a list but shuffled (in a set)? So the order can be pseudo-casual?

The later theory sound plausible as as I showed in the video, running the same query 10 times will not cause an error 10 times.
That sounds really problematic as the Short-circuit evaluation principle/rule doesn't seem to be respected.

Again, thank you for the time spent improving ArcadeDB

@lvca
Copy link
Contributor

lvca commented Apr 23, 2024

Unfortunately, the Cypher module is a separate module, and the OpenCypher team seems not interested in working on it. All we can do is little patches until somebody comes up with a complete rewrite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants