neo4j.exceptions.ClientError: No write operations are allowed directly on this database. Writes must pass through the leader. The role of this server is: FOLLOWER #335

robertlagrant · 2018-05-15T08:45:57Z

A funny one:

We have a 3-node causal-clustered neo4j setup
I've changed the routing protocol to be bolt+routing
We're using Neomodel with @db.transaction

We're getting intermittent errors as per the issue title - i.e. it's trying to write to a follower node, and presumably bolt+routing isn't sending the transaction to the leader.
Am I missing something? Is it that if the first interaction with the database is a read, that it opens the transaction on a follower node? Can I force it to the leader for every transaction?


robertlagrant · 2019-01-04T22:26:25Z

We are still getting this issue, even when forcing a write transaction.

I've created a repro case: https://github.com/robertlagrant/neo4j-cluster-failure. Please test.

aanastasiou · 2019-01-09T10:01:10Z

@robertlagrant Would it be possible to share a little bit more information on your cluster configuration? Is that supposed to be 3 CORE servers? There are some conditions where what you describe might be the intended behaviour at least as far as RAFT is concerned (i.e. see this). I am trying to see how much of this can be dealt with at the level of neomodel and how much of this is external to it.

mvanderkroon · 2019-01-09T10:55:56Z

Please see https://neo4j.com/docs/ogm-manual/current/reference/ (section 3.14.1.6. Retry mechanisms).

For critical applications, these failures have to be anticipated, and also managed at the architecture or application level. Even if the driver handles some low level retries, it is not always enough in case of instability, as an application may involve complex business logic, and require coarse grained units of work.

In other words, the driver does not deal with higher level failures (such as cluster disconnects). In our use cases we have worked around this by adding custom retry logic to our business logic. See very basic example down below (adding jitter and exponential backoff obviously highly recommended).

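import time

_MAX_RETRY_SECONDS = 30  # assumed retry budget; tune for your workload

start = time.time()
while True:
    try:
        # Pass the function itself (not its result); the driver calls it
        # with a transaction and routes the work to the leader.
        session.write_transaction(do_write)
        break
    except Exception:
        # Out of time: re-raise the most recent error instead of retrying.
        if time.time() - start > _MAX_RETRY_SECONDS:
            raise
        time.sleep(1)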
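
For the jitter and exponential backoff mentioned above, here is a minimal sketch. The helper name `retry_with_backoff` and all of its parameters are made up for illustration; nothing here is neomodel or driver API. It retries any callable, doubling the delay cap on each failure and sleeping a random amount up to that cap ("full jitter"), until a time budget runs out.

```python
import random
import time


def retry_with_backoff(work, max_seconds=30.0, base_delay=0.1, max_delay=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call work() until it succeeds or max_seconds has elapsed.

    Hypothetical helper: sleep and clock are injectable so the loop can be
    exercised without really waiting.
    """
    start = clock()
    attempt = 0
    while True:
        try:
            return work()
        except Exception:
            # Budget exhausted: re-raise the error from the last attempt.
            if clock() - start > max_seconds:
                raise
            # Exponential backoff capped at max_delay, with full jitter.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
            attempt += 1
```

With the `session` and `do_write` names from the loop above, usage would look like `retry_with_backoff(lambda: session.write_transaction(do_write))`. In real code you would probably catch only the transient driver errors rather than every `Exception`.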

aanastasiou · 2019-01-09T12:26:23Z

@mvanderkroon Thank you very much, sounds like a modification is required at this point (?).

mvanderkroon · 2019-01-09T13:13:39Z

@aanastasiou I believe so. I have forked the repo, made the necessary changes and would be quite happy to issue a pull request. Should I point it to your master branch?

aanastasiou · 2019-01-09T13:31:42Z

@mvanderkroon Thank you very much and I do not see why not. It should be sent as a pull request to the main neomodel repo. All the best.

robertlagrant · 2019-01-14T00:29:08Z

@aanastasiou sure - it's a 3 core server cluster. There are also 2 read replicas, but they don't really feature in this situation as far as I'm aware.

aanastasiou · 2019-01-14T15:17:46Z

@robertlagrant Thank you for your response, I think that the discussion with @mvanderkroon on the pull request was very informative about the specifics.

kant111 · 2019-08-02T07:18:16Z

Why can't a follower accept writes?

robertlagrant · 2020-03-11T09:51:13Z

@kant111 because that's not how Neo4J works.

ayoubelmimouni · 2020-12-13T16:51:24Z

Using a connection URL of bolt+routing:// makes the session cluster aware, whereas bolt:// knows nothing about the other members of the cluster.
However, the bolt+routing:// connection URL is only half the story. The other half is using session.readTransaction() and session.writeTransaction(), each of which lets you pass the Cypher to be executed. If you send a Cypher statement through session.writeTransaction() over a bolt+routing:// connection then, regardless of which member you connected to, the statement is routed to the LEADER, because the transaction is declared as a write.
It is important to note that Neo4j does not parse the Cypher statement to detect whether it is a read or a write.
So one could actually issue session.readTransaction("create (n:Person {id:1})") and, because it is declared as a read transaction, it would be routed to a FOLLOWER and then fail, since only the LEADER can perform writes.
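
A rough Python version of the same point, as a sketch rather than a reference. The camelCase names above come from other drivers; the Python driver spells them `session.write_transaction()` and `session.read_transaction()` in the 1.x and 4.x lines (5.x renames them to `execute_write` / `execute_read`). The function names, the URI, and the credentials below are placeholders I made up.

```python
CREATE_PERSON = "CREATE (n:Person {id: $id})"


def create_person(tx, person_id):
    # Transaction function: the driver passes in a managed transaction.
    tx.run(CREATE_PERSON, id=person_id)


def add_person(session, person_id):
    # Declared as a write, so a routing driver sends it to the LEADER.
    return session.write_transaction(create_person, person_id)


def add_person_wrong(session, person_id):
    # Declared as a read: the Cypher is not inspected, so this may be
    # routed to a FOLLOWER, where the CREATE is rejected.
    return session.read_transaction(create_person, person_id)


def main(uri="bolt+routing://localhost:7687", user="neo4j", password="secret"):
    # Placeholder connection details; needs the neo4j package and a cluster.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            add_person(session, 1)
    finally:
        driver.close()
```

Note that 4.x and later drivers use the `neo4j://` scheme for routing instead of `bolt+routing://`.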

gwvandesteeg · 2022-12-01T08:43:24Z

Fun fact (tested on Neo4J 4.0.7)

Adding a trigger can only be done on the node in the cluster that is the LEADER of both the DB you are adding the trigger to AND the system database (might need the neo4j DB as well, wasn't sure, but we don't use it).

The example below is me trying to add a trigger whilst connected to the node neo4j-core-2 via the bolt connector

neo4j@nextvoice> call dbms.cluster.overview();
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                     | addresses                                                                                                                | databases                                                      | groups |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "53f95bdf-0c86-4826-8244-4ad4f7963592" | ["bolt://neo4j-core-2.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-2.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "LEADER", neo4j: "FOLLOWER", system: "FOLLOWER"}   | []     |
| "6b74a7fa-626d-4994-af32-1432b9e8b0c4" | ["bolt://neo4j-core-0.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-0.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "FOLLOWER", neo4j: "LEADER", system: "LEADER"}     | []     |
| "775b45fe-3ae3-466d-9ad2-7b8e5ae82e0b" | ["bolt://neo4j-core-1.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-1.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "FOLLOWER", neo4j: "FOLLOWER", system: "FOLLOWER"} | []     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

3 rows available after 6 ms, consumed after another 1 ms
neo4j@nextvoice> CALL apoc.trigger.add(
                 "assertExtensionNumberValidNumericalString",
                 "WITH '^([0-9]{2,5})$' AS extNumStrRegex
                 MATCH (e:Extension)
                 CALL apoc.util.validate((NOT e.number =~ extNumStrRegex), '%s not a valid extension number', [e.number])
                 RETURN NULL",
                 { phase: 'before' }
                 );
No write operations are allowed directly on this database. Writes must pass through the leader. The role of this server is: FOLLOWER

After a bunch of killing nodes and waiting for them to come back to the desired state, and connected to neo4j-core-0 via the bolt connector

neo4j@nextvoice> call dbms.cluster.overview();
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                     | addresses                                                                                                                | databases                                                      | groups |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "53f95bdf-0c86-4826-8244-4ad4f7963592" | ["bolt://neo4j-core-2.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-2.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "FOLLOWER", neo4j: "FOLLOWER", system: "FOLLOWER"} | []     |
| "6b74a7fa-626d-4994-af32-1432b9e8b0c4" | ["bolt://neo4j-core-0.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-0.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "LEADER", neo4j: "LEADER", system: "LEADER"}       | []     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

2 rows available after 0 ms, consumed after another 1 ms
neo4j@nextvoice> CALL apoc.trigger.add(
                 "assertExtensionNumberValidNumericalString",
                 "WITH '^([0-9]{2,5})$' AS extNumStrRegex
                 MATCH (e:Extension)
                 CALL apoc.util.validate((NOT e.number =~ extNumStrRegex), '%s not a valid extension number', [e.number])
                 RETURN NULL",
                 { phase: 'before' }
                 );
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name                                        | query                                                                                                                                                                              | selector          | params | installed | paused |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "assertExtensionNumberValidNumericalString" | "WITH '^([0-9]{2,5})$' AS extNumStrRegex
MATCH (e:Extension)
CALL apoc.util.validate((NOT e.number =~ extNumStrRegex), '%s not a valid extension number', [e.number])
RETURN NULL" | {phase: "before"} | {}     | TRUE      | FALSE  |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 row available after 10 ms, consumed after another 30 ms
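
If you want to check this up front rather than by trial and error, something like the sketch below could pick out the member to connect to. It assumes you have already turned the rows of `dbms.cluster.overview()` into dicts; the helper name `members_leading_all` and the `member` key are made up, and the roles are copied from the two overviews above. It only encodes the behaviour observed here on 4.0.7.

```python
def members_leading_all(overview_rows, databases):
    """Return the members whose role is LEADER for every database listed."""
    return [
        row["member"]
        for row in overview_rows
        if all(row["databases"].get(db) == "LEADER" for db in databases)
    ]


# Roles from the first overview: no single member leads both databases.
before = [
    {"member": "neo4j-core-2",
     "databases": {"nextvoice": "LEADER", "neo4j": "FOLLOWER", "system": "FOLLOWER"}},
    {"member": "neo4j-core-0",
     "databases": {"nextvoice": "FOLLOWER", "neo4j": "LEADER", "system": "LEADER"}},
    {"member": "neo4j-core-1",
     "databases": {"nextvoice": "FOLLOWER", "neo4j": "FOLLOWER", "system": "FOLLOWER"}},
]

# Roles from the second overview: neo4j-core-0 leads everything.
after = [
    {"member": "neo4j-core-2",
     "databases": {"nextvoice": "FOLLOWER", "neo4j": "FOLLOWER", "system": "FOLLOWER"}},
    {"member": "neo4j-core-0",
     "databases": {"nextvoice": "LEADER", "neo4j": "LEADER", "system": "LEADER"}},
]

print(members_leading_all(before, ["nextvoice", "system"]))  # []
print(members_leading_all(after, ["nextvoice", "system"]))   # ['neo4j-core-0']
```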

robertlagrant mentioned this issue May 16, 2018

Explicit write transaction mode #337

Merged

robertlagrant mentioned this issue Jan 8, 2019

Upgrading to version 3.3.0 causes a KeyError on initialising connection to neo4j DB #378

Closed

mvanderkroon mentioned this issue Jan 9, 2019

[WIP] Feature/neo retry logic #398

Closed

P1zz4br0etch3n mentioned this issue Jan 22, 2019

NotALeaderError in write_transaction() neo4j/neo4j-python-driver#276

Closed

aanastasiou added the enhancement label Mar 7, 2019

aanastasiou mentioned this issue Nov 17, 2019

Add support for bookmarks in transactions #478

Merged
