Distributed: Edges do not replicate #5249

Closed
nethertek opened this issue Oct 30, 2015 · 14 comments

@nethertek

Edges do not replicate in distributed mode. The following works only when a single node is online. When other nodes are online, edges do not replicate and the local vertex table is not updated.

Server 1:
insert into Post (content, timestamp) values ('test', 1)
Inserted record 'Post#28:0{content:test,timestamp:1} v1' in 0.155000 sec(s).

Server 1:
select from Post
----+-----+------+-------+---------
#   |@RID |@CLASS|content|timestamp
----+-----+------+-------+---------
0   |#28:0|Post  |test   |1        
----+-----+------+-------+---------

Server 2:
select from Post
----+-----+------+-------+---------
#   |@RID |@CLASS|content|timestamp
----+-----+------+-------+---------
0   |#28:0|Post  |test   |1        
----+-----+------+-------+---------

Server 1:
create edge Own from #32:0 to #28:0
Created edge '[Own#51:0{out:#32:0,in:#28:0} v1]' in 0.018000 sec(s).

Server 1:
select from Own
----+-----+------+-----+-----
#   |@RID |@CLASS|out  |in   
----+-----+------+-----+-----
0   |#51:0|Own   |#32:0|#28:0
----+-----+------+-----+-----

select from Post
----+-----+------+-------+---------
#   |@RID |@CLASS|content|timestamp
----+-----+------+-------+---------
0   |#28:0|Post  |test   |1         <-- NO LINK
----+-----+------+-------+---------

select from #32:0
----+-----+------+-------------+------------+----------+----------------
#   |@RID |@CLASS|user_cn      |user_country|user_flags|user_last_active
----+-----+------+-------------+------------+----------+----------------
0   |#32:0|User  |xxxxx        |CH          |262168    |1446056764        <-- NO LINK
----+-----+------+-------------+------------+----------+----------------

Server 2:
select from Own
0 item(s) found. Query executed in 0.002 sec(s). <-- NO RECORD

select from Post
----+-----+------+-------+---------
#   |@RID |@CLASS|content|timestamp
----+-----+------+-------+---------
0   |#28:0|Post  |test   |1         <-- NO LINK
----+-----+------+-------+---------

select from #32:0
----+-----+------+-------------+------------+----------+----------------
#   |@RID |@CLASS|user_cn      |user_country|user_flags|user_last_active
----+-----+------+-------------+------------+----------+----------------
0   |#32:0|User  |xxxxx        |CH          |262168    |1446056764        <-- NO LINK
----+-----+------+-------------+------------+----------+----------------
@lvca
Member

lvca commented Oct 30, 2015

Which OrientDB release?

@nethertek
Author

2.1.3, 2.1.4 & 2.1.x branch

@lvca
Member

lvca commented Nov 3, 2015

I just created this test case (c5250ca) and everything seems fine. Could you please take a look at how it differs from your use case?

@nethertek
Author

The difference is the executionMode setting: we use asynchronous. In asynchronous mode, it does not work.

@nethertek
Author

Are you able to reproduce in asynchronous mode?

@lvca
Member

lvca commented Nov 20, 2015

Reproduced the error in async mode. Working on it.

lvca added a commit that referenced this issue Nov 20, 2015
@lvca
Member

lvca commented Nov 20, 2015

The problem was that the OConcurrentModificationException was thrown during the async call, so it could not be caught.

Starting from the latest v2.1.6-SNAPSHOT, it is possible to catch command events during asynchronous replication, thanks to the following methods of OCommandSQL:

  • onAsyncReplicationOk(), to catch the event when the asynchronous replication succeeds
  • onAsyncReplicationError(), to catch the event when the asynchronous replication returns an error

Example that retries up to 3 times in case of a concurrent modification exception when creating edges:

g.command(new OCommandSQL("create edge Own from (select from User) to (select from Post)")
  // Called when asynchronous replication fails: decide whether to retry.
  .onAsyncReplicationError(new OAsyncReplicationError() {
    @Override
    public ACTION onAsyncReplicationError(Throwable iException, int iRetry) {
      System.err.println("Error, retrying...");
      return iException instanceof ONeedRetryException && iRetry <= 3 ? ACTION.RETRY : ACTION.IGNORE;
    }
  })
  // Called when asynchronous replication succeeds.
  .onAsyncReplicationOk(new OAsyncReplicationOk() {
    @Override
    public void onAsyncReplicationOk() {
      System.out.println("OK");
    }
  })
).execute();

For more information: https://github.com/orientechnologies/orientdb-docs/blob/master/Distributed-Configuration.md#asynchronous-replication-mode.
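
As a rough illustration of why an exception raised during an asynchronous call never reaches the caller's try/catch, here is a minimal sketch in plain Java. It does not use OrientDB at all: AsyncErrorDemo, replicateAsync() and the error message are hypothetical stand-ins, and CompletableFuture is used only to mimic work that fails on another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for an asynchronous replication step; not OrientDB code.
class AsyncErrorDemo {

  // Starts a task that always fails on another thread, like a remote version conflict.
  static CompletableFuture<Void> replicateAsync() {
    return CompletableFuture.runAsync(() -> {
      throw new IllegalStateException("version conflict on remote node");
    });
  }

  // Fire-and-forget: returns true only if the caller's try/catch saw the failure.
  static boolean callerSeesError() {
    try {
      replicateAsync(); // returns immediately; the failure happens later, elsewhere
      return false;
    } catch (RuntimeException e) {
      return true;
    }
  }

  // With a callback attached, the failure is delivered to the callback instead.
  static String callbackSeesError() {
    AtomicReference<String> seen = new AtomicReference<>("none");
    replicateAsync()
        .exceptionally(t -> {
          // The failure may arrive wrapped in a CompletionException; unwrap if so.
          Throwable cause = t.getCause() != null ? t.getCause() : t;
          seen.set(cause.getMessage());
          return null;
        })
        .join(); // wait only so the demo is deterministic
    return seen.get();
  }

  public static void main(String[] args) {
    System.out.println("caller caught it: " + callerSeesError());
    System.out.println("callback saw: " + callbackSeesError());
  }
}
```

The fire-and-forget call returns normally even though the task fails, which is the situation the error and success callbacks above are meant to address.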

@nethertek
Author

This still doesn't work reliably.

Edges are created and replicated, but the links are not created/updated.

@lvca
Member

lvca commented Nov 23, 2015

@nethertek what do you mean?

It's normal in an optimistic database to get concurrent modification exceptions. All you need to do is handle them correctly. With synchronous distributed transactions that is easier; with asynchronous replication you can do it with this API.
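
A minimal sketch of what "handling them correctly" usually means under optimistic concurrency: re-read the record and retry when a version check fails. This is plain Java, not OrientDB code. VersionedRecord, ConflictException (a stand-in for ONeedRetryException) and incrementWithRetry() are hypothetical names invented for the illustration, and the Runnable hook exists only to simulate a concurrent writer.

```java
// Illustrative only: VersionedRecord and ConflictException are hypothetical
// stand-ins for a versioned database record and a retryable conflict error.
class OptimisticRetryDemo {

  static class ConflictException extends RuntimeException {
    ConflictException(String msg) { super(msg); }
  }

  // A record whose update succeeds only if the caller read the current version.
  static class VersionedRecord {
    private int version = 1;
    private int value = 0;

    // Returns {version, value} as a consistent snapshot.
    synchronized int[] read() { return new int[] {version, value}; }

    synchronized void update(int expectedVersion, int newValue) {
      if (expectedVersion != version) {
        throw new ConflictException("expected v" + expectedVersion + " but found v" + version);
      }
      value = newValue;
      version++;
    }
  }

  // Re-reads and retries the increment up to maxRetries times after a conflict.
  // Returns the number of retries that were needed.
  static int incrementWithRetry(VersionedRecord rec, int maxRetries, Runnable betweenReadAndWrite) {
    for (int attempt = 0; ; attempt++) {
      int[] snapshot = rec.read();
      betweenReadAndWrite.run(); // hook that lets another writer interfere
      try {
        rec.update(snapshot[0], snapshot[1] + 1);
        return attempt;
      } catch (ConflictException e) {
        if (attempt >= maxRetries) throw e; // give up and surface the conflict
      }
    }
  }

  public static void main(String[] args) {
    VersionedRecord rec = new VersionedRecord();
    boolean[] interfered = {false};
    // Simulate one concurrent writer that sneaks in during the first attempt.
    int retries = incrementWithRetry(rec, 3, () -> {
      if (!interfered[0]) {
        interfered[0] = true;
        int[] s = rec.read();
        rec.update(s[0], s[1] + 10);
      }
    });
    System.out.println("retries=" + retries + " value=" + rec.read()[1]);
  }
}
```

The retry bound matters: without it, a record under constant contention could keep the caller looping indefinitely.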

@nethertek
Author

I mean that the current distributed implementation is flawed and it doesn't work.

The binary interface does not throw any exception, so to our application it has been accepted and executed, but in reality it was not. Running the query again does not solve anything. Instead all replication seems to stop.

@lvca
Member

lvca commented Nov 24, 2015

This isn't supported through the binary protocol yet, but it works in embedded mode. We could quickly support this new API in SQL batch execution for async replication. Would using SQL batch be OK for your case?

@lvca lvca reopened this Nov 24, 2015
@lvca lvca modified the milestones: 2.1.6, 2.1.x (next hotfix) Nov 24, 2015
@nethertek
Author

IMHO, this process should not be delegated to the application, but rather handled by the database itself.

@lvca
Member

lvca commented May 12, 2016

See https://github.com/orientechnologies/orientdb-docs/blob/master/Concurrency.md#how-does-it-work for why in those cases it's up to the application to manage this situation.

@lvca lvca added question and removed bug labels May 12, 2016
@nethertek
Author

We removed OrientDB from our project because of such issues.
