-
Notifications
You must be signed in to change notification settings - Fork 872
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instability using tinkerpop remote client #4131
Comments
Have you tied upgrading to a more recent hotfix version, like 2.0.8? |
We haven't tried upgrading yet. Would you recommend we upgrade both the server and client, or would just the client be sufficient? |
Upgrade both. |
@nmcintyre2014 you should have some additional stack traces in the log that should show in a better way the main cause of the error, by the way seems just a network problem. |
Hey guys, highly doubtful that this is a network problem as the client and server are both within the same VPC in AWS. We'll upgrade and provide more info. |
We are on 2.0.9 both client and server. Here is the stack trace we are getting on the client. Can you help explain what this means? 23:06:54.947 [http-bio-8080-exec-186] INFO i.g.p.c.RecordDataAccessObject - Executing SQL: SELECT * FROM [#18:65064,#18:5378,#18:15898,#18:58784,#18:10246,#18:3903,#18:14454,#18:79091,#18:8553,#18:2592,#18:13020,#18:67273,#18:7441,#18:8552,#18:57210,#18:79090,#18:11414,#18:13019,#18:7440,#18:67326,#18:14453,#18:3902,#18:65814,#18:105459,#18:106961,#18:108260,#18:109952,#18:121149,#18:127364,#18:129255,#18:129862,#18:131945,#18:132377,#18:133199,#18:133244,#18:134694,#18:136696,#18:137800,#18:139132,#18:139285,#18:140698,#18:110030,#18:108259,#18:131213,#18:122125,#18:6027,#18:16603,#18:137983,#18:141568,#17:1025504,#17:1014347,#17:1011254,#17:592589,#17:2404727,#17:2412061,#17:3195758,#17:3195799,#17:1054905,#17:1028615,#17:1014349,#17:832401,#17:981067,#17:600239,#17:831650,#17:1405913,#17:1741277,#17:763941,#17:876922,#17:1026578,#17:2415403,#17:1015497,#17:763840,#17:1016811,#17:1028018,#17:1490843,#17:1016845,#17:2415581,#17:784855,#17:2194143,#17:1961008,#17:994155,#17:918756,#17:786886,#17:1147297,#17:1055624,#17:1741416,#17:2192763,#17:1543173,#17:2647982,#17:3282029,#17:3282092,#17:3283731,#17:3283892,#17:50306,#18:295,#18:1572,] |
Hi @nricklin, |
Here's the entire class. It's a spring singleton and the init method is called once, upon application startup. package io.geobigdata.persistence.orient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.tinkerpop.blueprints.impls.orient.OrientGraph;
import com.tinkerpop.blueprints.impls.orient.OrientGraphFactory;
import com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx;
public class OrientDbClientProvider {
private static final Logger LOG = LoggerFactory.getLogger(OrientDbClientProvider.class);
private String databaseUrl = null;
private String user = null;
private String password = null;
private OrientGraphFactory orientGraphFactory = null;
private Boolean memory = Boolean.FALSE;
public OrientDbClientProvider(String dbURL) {
this.setDatabaseUrl(dbURL);
init();
}
public OrientDbClientProvider(){
}
public void init(){
if(databaseUrl.startsWith("memory")){
memory = Boolean.TRUE;
return;
}
// create the pool
if (getUser()==null) {
orientGraphFactory = new OrientGraphFactory(databaseUrl).setupPool(1, 50);
} else {
orientGraphFactory = new OrientGraphFactory(databaseUrl,getUser(), getPassword()).setupPool(1, 50);
}
}
// Standard transaction handle
public OrientGraph getOrientGraph(){
if(memory){
return new OrientGraph(databaseUrl);
} else {
LOG.debug("Getting orient graph. "+
orientGraphFactory.getAvailableInstancesInPool()+" instances available. "+
orientGraphFactory.getCreatedInstancesInPool()+" instances created.");
return orientGraphFactory.getTx();
}
}
public String getDatabaseUrl() {
return databaseUrl;
}
public void setDatabaseUrl(String databaseUrl) {
this.databaseUrl = databaseUrl;
}
public String getUser() {
return user;
}
public void setUser(String user) {
this.user = user;
}
public String getPassword() {
return password;
}
public void setPassword(String password) {
this.password = password;
}
} |
We see this sort of error under extremely high load. |
Ok, Thank you for the info, the OrientDbClientProvider look fine, i'll investigate inside the pool implementation to be sure that is everything right also under heavy load. |
hi @nmcintyre2014, i checked the code and also ran a small concurrency test to check the pool with 10000 threads and 1M of tasks, but i wasn't able to reproduce the problem, the only thing i found that could cause this error was if was call by hand an 'open' from a database taken from the pool, i added some check for avoid that it will be out with the 2.0.10, in the meanwhile can you check if you call some open on a database from the pool? thank you |
Issue closed because no answer. In case, please reopen/comment this issue. |
We are definitely seeing a similar issue (IllegalStateException thrown from the same spot), rarely and irreproducibly although it also seems to happen under high load. I've verified that we are properly calling shutdown() within a finally block on every graph usage and that this is properly calling close() on the underlying database. I do have a theory about what may be going on but I don't have the resources or understanding of the code to check it out. The database object in question is an instance of OPartitionedDatabasePool$DatabaseDocumentTxPolled. When close() is called It has a "status" field which is set to CLOSED before the db is returned to the pool, which is implemented as a Queue. What I think might be happening is that under load the pool is becoming empty. So if a second thread comes in and attempts to acquire the database right after it has been added, the first thread's update to the status field may not yet be visible. In short, I think the status field needs to be marked volatile. |
It violates JMM but I think we need investigate why update to database are not visible |
The issue has been closed because no response has been received after a while. In case the issue is still valid, please reopen it or write a comment that explains why it's still valid and a way to reproduce it. Thank you. |
Hi seems I found the reason if remote storage is closed when you call close on database instance then status of database instance itself is not updated as closed, so you put open database in pool. I update code accordingly and push the change if you still be able to reproduce issue please reopen it with updated stack trace. |
(cherry picked from commit 1ab0fbe)
(cherry picked from commit 1ab0fbe)
We're experiencing some instability in our orient remote client application. We're not sure of the root cause, but when it happens, our application is unable to connect to orientdb. The orientdb server seems healthy. The excpetion reported in our logs is: Caused by: com.orientechnologies.orient.core.exception.OStorageException: Error on executing command: sql.SELECT * FROM LandsatAcquisition WHERE timestamp >= date('2015-05-10 17:41:02.000','yyyy-MM-dd HH:mm:ss.SSS')
at com.orientechnologies.orient.client.remote.OStorageRemote.handleException(OStorageRemote.java:1501) ~[orientdb-client-2.0.1.jar:2.0.1]
at com.orientechnologies.orient.client.remote.OStorageRemote.command(OStorageRemote.java:1059) ~[orientdb-client-2.0.1.jar:2.0.1]
at com.orientechnologies.orient.client.remote.OStorageRemoteThread.command(OStorageRemoteThread.java:436) ~[orientdb-client-2.0.1.jar:2.0.1]
at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63) ~[orientdb-core-2.0.1.jar:2.0.1]
at com.tinkerpop.blueprints.impls.orient.OrientGraphCommand.execute(OrientGraphCommand.java:48) ~[orientdb-graphdb-2.0.1.jar:2.0.1]
at io.geobigdata.persistence.catalog.RecordDataAccessObject.executeTypedSearch(RecordDataAccessObject.java:285) ~[persistence-catalog-1.0-SNAPSHOT.jar:na]
at io.geobigdata.persistence.catalog.RecordDataAccessObject.search(RecordDataAccessObject.java:412) ~[persistence-catalog-1.0-SNAPSHOT.jar:na]
at io.geobigdata.services.catalog.SearchService.search(SearchService.java:34) ~[SearchService.class:na]
at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_75]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_75]
at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:181) ~[cxf-core-3.0.3.jar:3.0.3]
at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:97) ~[cxf-core-3.0.3.jar:3.0.3]
... 36 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: -1
at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[na:1.7.0_75]
at java.util.ArrayList.get(ArrayList.java:411) ~[na:1.7.0_75]
at com.orientechnologies.orient.client.remote.OStorageRemote.useNewServerURL(OStorageRemote.java:1687) ~[orientdb-client-2.0.1.jar:2.0.1]
at com.orientechnologies.orient.client.remote.OStorageRemote.getAvailableNetwork(OStorageRemote.java:1840) ~[orientdb-client-2.0.1.jar:2.0.1]
at com.orientechnologies.orient.client.remote.OStorageRemote.beginRequest(OStorageRemote.java:1806) ~[orientdb-client-2.0.1.jar:2.0.1]
at com.orientechnologies.orient.client.remote.OStorageRemote.command(OStorageRemote.java:970) ~[orientdb-client-2.0.1.jar:2.0.1]Any suggestions as to what is causing this?
The text was updated successfully, but these errors were encountered: