Server error on "select from Class group by field" with 1,000,000+ records #3144

Closed
RCNTEC opened this issue Dec 8, 2014 · 10 comments


RCNTEC commented Dec 8, 2014

Hi!
I have a class containing 1,000,000+ records with the properties start_of_life, end_of_life, and hist_id. hist_id is shared among the different versions of a record. The query "select min(start_of_life), max(end_of_life), hist_id from SomeVertex group by hist_id" causes the server to throw an exception (SEVERE Internal server error: java.net.SocketException: Socket closed [ONetworkProtocolHttpDb]).

OrientDB versions: 2.0-M2, 2.0-M3

Any ideas or advice?
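
For readers following along, here is a minimal Java sketch of what the query in the report above asks for: a single pass over the records that keeps a running min(start_of_life) and max(end_of_life) per hist_id. It is only an illustration of the technique, not OrientDB's executor. The class names `GroupBySketch` and `Range` are hypothetical, and the properties are assumed to be numeric (for example, timestamps), which the report does not state.

```java
import java.util.HashMap;
import java.util.Map;

public class GroupBySketch {
    /** Running aggregate for one hist_id group (hypothetical helper type). */
    static final class Range {
        long minStart;
        long maxEnd;

        Range(long start, long end) {
            this.minStart = start;
            this.maxEnd = end;
        }
    }

    /**
     * Each row is {start_of_life, end_of_life, hist_id}.
     * Assumption: all three properties fit in a long.
     */
    static Map<Long, Range> aggregate(long[][] rows) {
        Map<Long, Range> groups = new HashMap<>();
        for (long[] row : rows) {
            long start = row[0];
            long end = row[1];
            long histId = row[2];
            Range group = groups.get(histId);
            if (group == null) {
                // First record seen for this hist_id: start a new group.
                groups.put(histId, new Range(start, end));
            } else {
                // Later version of the same record: widen the range.
                group.minStart = Math.min(group.minStart, start);
                group.maxEnd = Math.max(group.maxEnd, end);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        long[][] rows = {{10, 20, 1}, {5, 15, 1}, {30, 40, 2}};
        for (Map.Entry<Long, Range> e : aggregate(rows).entrySet()) {
            Range r = e.getValue();
            System.out.println(e.getKey() + " " + r.minStart + " " + r.maxEnd);
        }
    }
}
```

In this sketch the memory that must stay live grows with the number of distinct hist_id values, not with the number of records, so the size of the heap matters more than the amount of free RAM on the machine.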

Member

lvca commented Dec 8, 2014

Is it possible that it times out? How long does it take before the error is returned? Or this could be a RAM problem, since group by can be expensive in RAM. What's your configuration?

lvca self-assigned this Dec 8, 2014
lvca added the question label Dec 8, 2014
Author

RCNTEC commented Dec 9, 2014

It took about 60 seconds before the server returned the error. My config file has the default settings. Here it is:

<orient-server>
    <handlers>
        <handler class="com.orientechnologies.orient.graph.handler.OGraphServerHandler">
            <parameters>
                <parameter value="true" name="enabled"/>
                <parameter value="50" name="graph.pool.max"/>
            </parameters>
        </handler>
        <handler class="com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin">
            <parameters>
                <parameter value="${distributed}" name="enabled"/>
                <parameter value="${ORIENTDB_HOME}/config/default-distributed-db-config.json" name="configuration.db.default"/>
                <parameter value="${ORIENTDB_HOME}/config/hazelcast.xml" name="configuration.hazelcast"/>
                <parameter value="com.orientechnologies.orient.server.distributed.conflict.ODefaultReplicationConflictResolver" name="conflict.resolver.impl"/>
            </parameters>
        </handler>
        <handler class="com.orientechnologies.orient.server.handler.OJMXPlugin">
            <parameters>
                <parameter value="false" name="enabled"/>
                <parameter value="true" name="profilerManaged"/>
            </parameters>
        </handler>
        <handler class="com.orientechnologies.orient.server.handler.OAutomaticBackup">
            <parameters>
                <parameter value="false" name="enabled"/>
                <parameter value="4h" name="delay"/>
                <parameter value="backup" name="target.directory"/>
                <parameter value="${DBNAME}-${DATE:yyyyMMddHHmmss}.zip" name="target.fileName"/>
                <parameter value="9" name="compressionLevel"/>
                <parameter value="1048576" name="bufferSize"/>
                <parameter value="" name="db.include"/>
                <parameter value="" name="db.exclude"/>
            </parameters>
        </handler>
        <handler class="com.orientechnologies.orient.server.handler.OServerSideScriptInterpreter">
            <parameters>
                <parameter value="true" name="enabled"/>
                <parameter value="SQL" name="allowedLanguages"/>
            </parameters>
        </handler>
    </handlers>
    <network>
        <sockets>
            <socket implementation="com.orientechnologies.orient.server.network.OServerSSLSocketFactory" name="ssl">
                <parameters>
                    <parameter value="false" name="network.ssl.clientAuth"/>
                    <parameter value="config/cert/orientdb.ks" name="network.ssl.keyStore"/>
                    <parameter value="password" name="network.ssl.keyStorePassword"/>
                    <parameter value="config/cert/orientdb.ks" name="network.ssl.trustStore"/>
                    <parameter value="password" name="network.ssl.trustStorePassword"/>
                </parameters>
            </socket>
            <socket implementation="com.orientechnologies.orient.server.network.OServerSSLSocketFactory" name="https">
                <parameters>
                    <parameter value="false" name="network.ssl.clientAuth"/>
                    <parameter value="config/cert/orientdb.ks" name="network.ssl.keyStore"/>
                    <parameter value="password" name="network.ssl.keyStorePassword"/>
                    <parameter value="config/cert/orientdb.ks" name="network.ssl.trustStore"/>
                    <parameter value="password" name="network.ssl.trustStorePassword"/>
                </parameters>
            </socket>
        </sockets>
        <protocols>
            <protocol implementation="com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary" name="binary"/>
            <protocol implementation="com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpDb" name="http"/>
        </protocols>
        <listeners>
            <listener protocol="binary" socket="default" port-range="2424-2430" ip-address="0.0.0.0"/>
            <listener protocol="http" socket="default" port-range="2480-2490" ip-address="0.0.0.0">
                <commands>
                    <command implementation="com.orientechnologies.orient.server.network.protocol.http.command.get.OServerCommandGetStaticContent" pattern="GET|www GET|studio/ GET| GET|*.htm GET|*.html GET|*.xml GET|*.jpeg GET|*.jpg GET|*.png GET|*.gif GET|*.js GET|*.css GET|*.swf GET|*.ico GET|*.txt GET|*.otf GET|*.pjs GET|*.svg GET|*.json GET|*.woff GET|*.ttf GET|*.svgz" stateful="false">
                        <parameters>
                            <entry value="Cache-Control: no-cache, no-store, max-age=0, must-revalidate\r\nPragma: no-cache" name="http.cache:*.htm *.html"/>
                            <entry value="Cache-Control: max-age=120" name="http.cache:default"/>
                        </parameters>
                    </command>
                    <command implementation="com.orientechnologies.orient.graph.server.command.OServerCommandGetGephi" pattern="GET|gephi/*" stateful="false"/>
                </commands>
                <parameters>
                    <parameter value="utf-8" name="network.http.charset"/>
                </parameters>
            </listener>
        </listeners>
    </network>
    <storages/>
    <users>
        <user resources="*" password="******" name="root"/>
        <user resources="connect,server.listDatabases,server.dblist" password="guest" name="guest"/>
    </users>
    <properties>
        <entry value="1" name="db.pool.min"/>
        <entry value="50" name="db.pool.max"/>
        <entry value="true" name="profiler.enabled"/>
        <entry value="info" name="log.console.level"/>
        <entry value="fine" name="log.file.level"/>
    </properties>
</orient-server>

I reran the query while monitoring RAM usage. There were 22 GB of RAM available, and again the query ended with an exception:

[OServerPluginManager]Error on fetching record during browsing. The record has been skipped
com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #33:563681 (cluster: actionemployeerelation)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1530)
at com.orientechnologies.orient.core.tx.OTransactionNoTx.loadRecord(OTransactionNoTx.java:79)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:1367)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:123)
at com.orientechnologies.orient.core.iterator.OIdentifiableIterator.readCurrentRecord(OIdentifiableIterator.java:278)
at com.orientechnologies.orient.core.iterator.ORecordIteratorClusters.hasNext(ORecordIteratorClusters.java:147)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.fetchFromTarget(OCommandExecutorSQLSelect.java:1226)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.executeSearch(OCommandExecutorSQLSelect.java:409)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.execute(OCommandExecutorSQLSelect.java:367)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.execute(OCommandExecutorSQLDelegate.java:64)
at com.orientechnologies.orient.core.storage.OStorageEmbedded.executeCommand(OStorageEmbedded.java:86)
at com.orientechnologies.orient.core.storage.OStorageEmbedded.command(OStorageEmbedded.java:75)
at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63)
at com.orientechnologies.orient.server.network.protocol.http.command.post.OServerCommandPostCommand.execute(OServerCommandPostCommand.java:77)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.service(ONetworkProtocolHttpAbstract.java:182)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.execute(ONetworkProtocolHttpAbstract.java:581)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:65)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.WeakHashMap.newTable(Unknown Source)
at java.util.WeakHashMap.<init>(Unknown Source)
at java.util.WeakHashMap.<init>(Unknown Source)
at com.orientechnologies.orient.core.record.ORecordAbstract.<init>(ORecordAbstract.java:62)
at com.orientechnologies.orient.core.record.impl.ODocument.<init>(ODocument.java:99)
at com.orientechnologies.orient.core.record.ORecordFactoryManager$1.newRecord(ORecordFactoryManager.java:55)
at com.orientechnologies.orient.core.record.ORecordFactoryManager.newInstance(ORecordFactoryManager.java:95)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1505)

RCNTEC closed this as completed Dec 9, 2014
RCNTEC reopened this Dec 9, 2014
Member

lvca commented Dec 9, 2014

Could you please try the same query from the console against the same database, but opened as plocal? That way you bypass the server and its timeouts.

Author

RCNTEC commented Dec 9, 2014

I tried what you suggested and got exactly the same error as I posted above.

Error on fetching record during browsing. The record has been skipped
com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #33:569405 (cluster: actionemployeerelation)
.....
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
.....

Member

lvca commented Dec 9, 2014

Please could you try executing this query?

select set(hist_id).size() from SomeVertex

Author

RCNTEC commented Dec 9, 2014

It returned the number of distinct hist_ids: 315305
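
As a rough illustration of what that diagnostic query measures, the same count can be sketched in plain Java by collecting the hist_id values into a set and taking its size. This is not how OrientDB implements set(), and `DistinctCount` is a hypothetical name; hist_id is assumed to be numeric here.

```java
import java.util.HashSet;
import java.util.Set;

public class DistinctCount {
    /** Counts distinct hist_id values, like set(hist_id).size() in the query above. */
    static int countDistinct(long[] histIds) {
        Set<Long> seen = new HashSet<>();
        for (long id : histIds) {
            seen.add(id); // duplicates are ignored by the set
        }
        return seen.size();
    }

    public static void main(String[] args) {
        long[] histIds = {7, 7, 8, 9, 9, 9};
        System.out.println(countDistinct(histIds));
    }
}
```

With 315,305 distinct values over 1,000,000+ records, each hist_id has roughly three versions on average, so the group by has to keep a few hundred thousand groups in the heap at once.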

lvca assigned luigidellaquila and unassigned lvca Dec 13, 2014
lvca added this to the 2.0 Final milestone Dec 13, 2014
Member

lvca commented Dec 19, 2014

Could you please retry with 2.0-rc1 or the latest 2.0-SNAPSHOT?

lvca assigned lvca and unassigned luigidellaquila Dec 19, 2014
Author

RCNTEC commented Dec 19, 2014

I tried the query with version 2.0-rc1 and still no success. My first attempt failed with the following exception:

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "OrientDB ONetworkProtocolHttpDb listen at 0.0.0.0:2480-2490"
Exception in thread "OrientDB WAL Flush Task (DirectCostsNew)" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
Error on client connection
java.lang.OutOfMemoryError: Java heap space
        at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinary.<init>(OChannelBinary.java:57)
        at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryServer.<init>(OChannelBinaryServer.java:36)
        at com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.config(OBinaryNetworkProtocolAbstract.java:99)
        at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.config(ONetworkProtocolBinary.java:126)
        at com.orientechnologies.orient.server.network.OServerNetworkListener.run(OServerNetworkListener.java:228)

Again, I was monitoring the RAM state and there were about 19GB available. I noticed one strange thing. When the server starts it says that the heap size is 481MB:

2014-12-19 19:20:41:176 INFO  OrientDB auto-config DISKCACHE=15,218MB (heap=481MB os=32,115MB disk=30,437MB) [orientechnologies]

But when I stop it with the shutdown script it reports the heap size to be 7,138MB:

INFO: OrientDB auto-config DISKCACHE=22,929MB (heap=7,138MB os=32,115MB disk=30,437MB)

Do I need to configure the heap size manually, or will the server increase it automatically?

After this error the server stopped responding, and I had to run the shutdown script about 6 times before the server actually stopped.

Then I restarted the server and retried the query several more times and every time I got the following exception:

Error on fetching record during browsing. The record has been skipped
com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #33:734015 (cluster: actionemployeerelation)
        at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1601)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded

Now it takes a bit longer before the error occurs, about 2 minutes.

Member

lvca commented Dec 19, 2014

It seems you have only half a GB of heap configured in server.sh.

Author

RCNTEC commented Dec 19, 2014

Oh, I increased the heap size and now it works. Thank you very much!
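
For anyone hitting the same error, a small Java program can confirm how much heap a JVM was actually given. The sketch below uses only the standard `Runtime` API; `HeapReport` is a hypothetical name and is not part of OrientDB. The JVM's maximum heap is set with the standard `-Xmx` option, and the JVM does not grow the heap beyond that limit no matter how much RAM the machine has free. The exact line to edit in server.sh is not shown in this thread, so it is not reproduced here.

```java
public class HeapReport {
    /** Converts a byte count to whole megabytes. */
    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    /** Summarizes the heap limits of the JVM this code runs in. */
    static String report() {
        Runtime rt = Runtime.getRuntime();
        return "max=" + toMb(rt.maxMemory()) + "MB"      // upper bound, controlled by -Xmx
             + " total=" + toMb(rt.totalMemory()) + "MB" // heap currently reserved
             + " free=" + toMb(rt.freeMemory()) + "MB";  // unused part of the reserved heap
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Running it as `java -Xmx512m HeapReport` and then as `java -Xmx4g HeapReport` shows the limit changing. The reported maximum can come out slightly below the `-Xmx` value, which would be consistent with the heap=481MB line in the startup log above. The 7,138MB figure printed at shutdown was probably reported by the separate JVM that the shutdown script starts, not by the server process; that is an inference, not something confirmed in this thread.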

RCNTEC closed this as completed Dec 19, 2014
lvca added the wontfix label Dec 19, 2014
lvca modified the milestones: 2.0 Final, 2.0-rc2 Jan 12, 2015
lvca modified the milestone: 2.0-rc2 Aug 5, 2017