Memory Issues with 32 Bit java #7586

Closed
EricSchreiner opened this issue Jul 26, 2017 · 49 comments

@EricSchreiner

OrientDB Version: 2.2.23

Java Version: 1.8.0_131 32Bit

OS: Windows 10

Hi @lvca ,

When I use your recommended settings with a 32-bit (1.8.0_131) runtime, I get the OutOfMemoryError immediately (with -XX:MaxDirectMemorySize=128m it just comes later).

Here are the relevant settings:
VER @ 10:42:50.358 java.runtime.version: 1.8.0_131-b11
VER @ 10:42:50.358 java.version: 1.8.0_131
VER @ 10:42:50.358 java.vm.version: 25.131-b11
VER @ 10:42:50.358 java.vm.vendor: Oracle Corporation
VER @ 10:42:50.358 java.vm.name: Java HotSpot(TM) Client VM
VER @ 10:42:50.358 java.specification.version: 1.8
VER @ 10:42:50.358 java.vm.specification.version: 1.8
VER @ 10:42:50.359 os.name: Windows 10
VER @ 10:42:50.359 os.version: 10.0
VER @ 10:42:50.359 os.arch: x86
MSG @ 10:42:50.359 java.runtime totalMemory=16mb maxMemory=1037mb freeMemory=11mb processors=8
MSG @ 10:42:50.361 java.runtime.argument: -Xmx1024m
MSG @ 10:42:50.361 java.runtime.argument: -XX:MaxDirectMemorySize=1G
MSG @ 10:42:50.361 java.runtime.argument: -Dpicapport.home=C:\ProgramData\Contecon
MSG @ 10:42:50.361 java.runtime.argument: -DTRACE=DEBUG

Here is the error during database startup:
MSG @ 10:42:52.372 PicApportDBService.createDatabaseDirectory: C:\Users\Eric\.picapport\db
MSG @ 10:42:52.373 PicApportDBService.startDatabase:plocal:C:/Users/Eric/.picapport/db/db.2.2.23
EXCEP@ ============================================================
EXCEP@ Exception at: 2017-07-26 10:42:52
EXCEP@ Msg:
EXCEP@ null
EXCEP@ ------------------------------------------------------------
EXCEP@ java.lang.OutOfMemoryError
EXCEP@ at sun.misc.Unsafe.allocateMemory(Native Method)
EXCEP@ at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)
EXCEP@ at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
EXCEP@ at com.orientechnologies.common.directmemory.OByteBufferPool.allocateBuffer(OByteBufferPool.java:328)
EXCEP@ at com.orientechnologies.common.directmemory.OByteBufferPool.acquireDirect(OByteBufferPool.java:279)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.load(OWOWCache.java:769)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.updateCache(O2QCache.java:1107)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.doLoad(O2QCache.java:346)
EXCEP@ at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.allocateNewPage(O2QCache.java:397)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperation.commitChanges(OAtomicOperation.java:434)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:468)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:412)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.endAtomicOperation(ODurableComponent.java:116)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.create(OPaginatedCluster.java:195)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.addClusterInternal(OAbstractPaginatedStorage.java:4136)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doAddCluster(OAbstractPaginatedStorage.java:4117)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.create(OAbstractPaginatedStorage.java:459)
EXCEP@ at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.create(OLocalPaginatedStorage.java:127)
EXCEP@ at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.create(ODatabaseDocumentTx.java:438)
EXCEP@ at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.create(ODatabaseDocumentTx.java:398)
EXCEP@ at de.contecon.picapport.db.PicApportDBService.createDBSchema(Unknown Source)
EXCEP@ at de.contecon.picapport.db.PicApportDBService.startDatabase(Unknown Source)
EXCEP@ at de.contecon.picapport.db.PicApportDBService.startDatabase(Unknown Source)
EXCEP@ at de.contecon.picapport.PicApport.startDatabase(Unknown Source)
EXCEP@ at de.contecon.picapport.PicApport.init(Unknown Source)
EXCEP@ at de.contecon.picapport.PicApport.main(Unknown Source)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
EXCEP@ at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
EXCEP@ at java.lang.reflect.Method.invoke(Method.java:498)
EXCEP@ at com.sun.javafx.application.LauncherImpl.launchApplicationWithArgs(LauncherImpl.java:389)
EXCEP@ at com.sun.javafx.application.LauncherImpl.launchApplication(LauncherImpl.java:328)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
EXCEP@ at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
EXCEP@ at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
EXCEP@ at java.lang.reflect.Method.invoke(Method.java:498)
EXCEP@ at sun.launcher.LauncherHelper$FXHelper.main(LauncherHelper.java:767)

@andrii0lomakin
Member

Hi @EricSchreiner what is your disk cache size?

@EricSchreiner
Author

Hi @Laa,
these are the parameters we use. Everything else is left at the defaults.


Map<String, Object> defaultsMap = new HashMap<String, Object>();
defaultsMap.put("storage.keepOpen", false);                      // Tells the engine not to close the storage when a database is closed. Storages are closed when the process shuts down
defaultsMap.put("tx.useLog", true);                              // Transactions use a log file to store temporary data to be rolled back in case of a crash
defaultsMap.put("tx.log.synch", true);                           // Executes a sync against the file system for each log entry. This slows down transactions but guarantees transaction reliability on unreliable drives
defaultsMap.put("tx.commit.synch", true);                        // Synchronizes the storage after transaction commit (see "Disable the disk sync")

defaultsMap.put("cache.level1.enabled", false);
defaultsMap.put("cache.level1.size", 0);
// ES removed Feb 2015, no longer needed since ODB 2.0.0: defaultsMap.put("cache.level2.enabled", false);
// ES removed Feb 2015, no longer needed since ODB 2.0.0: defaultsMap.put("cache.level2.size", 0);

defaultsMap.put("nonTX.recordUpdate.synch", true);               // Executes a sync against the file system on every record operation. This slows down record updates but guarantees reliability on unreliable drives
defaultsMap.put("index.auto.rebuildAfterNotSoftClose", true);    // Automatically rebuild all automatic indexes on database open when it wasn't closed properly
defaultsMap.put("mvrbtree.lazyUpdates", 1);                      // -1=auto, 0=always lazy until lazySave() is explicitly called by the application, 1=no lazy, commit at each change, >1=commit at every X changes

OGlobalConfiguration.setConfiguration(defaultsMap);

@andrii0lomakin
Member

@EricSchreiner I suppose that means you have a 4 GB disk cache, which is beyond the capabilities of a 32-bit JVM. Also, I strongly recommend not disabling the first-level cache.

About your settings:

defaultsMap.put("cache.level1.enabled", false); 
defaultsMap.put("cache.level1.size", 0)

This may cause a lot of strange exceptions in your application.

 defaultsMap.put("mvrbtree.lazyUpdates", 1); 

mvrbtree was removed from the distribution a long time ago, so this parameter is not needed.

defaultsMap.put("nonTX.recordUpdate.synch", true);
defaultsMap.put("tx.commit.synch", true);

are a legacy of the 1.x transaction implementation and are not used anymore, so you can remove them too.

defaultsMap.put("tx.useLog", true);

is always true and cannot be changed, even if you explicitly set it to false, so this parameter can be removed too.

defaultsMap.put("storage.keepOpen", false);     // Tells to the engine to not close the storage when a database is closed. Storages will be closed when the process will shutdown

is not valid anymore; this parameter is always true and cannot be changed. The same goes for defaultsMap.put("tx.log.synch", true): it is always true and cannot be changed, so you can remove it from the map.

Back to your main issue: I suggest you set com.orientechnologies.orient.core.config.OGlobalConfiguration#DISK_CACHE_SIZE to 800 (meaning 800 MB), keep -XX:MaxDirectMemorySize=1G, and set the DISK_CACHE_SIZE parameter directly rather than through the OGlobalConfiguration.setConfiguration(defaultsMap) call.
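
For illustration, a minimal sketch of what that could look like (the class name DbDefaults is just a placeholder; it assumes DISK_CACHE_SIZE is set via setValue() before any database is opened and that only the index rebuild flag is kept from the old map):

import java.util.HashMap;
import java.util.Map;
import com.orientechnologies.orient.core.config.OGlobalConfiguration;

public final class DbDefaults {
    public static void apply() {
        // Set the disk cache size directly (value in megabytes), not via setConfiguration()
        OGlobalConfiguration.DISK_CACHE_SIZE.setValue(800);

        // The only setting worth keeping from the old map; the others are obsolete in 2.2.x
        Map<String, Object> defaultsMap = new HashMap<String, Object>();
        defaultsMap.put("index.auto.rebuildAfterNotSoftClose", true);
        OGlobalConfiguration.setConfiguration(defaultsMap);
    }
}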

P.S. By the way, what is your expected DB size? Do you already have some expectations from the tests you have performed? By DB size, I mean size on disk, in GBs or MBs.

@EricSchreiner
Author

Hi @Laa
thanks for your reply. Does your answer mean that I should remove all settings with the exception of defaultsMap.put("index.auto.rebuildAfterNotSoftClose", true);?

You wrote that I should reduce DISK_CACHE_SIZE to 800 MB. Is this related to -XX:MaxDirectMemorySize? If yes, how? Can I set DISK_CACHE_SIZE and -XX:MaxDirectMemorySize to 128 MB?

For your understanding: we have thousands of users running PicApport without -XX:MaxDirectMemorySize set. Lots of them are using a Raspberry Pi with just one gig of physical memory.
So in the past we recommended setting -Xmx512m for 32-bit installations and the Raspberry Pi, which works fine with several thousand photos (we tested with 6000). What I would like to achieve is that this still works with our new version based on Orient 2.2.xx, because I expect a lot of our users will not read our release notes. (We have also created an .exe file with a Windows installer for completely inexperienced users whom I cannot ask to set any parameter.) And again, I do not care about speed in these low-memory situations; it should just work.

To answer your question: my test database contains about 50,000 photos (metadata and thumbnails).
The total size of the database directory is 880 MB.
dbconfig.txt

I also have a test system with one million photos (for this we use a 64-bit engine, but I have not tried it yet with V2.2.xx).

@andrii0lomakin
Member

@EricSchreiner I see. I suppose I can help you run the database without -XX:MaxDirectMemorySize set, but we are busy right now. I will get back to this issue next Tuesday.

@EricSchreiner
Author

OK

@taburet
Contributor

taburet commented Jul 31, 2017

Hi @EricSchreiner,

There are basically only three options that affect/limit the memory usage of OrientDB:

  1. -Xmx limits the heap size, as we all know. Usually, if not provided, it's auto-configured by the JVM to some reasonable default. It may be configured only via the JVM args; OrientDB can't control it.

  2. -XX:MaxDirectMemorySize limits the amount of off-heap "direct" memory the JVM may allocate. Usually, if not provided, it's auto-configured by the JVM to the value of -Xmx. It may be configured only via the JVM args; OrientDB can't control it.

  3. -Dstorage.diskCache.bufferSize, aka OGlobalConfiguration#DISK_CACHE_SIZE, limits the disk cache size of OrientDB. It is auto-configured by OrientDB to the value of -Xmx if -XX:MaxDirectMemorySize is not provided; otherwise it's configured to max(machine_memory_size - Xmx - 2GB, 256MB) and upper-limited to the value of -XX:MaxDirectMemorySize. The minimum supported value is 64MB. Note that this is not a hard limit: if the disk cache is full and none of its memory can be freed, so-called small overflow buffers will be allocated. Setting the disk cache size to extremely low values while performing huge queries will not help, especially in the case of update/insert queries.

The disk cache allocates memory from the JVM's off-heap "direct" memory. So to avoid OOMs, the inequality DISK_CACHE_SIZE <= -XX:MaxDirectMemorySize must always hold, and Xmx + DISK_CACHE_SIZE + memory_reserved_by_os_and_other_processes <= machine_memory_size must also hold.
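
Expressed as a quick sketch (all values in megabytes; this only restates the rules above, it is not the actual OrientDB code):

public final class DiskCacheBudget {
    // Mirrors the auto-configuration rules described in point 3 above
    static long suggestedDiskCacheMb(long machineMemoryMb, long xmxMb, long maxDirectMemoryMb) {
        long cache;
        if (maxDirectMemoryMb <= 0) {
            // -XX:MaxDirectMemorySize not provided: disk cache follows Xmx
            cache = xmxMb;
        } else {
            // max(machine_memory_size - Xmx - 2GB, 256MB), upper-limited by MaxDirectMemorySize
            cache = Math.min(Math.max(machineMemoryMb - xmxMb - 2048, 256), maxDirectMemoryMb);
        }
        return Math.max(cache, 64); // minimum supported value is 64MB
    }

    public static void main(String[] args) {
        // Example: 1GB box, Xmx=512MB, no -XX:MaxDirectMemorySize -> 512MB disk cache
        System.out.println(suggestedDiskCacheMb(1024, 512, 0) + " MB");
    }
}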

Regarding your test box with less than 2GB of RAM mentioned in the emails: try setting Xmx to 512MB and removing all other options. -XX:MaxDirectMemorySize will be auto-configured to 512MB by the JVM, and DISK_CACHE_SIZE will be auto-configured to 512MB by OrientDB. The total memory consumption of OrientDB should be around 1GB, which should leave enough RAM for the OS and other processes. But it's still better to have -XX:MaxDirectMemorySize and DISK_CACHE_SIZE set to explicit values according to the aforementioned inequalities.

In the case of a 1GB Raspberry Pi box with Xmx already configured to 512MB and neither -XX:MaxDirectMemorySize nor DISK_CACHE_SIZE set, OrientDB may eat up to 1GB of RAM. That is too much for the box. I may tune the DISK_CACHE_SIZE auto-configuration procedure to adjust for low-memory conditions, but there will still be a problem if Xmx is set so high that there is no RAM left for the disk cache. What is the typical Xmx of your Raspberry Pi users?

@EricSchreiner
Author

Hi @taburet,
thanks for your answer. I'll check and come back to you.
In the meantime: is it possible that max(machine_memory_size - Xmx - 2GB, 256MB) does not work if we run OrientDB in a 32-bit VM on a computer that has more than 16 gigs of RAM? What would machine_memory_size be in a 32-bit environment on a PC with 16 gigs of RAM?
Are you using: machine_memory_size = os.getTotalPhysicalMemorySize();

@taburet
Contributor

taburet commented Aug 3, 2017

Is it possible that max(machine_memory_size - Xmx - 2GB, 256MB) does not work if we execute OrientDB in a 32-Bit VM on a Computer that has more than 16gig of RAM?

It should work, AFAIU, but in a wrong way :) Why would it behave differently specifically at 16GB?

What would machine_memory_size be in a 32-Bit Environment with a PC with 16 Gig of Ram?

Seems like it will be 16GB and that may be a problem. Will check this.

Are you using: machine_memory_size = os.getTotalPhysicalMemorySize();

Yes, exactly.
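
For reference, a minimal snippet showing that call (it assumes the com.sun.management extension of OperatingSystemMXBean, available on the Oracle/HotSpot JVM). Note that it reports the physical RAM of the machine regardless of whether the JVM itself is 32-bit:

import java.lang.management.ManagementFactory;

public final class MachineMemory {
    public static void main(String[] args) {
        com.sun.management.OperatingSystemMXBean os =
            (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
        // On a 16GB box this prints roughly 16384 even under a 32-bit JVM
        System.out.println(os.getTotalPhysicalMemorySize() / (1024 * 1024) + " MB");
    }
}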

@taburet
Contributor

taburet commented Aug 7, 2017

@EricSchreiner did you see messages like "32 bit JVM is detected. Lowering disk cache size from X to Y" in the logs?

@EricSchreiner
Author

Hi @taburet
no, we don't see messages like "32 bit JVM is detected". I've attached a logfile that contains a configuration dump:
PicApport-32Bit.txt

@EricSchreiner
Author

@taburet one more thing about why we use 32-bit on a machine with 16 gigs of RAM: this is our test environment. The laptop I use for testing also has 32 gigs of RAM, and I also need to test installations I have received from users.

@taburet
Contributor

taburet commented Aug 7, 2017

@EricSchreiner yes, I understand your needs. The strange thing is that, according to the provided log file, no auto-configuration is done on the OrientDB side at all, but it must be done, since the disk cache is not configured. I will investigate this further.

andrii0lomakin added a commit that referenced this issue Aug 7, 2017

@EricSchreiner
Author

Hi @Laa,

see the logfile:
orient-server.log.3.txt

@andrii0lomakin
Member

Hi @EricSchreiner @robfrank, as I can see, the Lucene exception was reproduced, the same as in issue #7585 which @EricSchreiner referenced. I will mark this issue as blocked until issue #7585 is resolved. Actually, I suppose the OOM is fixed, which makes it possible to reproduce the Lucene issue, but we need to be 100% sure.

@EricSchreiner
Author

Hi @Laa, @robfrank, is it possible that I need another orientdb-spatial-2.2.23-dist.jar? If yes, where can I get the orientdb-spatial-2.2.26-dist.jar?

@robfrank
Contributor

@EricSchreiner the problem referenced in #7585 is solved as of 2.2.25. I assumed you had updated to the latest 2.2.25, so please take the latest snapshot of the spatial module as well.

@EricSchreiner
Author

@Laa now the log with orientdb-spatial-2.2.26-20170810.160653-15.jar
orient-server.log.2.txt

@andrii0lomakin
Member

So the Lucene issue still persists. Will wait for the fix.

@EricSchreiner
Author

Hi @Laa, hi @robfrank
any news on this?

@EricSchreiner
Author

Hi @Laa, hi @robfrank,
I've tested 2.2.26 GA. Looks much better :-) I'll continue testing tomorrow.
Logfile:
orient-server.log.txt
Config:
PicApport-2.2.26.txt

@andrii0lomakin
Member

@EricSchreiner OK, so it was probably just an issue with a mix of libraries of different versions. I am waiting for your final conclusion.

@andrii0lomakin
Member

Hi @EricSchreiner any update on this?

@EricSchreiner
Author

Hi @Laa, seems to be OK so far. I've attached two logfiles from the same database, started with 32-bit and 64-bit. The only thing I see is that sometimes it takes a very long time to shut down the database. This seems to be new.
orient-server-64bit.log.0.txt
orient-server-32bit.log.0.txt

@andrii0lomakin
Member

@EricSchreiner cool. What do you mean by "takes a very long time to shut down"? Does it happen on both instances or on 32-bit only?

@andrii0lomakin
Member

andrii0lomakin commented Aug 22, 2017

@EricSchreiner I will close this issue because it seems to be fixed. But please open a new issue if you think something is wrong with the shutdown; maybe it is a bug, maybe not, let's see. If you are able to create a profiler snapshot, that would be great; if not, we will provide instructions for a very good and free one, though without a handy GUI.

@andrii0lomakin
Member

@santo-it for the release notes: "On 32-bit systems, because of the high level of memory fragmentation, ODB cannot allocate memory in big chunks, so it always allocates memory with page-size granularity. This will decrease performance but will avoid throwing an OOM when allocating direct memory."
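
A rough illustration of the idea (my own sketch, not the actual ODB code): instead of requesting one large contiguous direct-memory chunk, which can fail on a fragmented 32-bit address space, the pool allocates many page-sized buffers individually:

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public final class PagedPool {
    public static void main(String[] args) {
        int pageSize = 64 * 1024;                  // hypothetical page size
        int pages = (64 * 1024 * 1024) / pageSize; // 64 MB total, allocated page by page
        List<ByteBuffer> pool = new ArrayList<ByteBuffer>(pages);
        for (int i = 0; i < pages; i++) {
            // Each allocation only needs a small contiguous block, so it can succeed
            // even when no single large free region is available
            pool.add(ByteBuffer.allocateDirect(pageSize));
        }
        System.out.println("Allocated " + pool.size() + " pages");
    }
}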

@EricSchreiner
Author

@Laa thank you for your support.
