High memory usage during backup of big disk images #8
Comments
Hi, is any exception shown by the program? Do you have the latest output? It seems like something is not being freed correctly. Either it's the big number of extents the VM had (the extent array is kept in memory and processed multiple times), or there is some other place that first reads into memory before writing. Requests to the NBD server are limited to 32 MB, so it can't be the reading/writing from the NBD server during backup.
No exceptions; the progress bar stopped at ~116 GB and the whole server froze because of OOM.
Attached is the log from the same VM with --verbose, interrupted with Ctrl-C when ~10 GB of RAM was consumed.
Thanks. The number of extents doesn't seem to be the problem:
that's just a few kilobytes, and it is already backing up data past that point.
So it seems that somewhere in the code that reads/writes to the file handle, something is not being freed correctly. I'm not really sure whether opening the target file without a buffer would help, but can you try the change from the nobuffer branch: https://github.com/abbbi/virtnbdbackup/compare/nobuffer — maybe it helps. Currently I don't have a test environment with such a big disk. What would probably really help to analyze the issue is to run virtnbdbackup under python3's memory_profiler.
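For a dependency-free first pass, Python's stdlib tracemalloc can localize growth in much the same way as memory_profiler. This is only a sketch: backup_disk_stub below is a hypothetical stand-in for backupDisk that deliberately keeps its buffers, to show how the offending line surfaces in the statistics.

```python
import tracemalloc

def backup_disk_stub():
    # Hypothetical stand-in for backupDisk(): allocates a buffer per
    # "extent" and (deliberately) keeps the references, simulating a leak.
    kept = []
    for _ in range(100):
        kept.append(bytearray(64 * 1024))  # 64 KiB per "extent"
    return kept

tracemalloc.start()
data = backup_disk_stub()
snapshot = tracemalloc.take_snapshot()
# The biggest allocation site points straight at the leaking line.
top = snapshot.statistics("lineno")[0]
print(top)
```

Running the real tool under `mprof run` (from memory_profiler) gives a per-line view of backupDisk instead, but the tracemalloc snapshot already shows which source line accumulates memory.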
I'm experimenting with muppy, got:
Can't find a way to get variable names in muppy; any suggestions?
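muppy summaries report only types and sizes, not names. As a stdlib fallback, one can scan gc referrers for dictionary entries (module or object namespaces) that point at a suspect object; names_of below is a hypothetical helper, not part of muppy.

```python
import gc

def names_of(obj):
    # Best-effort: walk the dicts that reference obj (module globals,
    # instance __dict__s) and collect the keys bound to it.
    names = []
    for referrer in gc.get_referrers(obj):
        if isinstance(referrer, dict):
            names.extend(k for k, v in referrer.items() if v is obj)
    return names

big = bytearray(1024)
print(names_of(big))  # the module-level name "big" is among the results
```

This only recovers names that live in tracked dictionaries, so locals of a running frame may be missed, but it is usually enough to identify which global or attribute is holding a large buffer alive.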
Thanks, that doesn't look too bad to me: nowhere near 10 GB of memory, and it is past the point where it is OOM'ing. Maybe it would help to use memory_profiler; it would allow profiling only the backupDisk function. Does the change from the nobuffer branch help (even if I think it is unlikely to)?
Hm, I don't know why. At the point where you added the profiling, the extents have already been processed, and if the extent handler were at fault, you would also be able to reproduce the OOM with the
memory_profiler output on a small VM (~3 GB):
Thanks. Actually it is incrementing here:
Can you reproduce it in your environment? If not, it could be a libnbd issue, for example.
Hi again, I've got a new test setup with a virtual machine that has only 2 GB of RAM and a virtual disk with 21 GB of used data.
Also, looking at the debug output, I could see that there were quite big blocks processed during the backup. This one would have ended up in the chunked read function:
which at its peak also did not exceed any memory limit:
So either this is an issue with the libnbd version you are using, or it is along the lines of what we discussed before. There are some references to memory leaks in the Python-related libnbd code:
but they are fixed in version 1.5.2, and I'm not sure whether the fixes have been backported to the CentOS version. The other theory would be that memory increases the more often the backup process has to read data: my example has only 25 extents because the disk is not really fragmented, while your disk has 22k extents to back up.
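The 32 MB request cap mentioned above can be sketched as a generator that keeps only one chunk resident at a time. The names here are hypothetical and virtnbdbackup's actual code differs; read_at stands in for a libnbd-style pread call.

```python
MAX_REQUEST = 32 * 1024 * 1024  # 32 MiB cap per NBD request

def chunked_read(read_at, offset, length, chunk=MAX_REQUEST):
    """Yield a big extent in fixed-size chunks so at most one chunk
    is resident at a time; read_at(offset, size) -> bytes."""
    done = 0
    while done < length:
        size = min(chunk, length - done)
        yield read_at(offset + done, size)
        done += size

# Usage with an in-memory stand-in for the NBD handle:
blob = bytes(range(256)) * 1024  # 256 KiB of test data
read_at = lambda off, n: blob[off:off + n]
parts = list(chunked_read(read_at, 0, len(blob), chunk=64 * 1024))
print(len(parts), b"".join(parts) == blob)  # 4 True
```

With this structure, peak memory per extent is bounded by the chunk size regardless of extent count, which is why a leak below this layer (e.g. in the bindings' pread) is the more plausible culprit.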
Thanks a lot, going to try a more recent libnbd version.
I've checked the CentOS 8 source RPMs for libnbd, and they have indeed backported the fix for the pread function,
which fixes exactly what you are facing:
so it's pretty clear to me this issue happens because you are using an older libnbd version without that fix applied.
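Since the pread leak fix landed upstream in 1.5.2, a startup sanity check could warn on older versions. This is only a sketch with a hypothetical helper; as noted above, distributions such as CentOS 8 may backport the fix into older version numbers, so a version comparison alone can produce false negatives.

```python
def version_tuple(v):
    # "1.5.2" -> (1, 5, 2) for lexicographic comparison
    return tuple(int(x) for x in v.split("."))

FIXED_IN = (1, 5, 2)  # upstream release carrying the pread leak fix

def pread_leak_fixed(libnbd_version):
    # Caveat: distro packages may carry the fix as a backport, in
    # which case this check under-reports.
    return version_tuple(libnbd_version) >= FIXED_IN

print(pread_leak_fixed("1.2.2"), pread_leak_fixed("1.6.0"))  # False True
```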
Yes, after libnbd upgrade everything went fine |
Hi
Noticed an OOM with a big amount of data:
Apr 28 23:17:04 d02l kernel: Tasks state (memory values in pages):
Apr 28 23:17:04 d02l kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
...
Apr 28 23:17:05 d02l kernel: [ 378468] 0 378468 28402621 27067014 227512320 1300054 0 virtnbdbackup
...
Apr 28 23:17:05 d02l kernel: Out of memory: Killed process 378468 (virtnbdbackup) total-vm:113610484kB, anon-rss:108264172kB, file-rss:3884kB, shmem-rss:0kB, UID:0 pgtables:222180kB oom_score_adj:0
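The task-table values above are in 4 KiB pages, so they can be cross-checked against the kB figures in the kill message:

```python
PAGE_KB = 4  # default x86_64 page size: 4 KiB

total_vm_pages = 28402621  # from the oom-killer task table above
rss_pages = 27067014

print(total_vm_pages * PAGE_KB)  # 113610484 kB, matches total-vm
print(rss_pages * PAGE_KB)       # 108268056 kB = anon-rss + file-rss
```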
It was a VM with 150+ GB of data to back up.
Can you reproduce it in your environment ?
I have a --verbose log (from another run, hit Ctrl-C at ~10+ GB VSZ) if needed.