
Memory consumption with large archives #122

Closed
jose1711 opened this issue Nov 18, 2019 · 36 comments

Comments

@jose1711

My primary system has 8 GB of RAM, and the RAR archive I am mounting is larger than 350 GB. While rar2fs does not complain, any attempt to list the mount point or change into it slowly consumes all available RAM (without touching swap) and ultimately results in a core dump.

I'd like to know whether there is a way to overcome this issue, e.g. at the expense of speed, or at least how much memory is needed for large archives.

@hasse69
Owner

hasse69 commented Nov 18, 2019

Thanks for the issue report.
I have a few follow-up questions:

  • When you say the RAR archive is more than 350 GB, I assume you mean you have many archives that together add up to that, right? Not that you have one (1) archive of that size!?
  • When you say slowly, do you mean that memory usage increases every time you list the mount point?
  • What version of rar2fs are you using?
  • Did you experience this problem before or not until recently?
  • Can you please provide the mount options you use?
  • What platform are you running on?

Since this sounds like a memory leak it is a bug, so I cannot currently think of any workaround until the root cause is found.

@jose1711
Author

* When you say the RAR archive is more than 350 GB, I assume you mean you have many archives that together add up to that, right? Not that you have one (1) archive of that size!?

No, it really looks like this:

$ du -sh archive.rar
362G	archive.rar
* When you say slowly, do you mean that memory usage increases every time you list the mount point?

It's not even possible to finish the list operation:

rar2fs archive.rar mount
ls mount
# The ls command takes ages and I can observe how the memory consumption
# slowly increases until free memory is depleted and the command fails.
# This also leaves the mount point broken.
* What version of rar2fs are you using?
$ rar2fs -V
rar2fs v1.27.2 (DLL version 8)    Copyright (C) 2009 Hans Beckerus
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it under
certain conditions; see <http://www.gnu.org/licenses/> for details.
FUSE library version: 2.9.9
fusermount version: 2.9.9
using FUSE kernel interface version 7.19
* Did you experience this problem before or not until recently?

I've never experimented with such large archives, so I can't really tell whether this might be a regression. Please note that listing the archive contents takes several minutes:

$ time unrar l archive.rar >/dev/null
unrar l archive.rar > /dev/null  3,89s user 11,35s system 3% cpu 6:41,19 total

* Can you please provide the mount options you use?

I used the defaults as you can see above.

* What platform are you running on?

Arch Linux, current

@hasse69
Owner

hasse69 commented Nov 18, 2019

Ok, very interesting. I cannot say that specific use-case has been tested before.
If you list this archive using unrar, how many files would you say it contains?
If it takes minutes to list it using unrar, I can only guess we are touching a use-case here that has not even been tried by the library itself. We cannot rule out that the problem is in fact in the library rather than rar2fs.

Btw, from the output you presented from unrar l, it looks like it takes seconds, not minutes?

@hasse69
Owner

hasse69 commented Nov 18, 2019

Unless there is some sensitive information in that archive, could you please try to rebuild rar2fs using --enable-debug=5 and try again using the -f switch? We need to get some visuals on what is going on here. Otherwise you can always mail me the log (which will be huge, of course).

Also, please try v1.28.0; I need to understand whether this might be related to some of the problems that were discovered with v1.27.2 recently.

@jose1711
Author

Ok, very interesting. I cannot say that specific use-case has been tested before.
If you list this archive using unrar, how many files would you say it contains?

Approx. 170,000 files.

Btw, from the output you presented from unrar l, it looks like it takes seconds, not minutes?

You should be looking at the total, so it took 6 minutes 41 seconds.

@hasse69
Owner

hasse69 commented Nov 18, 2019

You should be looking at the total, so it took 6 minutes 41 seconds.

Yea, my bad.

This is definitely not a use-case that has been tried before. It also does not sound like a memory leak from where I stand. Something is requiring a lot of resources bound to the number of files.
The problem is that rar2fs does not have the luxury that unrar has of presenting things on-the-fly. The library does not allow that, so all files need to be collected prior to presentation. Handling this does not seem like an impossibility, but at least this is what makes the whole thing go south.

@hasse69
Owner

hasse69 commented Nov 18, 2019

Ok, I did some quick calculations. Just keeping the files in the cache will take ~2 GB of memory.
And that is simply the internal file cache in rar2fs. To that you need to add other heap overhead (e.g. mallocs) performed by the library itself, and probably something I have not even considered, like things in FUSE/Linux etc. I do not wish to be a party pooper here, but this is probably not even plausible to achieve due to the persistent storage required for 170k files.
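
For reference, a back-of-the-envelope version of that calculation as a small C program; the per-entry size is a hypothetical average (file name, stat data and allocator bookkeeping), not a figure taken from the rar2fs sources:

#include <stdio.h>

int main(void)
{
        /* Assumed average footprint per cached entry: roughly the order
         * of magnitude needed for a path string, stat data and heap
         * bookkeeping. Not measured from rar2fs. */
        const double entry_bytes = 12.0 * 1024;
        const double n_files = 170000.0;

        printf("estimated cache size: %.2f GiB\n",
               entry_bytes * n_files / (1024.0 * 1024.0 * 1024.0));
        return 0;
}

With those assumed numbers this prints roughly 1.95 GiB, which is where the ~2 GB estimate comes from.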

@hasse69
Owner

hasse69 commented Nov 18, 2019

Maybe, just maybe, we could experiment with some archive scanner that only searches files down to a certain directory depth. It would require a lot more processing though, so I am not really sure it is worth it. Also, I am rather certain the use-case you are presenting here is not one of the more common ones, which of course makes it even more difficult to motivate such a feature. What I do not like is that you get a crash. Is it the OOM killer you hit, or something else?

EDIT: If it is something else, can you try to catch it in gdb or something so that we can check where exactly it fails? We are supposed to have checks everywhere for memory allocation failures, but they might still be missing in a few places.

@hasse69
Owner

hasse69 commented Nov 22, 2019

This is where the problem might be:

        if (!(n_files = RARListArchiveEx(hdl, next, &dll_result))) {
                RARCloseArchive(hdl);
                pthread_mutex_unlock(&file_access_mutex);
                if (dll_result == ERAR_EOPEN || dll_result == ERAR_END_ARCHIVE)
                        return 0;
                return 1;
        }
...
        while (next) {
...
        }

It is not something I can confirm yet, but the pre-collection of all files before processing might not be such a good idea if the archive holds a lot of files, like in this case. I will try to find some time to look at it. I will need your help, though, to verify whether it works or not. The time it takes to extract the information is nothing we can improve, though. If it takes ~6(!!) minutes using unrar, it is going to take at least that for rar2fs as well until the cache is in effect.
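
To illustrate the idea (this is a minimal sketch of the per-entry approach, not the actual patch), the loop below lists entries one at a time through the unrar library's dll.hpp interface (RAROpenArchiveEx, RARReadHeaderEx, RARProcessFile) and reuses a single header object instead of pre-collecting everything in a linked list. It assumes whatever platform typedefs the unrar build already provides, as rar2fs's own build does:

#include <stdio.h>
#include <string.h>
#include <dll.hpp>   /* unrar library API */

static int list_one_by_one(const char *path)
{
        struct RAROpenArchiveDataEx arc;
        struct RARHeaderDataEx hdr;   /* single, reused header object */

        memset(&arc, 0, sizeof(arc));
        arc.ArcName = (char *)path;
        arc.OpenMode = RAR_OM_LIST;

        HANDLE h = RAROpenArchiveEx(&arc);
        if (!h || arc.OpenResult)
                return 1;

        memset(&hdr, 0, sizeof(hdr));
        /* Handle each entry as soon as its header is read rather than
         * keeping 170k list nodes resident until processing starts. */
        while (RARReadHeaderEx(h, &hdr) == 0) {
                printf("%s\n", hdr.FileName);
                if (RARProcessFile(h, RAR_SKIP, NULL, NULL))
                        break;
        }
        RARCloseArchive(h);
        return 0;
}

int main(int argc, char **argv)
{
        return argc > 1 ? list_one_by_one(argv[1]) : 1;
}

Memory usage then scales with one header object instead of with the number of files, although the time needed to walk all headers of course stays the same.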

@hasse69
Owner

hasse69 commented Nov 23, 2019

Please try this patch on master/HEAD and report back.

From the root of the rar2fs repo do:
patch -p1 < issue122_2.patch.txt

issue122_2.patch.txt

EDIT: Note, patch was changed.

hasse69 mentioned this issue Nov 24, 2019
@hasse69
Owner

hasse69 commented Nov 24, 2019

Note that this patch is needed, so please report back as soon as you have been able to test it. There are some more issues with very large archives, as explained in issue #124, but without this patch memory most likely runs out before those problems manifest themselves.

@jose1711
Author

Sorry for the delayed response. I was able to mount and list the mount point with both the patched and unpatched version of rar2fs (1.28.0). In order to do so I rebooted to the multi-user target, stopped unnecessary services, disabled swap (so that memory consumption could be determined more easily) and performed the test in a text console. At the start I had around 7.1 GB of memory free.

Both versions were compiled with debugging enabled and run with the -f switch. Once mounted, I ran ls -f in a bash shell several times (to see the effects of the cache) and measured the time I had to wait for the output. Each test was repeated twice, starting from a freshly booted system. The HDD holding the archive is a WD Red 3 TB (5400 rpm).

Here are my observations:

  • the difference in consumed memory between the patched and non-patched version was minimal; both consumed around 3.8 GB
  • first ls -f with the non-patched version: 58.5 minutes; patched: 80.5 minutes (yes, the patched version actually performed worse)
  • repeated ls -f with both the patched and non-patched version: 4-6 seconds
  • ls -l: approx. a minute for both versions
  • copying a single file to /tmp took only a couple of seconds for both versions
  • unrar l archive.rar >/dev/null (after a reboot) took 11 minutes

@hasse69
Owner

hasse69 commented Nov 25, 2019

Thanks for the report.

It seems odd that the patched version is that much slower than the original version?
There should not be much of a difference, with the exception that the patched version calls a function 170k times (or whatever the number of files is) more. Sure, that is a bit more expensive, but I did not expect it to be of this magnitude. What you avoid with the patch is instead a lot of expensive heap allocations that stay resident in memory over a longer period of time, since it re-uses the same one-time allocated buffer. With the original version you should see a spike in memory usage during an ls operation that later drops. The patched version should slowly increase its memory footprint but never spike and then drop significantly. The observation that the memory footprint is the same after the ls operation is what I would expect.

I would appreciate it if you could do a few more benchmarks of the patch. Again, it should not perform that badly compared to the original version.

Also, turn off debugging.

@hasse69
Owner

hasse69 commented Nov 25, 2019

I did another benchmark using a recursive ls operation on an archive with ~11k files, and it performed 15-20% better with the patch than without. I do not recommend you do a recursive operation on your archive because that would take days :( But it is interesting to note that performance in this case improved a lot, which is actually what I expected, not the opposite. The cache was not in effect during the benchmark.

@jose1711
Author

Did I mention that my archive is flat? There are no subdirs, just files, so ls -lR and ls -l are really equivalent.

@hasse69
Owner

hasse69 commented Nov 25, 2019

Ok, that is good to know. But it still should not perform worse with the patch. I cannot explain that, especially since I observe the exact opposite.

@jose1711
Author

I am thinking about the best way to observe the effect of the patch on memory. Can you recommend something?

@hasse69
Owner

hasse69 commented Nov 25, 2019

What you could try, if you haven't already, is something like htop.
Without the patch you should see a huge increase in memory footprint, which should then drop down to a certain level; let's call that level X. With the patch you should never go much beyond level X, but memory usage should slowly increase until it is reached.

@jose1711
Author

jose1711 commented Nov 27, 2019

I used the attached scripts to produce graphs of the first invocation of the ls -f and ls -l commands for both versions. Please note that the x-axis shows 2-second samples. Update: benchmarks with debugging disabled will follow.

collect_and_plot.zip

nonpatched_first_ls-f
patched_first_ls-f
nonpatched_ls-l
patched_ls-l

@jose1711
Author

The same runs without debugging enabled:
non_patched_ls-f
patched_ls-f
non_patched_ls-l
patched_ls-l

@hasse69
Owner

hasse69 commented Nov 27, 2019

Very nice graphs :)

But something seems inconsistent.

In the first runs it looks like the patched version is using a lot less memory, but then suddenly the patched version starts to show that it is using more memory!? I truly do not understand this. Given what the patch does, it cannot/should not consume more memory than the non-patched version.
I honestly have no answer right now as to why you see this behavior.

@jose1711
Author

Memory requirements aside, do you have any idea why it takes more time for the patched version to do the same job?

@hasse69
Owner

hasse69 commented Nov 28, 2019

No, sorry, it really makes no sense :( I see the complete opposite effect of the patch here.

I have never seen such a huge flat archive before. Just the fact that it takes minutes even for unrar to list the files indicates that applying a file system on top of it is perhaps not such a good idea after all.

Maybe the speed difference also boils down to memory requirements?
But again, to me it seems impossible that the patched version would require more memory.
It basically removes ~170,000 continuous and persistent heap allocations in your case.
But your graphs contradict that, because what I would have expected with the non-patched version is a huge rise and then a drop in used memory. But then it is also a matter of whether the memory we see in the graph is reserved or actually used resident memory.

@hasse69
Owner

hasse69 commented Nov 28, 2019

Btw, was the ls -l always run after ls -f?
I am curious because if the x-axis is 2-second samples, it looks like we have a memory leak?
It seems to slowly increase, or is it only me?

What happens if you do ls -l repeatedly after the cache has been populated? Is the memory usage slowly increasing? It certainly looks like it in the graph.

@jose1711
Author

| Btw, was the ls -l always run after ls -f?

Yes, it was. Maybe I can run it again like this: date && ls -f; sleep 60; date && ls -l; sleep 60; date && ls -l; sleep 60; date && ls -l; sleep 60; date && ls -l, while capturing the whole period.

@hasse69
Owner

hasse69 commented Nov 28, 2019

Why do you use date in the loop?
I would do something like:
date; ls -f; while true; do sleep 60; date; ls -l; done;
And let it run all night :)

@hasse69
Owner

hasse69 commented Nov 28, 2019

Can you please try to mount using the -s (single-threaded) flag?
I see very different behavior on my system, at least, between the two. Without -s it almost looks like we are leaking memory. It can also be imaginary, since more threads are spawned by FUSE and they all eat memory as they stay persistent (up to some limit) over time even if they are not used.

@jose1711
Author

Why do you use date in the loop?
I would do something like:
date; ls -f; while true; do sleep 60; date; ls -l; done;
And let it run all night :)

Well, the idea was to have reference points so that we could compare them to the memory report. But sure, a long-running loop is a better idea.

@hasse69
Owner

hasse69 commented Nov 29, 2019

I ran overnight with -s and there was no growing curve at all, rock solid.
I need to make a run without -s too; the run I did yesterday was too short to be conclusive.

Note that once the files have been listed, they are all in the cache. Repeated list operations only pick data from the cache and thus no memory is really allocated. That is why I am surprised to see that running without -s seems to cause memory usage to grow continuously while that is not the case for single-threaded mode.

@hasse69
Owner

hasse69 commented Nov 29, 2019

I have executed some more extensive tests now and I think we can exclude a memory leak. But it is worth mentioning that running in multi-threaded mode takes (in my tests, I need to point out) about 6-7 times more virtual and about twice as much resident memory. That is not completely unexpected, since the additional threads created and maintained by FUSE do not come for free.

Note that the memory overhead in multi-threaded mode is only for system/library internal bookkeeping etc., not really related to rar2fs. The amount of memory required for the file and directory cache stays the same irrespective of which mode is used.

@jose1711
Author

I have run the patched version overnight without -s too, and I am not seeing any symptoms of leaking memory either. Actually, the memory usage was dropping slightly during the 12-hour test. My suggestion is to close the issue, maybe with a note somewhere in the README that archives with a huge number of files may be quite demanding on memory when mounted.

@hasse69
Owner

hasse69 commented Nov 30, 2019

Did you try with -s too, to see if there was any noticeable difference for you?
Also, I am still a bit concerned about the crash you got; is this something you can reproduce?

hasse69 pushed a commit that referenced this issue Nov 30, 2019
When scanning an archive for files, a linked list is created with all
files and their properties before being processed by file system
functions such as readdir. This causes some memory overhead since a lot
of data needs to be kept resident for a longer period of time. Since the
lifetime of the collected data is relatively short, there is no need
to pre-fetch all information like this. Instead, handle file by file
and use only a single temporary object to hold whatever metadata is
necessary. Performance is also expected to improve with a change
like this since fewer dynamic heap allocations are required, but it also
results in a loop unwind that increases the number of function calls.
Measurements of some common use-cases indicated a performance increase
of approximately 15%-20%, but there are also reports of no improvement
at all or even the opposite. The latter should however be considered a
rare and exceptional case.

This change was triggered by issue #122, in which a very large archive
with more than 100k files was mounted.

Signed-off-by: Hans Beckerus <hans.beckerus at gmail.com>
@jose1711
Author

I saw a slight increase in memory usage with the patched version during a 10-hour test with the -s switch. Not sure whether this is something to be worried about, though.
image
Today I really tried to get a core dump with either the patched or the non-patched version, but I was not successful. Maybe it's time to close the issue.

@hasse69
Owner

hasse69 commented Dec 11, 2019

Whether this is something to worry about depends heavily on which memory usage is really increasing.
I would be slightly worried if it were RES (resident memory), since that is memory actually used by the rar2fs process. But VIRT (virtual or shared memory) can really be anything. I did not see any RES memory increase in my tests though. Actually, I did not see any increase at all past the point where it hit its maximum.
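
One way to check which of the two is actually growing for the rar2fs process (a sketch, assuming Linux and its /proc/<pid>/status interface) is to sample VmSize (virtual) and VmRSS (resident) directly every few seconds and plot them over time:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Print VmSize (virtual) and VmRSS (resident) for a given pid, as
 * reported by the Linux /proc file system. */
static int print_mem(long pid)
{
        char path[64], line[256];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%ld/status", pid);
        f = fopen(path, "r");
        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f)) {
                if (!strncmp(line, "VmSize:", 7) ||
                    !strncmp(line, "VmRSS:", 6))
                        fputs(line, stdout);
        }
        fclose(f);
        return 0;
}

int main(int argc, char **argv)
{
        /* Pass the pid of the rar2fs process; defaults to our own pid. */
        return print_mem(argc > 1 ? atol(argv[1]) : (long)getpid());
}

A growing VmRSS would point at the process itself, while a growing VmSize alone can just be reserved address space.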

@jose1711
Author

I don't really see a RES column in the data provided by collectl. What I get are values for these:

[MEM]Tot [MEM]Used [MEM]Free [MEM]Shared [MEM]Buf [MEM]Cached [MEM]Slab
[MEM]Map [MEM]Anon [MEM]AnonH [MEM]Commit [MEM]Locked [MEM]SwapTot
[MEM]SwapUsed [MEM]SwapFree [MEM]SwapIn [MEM]SwapOut [MEM]Dirty [MEM]Clean
[MEM]Laundry [MEM]Inactive [MEM]PageIn [MEM]PageOut [MEM]PageFaults
[MEM]PageMajFaults [MEM]HugeTotal [MEM]HugeFree [MEM]HugeRsvd [MEM]SUnreclaim

hasse69 pushed a commit that referenced this issue Dec 29, 2019
@hasse69
Owner

hasse69 commented Jan 4, 2020

I think this issue can be closed, but there might be reasons to bring up this topic again later.
Closing.

hasse69 closed this as completed Jan 4, 2020
hasse69 pushed a commit that referenced this issue Jan 11, 2020