Conversation
Use GFP_NOIO in KM_SLEEP

This should prevent direct reclaim issues without requiring Linux-specific changes to code from Solaris. This is what is done in FreeBSD.

Note that a change to __taskq_dispatch() in module/spl/spl-taskq.c is needed to make this work. Changing KM_PUSHPAGE to use GFP_NOIO is fine, but adding __GFP_HIGH to it triggers a hard-coded panic in __taskq_dispatch() during zvol initialization. Removing the hard-coded panic has no ill effects.

Signed-off-by: Richard Yao <[email protected]>
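The change described above can be sketched as a flag-translation helper. This is a minimal illustration only: the function name kmem_flags_convert and all of the bit values below are assumptions chosen for demonstration, not the actual SPL code or the kernel's real GFP constants.

```c
#include <assert.h>

/* Illustrative GFP-style bits (assumed values, NOT the kernel's real ones) */
#define MY_GFP_NOIO    0x10u  /* may sleep, but no I/O during reclaim */
#define MY_GFP_HIGH    0x20u  /* may dip into emergency reserves */
#define MY_GFP_IO      0x40u
#define MY_GFP_FS      0x80u
#define MY_GFP_KERNEL  (MY_GFP_NOIO | MY_GFP_IO | MY_GFP_FS)

/* Illustrative Solaris-style kmem flags (assumed values) */
#define MY_KM_SLEEP     0x01u
#define MY_KM_NOSLEEP   0x02u
#define MY_KM_PUSHPAGE  0x04u

/*
 * Sketch of the proposed mapping: KM_SLEEP becomes GFP_NOIO rather than
 * GFP_KERNEL, so the allocation may sleep but can never recurse into the
 * I/O path during direct reclaim (the deadlock this patch targets).
 * KM_PUSHPAGE additionally gets __GFP_HIGH to reach emergency reserves,
 * and KM_NOSLEEP maps to an atomic-style allocation (__GFP_HIGH only).
 */
static unsigned kmem_flags_convert(unsigned km_flags)
{
    if (km_flags & MY_KM_NOSLEEP)
        return MY_GFP_HIGH;                  /* atomic: no sleeping */
    if (km_flags & MY_KM_PUSHPAGE)
        return MY_GFP_NOIO | MY_GFP_HIGH;    /* sleep, no I/O, reserves */
    return MY_GFP_NOIO;                      /* KM_SLEEP: sleep, no I/O */
}
```

Note how, under this sketch, the KM_PUSHPAGE and KM_NOSLEEP mappings now share the __GFP_HIGH bit, which is what trips the hard-coded taskq panic discussed below.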
The panic occurs because zvol_dispatch() uses TQ_NOSLEEP.
Prior to this change the two options had no bits in common, but now they both share __GFP_HIGH, which causes us to incorrectly trigger the panic(). This could be changed to an ASSERT() which explicitly checks __GFP_WAIT, since it is primarily for the developer's benefit. I'm also OK with dropping it entirely, because modern kernels already include checks for this.

With this patch applied I've also noticed page allocation failures, usually during module init. By preventing all I/O during allocations we make it harder for the system to satisfy large allocations. That concerns me because the slab implementation depends on fairly large allocation sizes.

insmod: page allocation failure. order:7, mode:0xd0
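One plausible reading of the misfiring check, sketched below. The predicate names and bit values are assumptions for illustration only, not the actual __taskq_dispatch() code: the idea is that a hard-coded test keyed on __GFP_HIGH was a safe proxy for "atomic caller" while only one path set that bit, but it fires spuriously once KM_PUSHPAGE allocations carry __GFP_HIGH too. An ASSERT() on __GFP_WAIT checks the property that actually matters in a TQ_NOSLEEP dispatch.

```c
#include <assert.h>

/* Illustrative GFP-style bits (assumed values, NOT the kernel's real ones) */
#define MY_GFP_WAIT  0x10u   /* allocation may sleep */
#define MY_GFP_HIGH  0x20u   /* allocation may use emergency reserves */

/* Old hard-coded check (as read here): panic whenever the allocation
 * flags carry __GFP_HIGH, which used to identify exactly one caller. */
static int old_check_panics(unsigned alloc_flags)
{
    return (alloc_flags & MY_GFP_HIGH) != 0;
}

/* Suggested replacement: assert only that a TQ_NOSLEEP dispatch never
 * performs an allocation that is allowed to sleep (__GFP_WAIT set). */
static int new_check_fails(unsigned alloc_flags)
{
    return (alloc_flags & MY_GFP_WAIT) != 0;
}
```

Under this sketch, a KM_PUSHPAGE-style allocation (__GFP_HIGH without __GFP_WAIT) trips the old check but passes the new one, while a genuinely sleeping allocation still fails the ASSERT.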
This change also appears to destabilize the splat memory tests on all of my test builders. Presumably this is due to the greatly reduced flexibility allowed to the VM in how it may satisfy large memory allocations.
@behlendorf Would you describe the conditions under which the page allocation failure was triggered? You are the only person to have reported this. Also, would you elaborate on what you mean when you say that the splat memory tests are being destabilized?

This patch seems to improve system stability when doing swap on zvols, although I need to increase vm.min_free_kbytes, and I can still deadlock the system under enough strain. My suspicion is that external memory fragmentation inside the kernel causes this, and that increasing vm.min_free_kbytes permits allocations to succeed more often by decreasing the severity of memory fragmentation.

I ran the splat tests a few times. My system was under memory stress during the first test, which resulted in a deadlock in kmem:slab_lock. Two successive runs after a reboot finished, but there were a few failures. The failures might explain the claims by some that ARC uses an excessive amount of memory.

At this point, I think I understand the slab well enough to attempt a partial rewrite. I will try to make that a priority. My hope is that it will result in a solution to these problems.
@ryao |
I made three attempts at rewriting the SLAB implementation. Each one produced something that worked, but none of them solved the original problem.
@ryao Have you seen the SLUB allocator? http://lwn.net/Articles/229984/ ... Is there any chance that it may help?
@pyavdr I was already using it. |
This should no longer be needed; see #161 and openzfs/zfs#883.