Make KM_SLEEP an alias of KM_PUSHPAGE #145

ryao · 2012-07-29T13:47:29Z

Use GFP_NOIO in KM_SLEEP

This should prevent direct reclaim issues without requiring
Linux-specific changes to code from Solaris. This is what is done in
FreeBSD.

Note that a change to __taskq_dispatch() module/spl/spl-taskq.c is
needed to make this work. Changing KM_PUSHPAGE to use GFP_NOIO is fine,
but adding __GFP_HIGH to that triggers a hard-coded panic in
__taskq_dispatch() during zvol initialization. Removing the hard coded
panic has no ill effects.

Signed-off-by: Richard Yao [email protected]

Use GFP_NOIO in KM_SLEEP This should prevent direct reclaim issues without requiring Linux-specific changes to code from Solaris. This is what is done in FreeBSD. Note that a change to __taskq_dispatch() module/spl/spl-taskq.c is needed to make this work. Changing KM_PUSHPAGE to use GFP_NOIO is fine, but adding __GFP_HIGH to that triggers a hard-coded panic in __taskq_dispatch() during zvol initialization. Removing the hard coded panic has no ill effects. Signed-off-by: Richard Yao <[email protected]>

behlendorf · 2012-07-31T21:36:18Z

The panic occurs because zvol_dispatch() uses TQ_NOSLEEP.

TQ_NOSLEEP -> KM_NOSLEEP -> GFP_ATOMIC -> (__GFP_HIGH)
TQ_SLEEP -> KM_SLEEP -> (GFP_NOIO | __GFP_HIGH)

Prior to this change the two options didn't have any bits in common, but now they both share __GFP_HIGH which causes us to incorrectly trigger the panic(). This could be changed to an ASSERT() which explicitly checks __GFP_WAIT since is primarily for the developers benefit. I'm also OK with dropping it entirely because modern kernels already include checks for this.

With this patch applied I've also noticed page allocation failures usually during module init. By preventing all IO during allocations we're making it harder for the system to get large allocations. That concerns me because the slab implementation depends on fairly large allocation sizes.

insmod: page allocation failure. order:7, mode:0xd0

behlendorf · 2012-07-31T23:37:21Z

This change also appears to destabilize the splat memory tests on all of my test builders. Presumably this is due to greatly reduced flexibly allowed to the VM on how it may satisfy large memory allocations.

ryao · 2012-08-01T04:16:24Z

@behlendorf Would you describe the conditions under which the page allocation failure was triggered? You are the only person to have reported this.

Also, would you elaborate on what you mean when you say that the splat memory tests are being destabilized?

This patch seems to improve system stability when doing swap on zvols, although I need to increase vm.min_free_kbytes and I can still deadlock the system under enough strain. My suspicion is that external memory fragmentation inside the kernel causes this and that increasing vm.min_free_kbytes permits allocations to succeed more often by decreasing the severity of memory fragmentation.

I ran the splat tests a few times. My system was under memory stress during the first test, which resulted in a deadlock in kmem:slab_lock. Two successive times after a reboot finished, but there were a few failures:

http://bpaste.net/show/37856/

The failures might explain the claims by some that ARC uses an excessive amount of memory. At this point, I think I understand the slab well enough to attempt a partial rewrite. I will try to make that a priority. My hope is that it will result in a solution to these problems.

pyavdr · 2012-08-01T04:55:21Z

@ryao
@behlendorf
Im happy to see you both working together again for a solution for the memory management of ZOL. There only can be winners if you both find a good solution.

ryao · 2012-08-16T19:41:52Z

I made three attempts at rewriting the SLAB implementation. Each one produced something that worked, but none of them solved the original problem.

pyavdr · 2012-08-19T05:15:50Z

@ryao Have you ever seen that SLUB allocator ? http://lwn.net/Articles/229984/ ... Is there any chance that this may help ?

ryao · 2012-08-20T01:03:29Z

@pyavdr I was already using it.

behlendorf · 2012-08-23T04:45:20Z

This should no longer be needed see #161 and openzfs/zfs#883

ryao mentioned this pull request Aug 1, 2012

Use Linux SLAB allocator for SPL SLAB allocations #147

Closed

behlendorf closed this Aug 23, 2012

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make KM_SLEEP an alias of KM_PUSHPAGE #145

Make KM_SLEEP an alias of KM_PUSHPAGE #145

ryao commented Jul 29, 2012

behlendorf commented Jul 31, 2012

behlendorf commented Jul 31, 2012

ryao commented Aug 1, 2012

pyavdr commented Aug 1, 2012

ryao commented Aug 16, 2012

pyavdr commented Aug 19, 2012

ryao commented Aug 20, 2012

behlendorf commented Aug 23, 2012

Make KM_SLEEP an alias of KM_PUSHPAGE #145

Make KM_SLEEP an alias of KM_PUSHPAGE #145

Conversation

ryao commented Jul 29, 2012

behlendorf commented Jul 31, 2012

behlendorf commented Jul 31, 2012

ryao commented Aug 1, 2012

pyavdr commented Aug 1, 2012

ryao commented Aug 16, 2012

pyavdr commented Aug 19, 2012

ryao commented Aug 20, 2012

behlendorf commented Aug 23, 2012