Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fscache: fix race between enablement and dropping of object
It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] neagix#6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] neagix#7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] neagix#8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 neagix#9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] neagix#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 neagix#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa neagix#12 [ffff881ff43eff18] sys_read at ffffffff811fda62 neagix#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David Howells <[email protected]>
- Loading branch information