
ddt_object_remove (from ddt_sync_entry()) panics. #4051

Closed
BenEstrabaud opened this issue Nov 27, 2015 · 2 comments

Comments

@BenEstrabaud

Hi,

After starting a Volume block device (a Volume similar to LVM) and calling "zpool import" to import the previously created pool on top of that Volume (a single-vdev pool), I get a "PANIC" from the ZFS driver in what appears to be the deduplication code:

<0>[ 1390.005410] VERIFY(ddt_object_remove(ddt, otype, oclass, dde, tx) == 0) failed
<0>[ 1390.013567] PANIC at ddt.c:1106:ddt_sync_entry()
<4>[ 1390.018797] Showing stack for process 24500
<4>[ 1390.023570] CPU: 0 PID: 24500 Comm: txg_sync Tainted: P O 4.1.5 #15
<4>[ 1390.031969] Hardware name: Newisys NDS_SB1EA/NDS_SB1EA, BIOS 9.20 07/28/2015
<4>[ 1390.031971] ffffffffa0695cff ffff881eb48d79e8 ffffffff816b4518 0000000000000000
<4>[ 1390.031972] ffff881eb48d7a18 ffff881eb48d79f8 ffffffffa0538fe4 ffff881eb48d7b88
<4>[ 1390.031973] ffffffffa05391b7 ffff881eb48d7ab8 ffffffffa069f93f 6428594649524556
<4>[ 1390.031974] Call Trace:
<4>[ 1390.031984] [] dump_stack+0x48/0x60
<4>[ 1390.031989] [] spl_dumpstack+0x44/0x50 [spl]
<4>[ 1390.031992] [] spl_panic+0xa7/0xe0 [spl]
<4>[ 1390.032009] [] ? fzap_byteswap+0x831/0x850 [zfs]
<4>[ 1390.032016] [] ? dbuf_rele+0x40/0x50 [zfs]
<4>[ 1390.032021] [] ? dmu_buf_rele+0xe/0x10 [zfs]
<4>[ 1390.032034] [] ? zap_unlockdir+0x4a/0xc0 [zfs]
<4>[ 1390.032036] [] ? spl_kmem_cache_free+0x16c/0x1e0 [spl]
<4>[ 1390.032044] [] ddt_sync+0x2e6/0xa40 [zfs]
<4>[ 1390.032057] [] ? zio_buf_free+0x251/0x12f0 [zfs]
<4>[ 1390.032068] [] ? zio_wait+0x15b/0x1e0 [zfs]
<4>[ 1390.032081] [] spa_sync+0x3bb/0xe80 [zfs]
<4>[ 1390.032083] [] ? __wake_up_common+0x59/0x90
<4>[ 1390.032084] [] ? __wake_up+0x53/0x70
<4>[ 1390.032087] [] ? getrawmonotonic64+0x3f/0xd0
<4>[ 1390.032099] [] txg_init+0x614/0x890 [zfs]
<4>[ 1390.032102] [] ? kfree+0x108/0x140
<4>[ 1390.032112] [] ? txg_init+0x240/0x890 [zfs]
<4>[ 1390.032115] [] ? __thread_create+0x160/0x1f0 [spl]
<4>[ 1390.032118] [] __thread_create+0x1d8/0x1f0 [spl]
<4>[ 1390.032120] [] ? __thread_create+0x160/0x1f0 [spl]
<4>[ 1390.032122] [] kthread+0xce/0x100
<4>[ 1390.032124] [] ? kthread_freezable_should_stop+0x70/0x70
<4>[ 1390.032125] [] ret_from_fork+0x42/0x70
<4>[ 1390.032127] [] ? kthread_freezable_should_stop+0x70/0x70
<6>[ 1428.259556] ixgbe 0000:01:00.1 eth3: NIC Link is Down

This issue only occurred once, after extensive export/import testing (tens of iterations).

It looks like a "BUG_ON"-style condition, where an assertion fails (ddt_object_remove(ddt, otype, oclass, dde, tx) == 0). Any idea what could be causing this?
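
(For reference, SPL's VERIFY is indeed close in spirit to the kernel's BUG_ON: when the condition evaluates false, the syncing thread panics. Below is a minimal user-space sketch of that behaviour; VERIFY_SKETCH and fake_ddt_object_remove are hypothetical stand-ins, not the real SPL/ZFS code.)

/*
 * Minimal user-space sketch of a VERIFY-style check (hypothetical,
 * not the real SPL macro): if the condition is false, panic -- in the
 * kernel this is spl_panic(), which produced the log above.
 */
#include <stdio.h>
#include <stdlib.h>

#define VERIFY_SKETCH(cond)                                            \
	do {                                                           \
		if (!(cond)) {                                         \
			fprintf(stderr,                                \
			    "VERIFY(%s) failed\nPANIC at %s:%d\n",     \
			    #cond, __FILE__, __LINE__);                \
			abort();        /* kernel: spl_panic() */     \
		}                                                      \
	} while (0)

/* Hypothetical stand-in for ddt_object_remove() returning an errno. */
static int
fake_ddt_object_remove(void)
{
	return (5);     /* EIO, for example */
}

int
main(void)
{
	/*
	 * Aborts much like the PANIC above, but the message only shows
	 * the failed expression, not which errno actually came back.
	 */
	VERIFY_SKETCH(fake_ddt_object_remove() == 0);
	return (0);
}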

@behlendorf behlendorf added this to the 0.7.0 milestone Dec 10, 2015
@behlendorf
Contributor

@BenEstrabaud it would be great if you could tweak the code slightly so we can get the errno that's returned. The change below should do it; with the actual error value we may be able to say why this is happening.

diff --git a/module/zfs/ddt.c b/module/zfs/ddt.c
index 12c1b73..9df77cf 100644
--- a/module/zfs/ddt.c
+++ b/module/zfs/ddt.c
@@ -1103,7 +1103,7 @@ ddt_sync_entry(ddt_t *ddt, ddt_entry_t *dde, dmu_tx_t *tx,

        if (otype != DDT_TYPES &&
            (otype != ntype || oclass != nclass || total_refcnt == 0)) {
-               VERIFY(ddt_object_remove(ddt, otype, oclass, dde, tx) == 0);
+               VERIFY0(ddt_object_remove(ddt, otype, oclass, dde, tx));
                ASSERT(ddt_object_lookup(ddt, otype, oclass, dde) == ENOENT);
        }

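For illustration, here is a minimal user-space sketch of why the VERIFY0 form helps, assuming it reports the failing value the way the SPL three-value assertions do; VERIFY0_SKETCH and fake_ddt_object_remove are hypothetical stand-ins, not the real macros:

/*
 * Companion sketch for the VERIFY0 change (hypothetical, not the real
 * SPL macro): the panic message now carries the value the expression
 * evaluated to, i.e. the errno returned by ddt_object_remove().
 */
#include <stdio.h>
#include <stdlib.h>

#define VERIFY0_SKETCH(expr)                                           \
	do {                                                           \
		long long _v = (long long)(expr);                      \
		if (_v != 0) {                                         \
			fprintf(stderr,                                \
			    "VERIFY0(%s) failed (returned %lld)\n"     \
			    "PANIC at %s:%d\n",                        \
			    #expr, _v, __FILE__, __LINE__);            \
			abort();        /* kernel: spl_panic() */     \
		}                                                      \
	} while (0)

/* Hypothetical stand-in that fails with a specific errno. */
static int
fake_ddt_object_remove(void)
{
	return (2);     /* ENOENT */
}

int
main(void)
{
	/* Panic output now reports "returned 2", pointing at ENOENT. */
	VERIFY0_SKETCH(fake_ddt_object_remove());
	return (0);
}

With the returned value in the panic message, the specific errno (e.g. ENOENT vs. EIO) would narrow down where ddt_object_remove() is failing.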
@behlendorf behlendorf modified the milestones: 0.8.0, 0.7.0 Mar 26, 2016
@behlendorf behlendorf removed this from the 0.8.0 milestone Feb 9, 2018
@loli10K
Contributor

loli10K commented Sep 11, 2018

Duplicate of #1681

@loli10K loli10K marked this as a duplicate of #1681 Sep 11, 2018
@loli10K loli10K closed this as completed Sep 11, 2018