btrfs send fails after non-simultaneous use of bees #115
Comments
OK, so send had two bugs after all. Are you willing to report this to the linux-btrfs mailing list? |
I've never posted to the mailing list before but I'm sure I can figure out how. What about the kernel bugzilla? I've posted there before. |
Kernel bugzilla is worth a shot. Best case, it's a simple fix. Worst case, it'll get ignored, and I'll forward it to the mailing list when I get a few spare cycles (possibly after a bit more analysis). |
Does this happen with the other |
It's a kernel bug, so it should affect all deduplicators on btrfs. |
So I tried the following:

$ mkfs.btrfs -f /dev/sdb
$ xfs_io -f -c "pwrite -S 0xab 0 1M" /mnt/foo
$ btrfs subvolume snapshot -r /mnt /mnt/snap1
$ xfs_io -f -c "pwrite -S 0xab 0 1M" /mnt/bar
$ btrfs subvolume snapshot -r /mnt /mnt/snap2
# deduplicate foo into bar, so that both point to the same extent(s)
# do the incremental send, see if it fails

dmesg/syslog is also clean. Applying the send streams to a filesystem also shows that both files are there, with correct content. Can you provide more details on how exactly the deduplication is being done? Full, just same extents, order, etc. Thanks. |
I'm not sure how |
Where is this temporary file created? It does seem plausible that this could be the cause for the send error. Or rather, is send correct that the ro snapshots did in fact change? |
@Gatak More or less "nowhere"... I think bees creates it in the root subvolume, acquires an open file descriptor for it, then immediately deletes it, and only then writes file data to the FD. So it's writing to an anonymous, btrfs-backed file. If the reasoning behind your question is whether the file is created in the RO snapshot: no, it isn't. @Zygo may know more, but deduping extents from the RO snapshot to this newly created file probably removes the original extents and thus "modifies" the snapshot (not the file contents but the extent structure). But I think exactly that should've been fixed in bees already by ignoring RO snapshots. |
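For anyone unfamiliar with the pattern described above, here is a minimal sketch of an "anonymous" file: create it, keep the descriptor, unlink the name, then write through the descriptor. This is not bees' actual code. The function name `open_anonymous_file` and the `anon-<pid>` file name are made up for illustration, and it runs against an ordinary temporary directory rather than a btrfs root subvolume.

```python
import os
import tempfile


def open_anonymous_file(directory: str) -> tuple[int, str]:
    """Create a file in `directory`, then unlink it while keeping the FD open.

    Returns the open file descriptor and the (now removed) path.
    The file name is hypothetical; any unique name would do.
    """
    path = os.path.join(directory, f"anon-{os.getpid()}")
    # O_EXCL makes the create fail if the name already exists.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    # Remove the directory entry; the data stays reachable through fd only.
    os.unlink(path)
    return fd, path


def demo() -> tuple[int, int, bool]:
    """Write 4 KiB of 0xab to an anonymous file and report what the kernel sees."""
    with tempfile.TemporaryDirectory() as directory:
        fd, path = open_anonymous_file(directory)
        try:
            os.write(fd, b"\xab" * 4096)
            st = os.fstat(fd)
            return st.st_size, st.st_nlink, os.path.exists(path)
        finally:
            os.close(fd)  # storage is released once the last FD is closed
```

The file has data but no name, so it never shows up in any directory listing, which is why the answer to "where" is "nowhere".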
Then space/duplicates taken by RO snapshots are also not considered or reduced, which was the point to start with, wasn't it? |
@Gatak, @kakra: Temporary files are created in the root subvol with To the snapshots, the bees temporary files are just files in another subvolume with a higher transid. They should have the same effect as the second pwrite. |
@fdmanana That script looks right, in the sense that bees does something similar, but I haven't reproduced this myself, and I don't think it's quite that simple. We probably need to set up a larger, more realistic test (e.g. copy /usr into a subvol instead of just one extent), run bees and send until it fails, then try to figure out what happened to the filesystem when the error is detected. |
So I managed to find out exactly how it happens. It's not trivial to reproduce and happens more or less randomly, so it's no wonder I have never hit it or had other user reports before. I'll send a fix soon (this week) to the btrfs mailing list. There is no need to use bees to trigger this. Thanks. |
That sounds familiar: https://www.spinics.net/lists/linux-btrfs/msg45113.html Oops, my bad? ;) |
Kernel fix queued for 5.3 and will appear in the stable trees. |
@kdave could you let us know when it should have appeared in stable trees? I'm waiting on this before trying bees to ensure btrbk backups continue to function. |
According to
it is in 5.2.7, 4.19.65, 4.14.137, and 4.9.188. |
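A quick way to check a machine against the versions listed above is sketched below. The helper names (`parse_release`, `has_send_fix`) are made up for illustration. The sketch assumes the fix is present in 5.3 and later, and in the four stable series only from the listed point releases. Any other series is reported as unfixed, which may be wrong for distribution kernels that carry backports.

```python
import re

# Stable series and the first point release carrying the fix, per the comment above.
FIXED_STABLE = {(5, 2): 7, (4, 19): 65, (4, 14): 137, (4, 9): 188}
# The fix was queued for 5.3, so 5.3 and later are assumed to contain it.
FIRST_FIXED_MAINLINE = (5, 3)


def parse_release(release: str) -> tuple[int, int, int]:
    """Parse 'major.minor[.patch][-suffix]' as printed by `uname -r`."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def has_send_fix(release: str) -> bool:
    """Return True if the release is known (from the list above) to have the fix."""
    major, minor, patch = parse_release(release)
    if (major, minor) >= FIRST_FIXED_MAINLINE:
        return True
    needed = FIXED_STABLE.get((major, minor))
    # Unknown series: assume unfixed (distribution backports are not detected).
    return needed is not None and patch >= needed
```

For example, `has_send_fix(platform.release())` would check the running kernel; the 5.1.8 kernel from the original report comes back as unfixed.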
@HaleTom - I'm just diving into btrbk and bees. I'm seeing this issue is still open, so I'm wondering if it worked for you? |
This was fixed in kernel 5.3, and the oldest usable kernel is 5.4, so it's time to close this issue. |
I ran bees on a filesystem containing a parent snapshot, then tried an incremental send from that snapshot. bees ran in between but not simultaneously (it was fully stopped before the send started). The send fails with the following in dmesg:
This is with kernel 5.1.8 and bees 0.6.1.