
VM disk corruption with Apple Silicon #1957

Closed · EdwardMoyse opened this issue Oct 27, 2023 · 50 comments

@EdwardMoyse

EdwardMoyse commented Oct 27, 2023

Tip

EDIT by @AkihiroSuda

For --vm-type=vz, this issue seems to have been solved in Lima v0.19 (#2026)


Description

Lima version: 0.18.0
macOS: 14.0 (23A344)
VM: Almalinux9

I was trying to do a big compile, using a VM with the attached configuration (vz)

NAME           STATUS     SSH                VMTYPE    ARCH       CPUS    MEMORY    DISK      DIR
myalma9        Running    127.0.0.1:49434    vz        aarch64    4       16GiB     100GiB    ~/.lima/myalma9

The build aborted with:

from /Volumes/Lima/build/build/AthenaExternals/src/Geant4/source/processes/hadronic/models/lend/src/xDataTOM_LegendreSeries.cc:7:
/usr/include/bits/types.h:142:10: fatal error: /usr/include/bits/time64.h: Input/output error

And afterwards, even in a different terminal, I see:

[emoyse@lima-myalma9 emoyse]$ ls
bash: /usr/bin/ls: Input/output error

I was also logged into a display, and there I saw e.g.

(screenshot of the VM display window, 2023-10-26 17:44)

If I try to log in again with:

limactl shell myalma9

each time I see something like the following appear in the display window:

[56247.642703] Core dump to |/usr/lib/systemd/systemd-coredump pipe failed

Edit: there has been a lot of discussion below; the corruption can happen with both vz and qemu, and on disks both external and internal to the VM. Some permutations seem more likely to provoke corruption than others. I have summarised my experiments in the table in a comment below.

@EdwardMoyse
Author

In case it is relevant, I was compiling in a separate APFS (Case-sensitive) Volume as described here. This volume seems absolutely fine, so the corruption seems limited to the VM itself. I can't see how this could have happened with 100 GB, but I wonder if it's possible that the VM ran out of space? I could try increasing the disk size, but the whole point of using an external volume was that this would not be necessary.

@AkihiroSuda
Member

  • Is this specific to AlmaLinux 9?
  • Is this specific to --vm-type=vz?
  • Do you see some meaningful message in dmesg?

@EdwardMoyse
Author

Hmm. I just tried again, but compiling in /tmp rather than the case-sensitive volume, and this worked fine. A colleague has confirmed a similar experience - problems with /Volumes/Lima, but it works fine in /tmp. So my best guess right now is that it is some interaction between an APFS Volume and Lima (which might also explain the following "stuck VM" discussion: #1666)

Answering your other questions:

  • I'm not sure if it is specific to AlmaLinux 9, since my use-case requires that particular OS. But I will try doing a big compile on a different OS, in a Volume, to see if I can replicate it.
  • We have not replicated it with qemu, but qemu is so slow that this is very hard to do. I will try.
  • I will also try again with dmesg running.

@afbjorklund

This comment was marked as off-topic.

@afbjorklund

This comment was marked as off-topic.

@EdwardMoyse
Author

EdwardMoyse commented Oct 30, 2023

EDIT: One potential feature could be to be able to create disk images on an attached disk, instead of under LIMA_HOME.
You can probably use symlinks from _disks as a workaround, but would be better with some optional flag support...

This would, I think, really help us.

Our use-case is this: we want to be able to edit files from within macOS, but then compile inside AlmaLinux 9. The codebase we are compiling is relatively large (>4 million lines of C++) and can take up to 400 GB of temporary compilation space. I was reluctant to make separate VMs with this much local storage, especially since a lot of us will be working on laptops. Ideally we would have a large build area (possibly on an external drive), accessible from several VMs, and with very fast disk I/O from the VM (since otherwise the build time becomes unusably slow). We do NOT, in general, need to be able to access this build area from the host (at least, not with fast I/O - it would mainly be to examine compilation failures).

(I will get back to the other tests shortly - but I'm currently travelling with limited work time, and it seems very likely that the issue is related to compiling outside the VM)

@AkihiroSuda
Member

(I will get back to the other tests shortly - but I'm currently travelling with limited work time, and it seems very likely that the issue is related to compiling outside the VM)

I'm not sure how virtiofs affects the XFS disk, but maybe this issue should be reported to Apple?

@afbjorklund
Member

afbjorklund commented Oct 30, 2023

I was under the impression that the problem was with the /Volumes/Lima mount, but the logs say vda2...

  - location: /Volumes/Lima
    writable: true

So the remote filesystem is a separate topic* from this ARM64 disk corruption. Sorry for the added noise.

Though I don't see how switching from remote /Volumes/Lima to local /tmp could have helped, then...


* should continue in a different discussion

Note that disk images cannot be shared...
(they can be unplugged and remounted)

@AkihiroSuda
Member

Is this relevant?

(UTM uses vz too)

Looks like people began to hit this issue in September, so I wonder if Apple introduced a regression around that time?

I still can't repro the issue locally though.
(macOS 14.1 on Intel MacBookPro 2020, macOS 13.5.2 on EC2 mac2-m2pro)

@AkihiroSuda
Member

Can anybody confirm this rumor?

utmapp/UTM#4840 (comment)

Is it me or deactivating ballooning solves the problem?
I've deactivated it two weeks ago, and no problem since on my side.

Removing these lines will disable ballooning:

lima/pkg/vz/vm_darwin.go

Lines 598 to 604 in 7cb2b2e

configuration, err := vz.NewVirtioTraditionalMemoryBalloonDeviceConfiguration()
if err != nil {
    return err
}
vmConfig.SetMemoryBalloonDevicesVirtualMachineConfiguration([]vz.MemoryBalloonDeviceConfiguration{
    configuration,
})
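For anyone wanting to try this, here is a minimal sketch of what "removing these lines" amounts to as a local patch; the enableBalloon guard is a hypothetical local flag, not an existing Lima option:

if enableBalloon { // hypothetical flag; when false, no balloon device is attached at all
    configuration, err := vz.NewVirtioTraditionalMemoryBalloonDeviceConfiguration()
    if err != nil {
        return err
    }
    vmConfig.SetMemoryBalloonDevicesVirtualMachineConfiguration([]vz.MemoryBalloonDeviceConfiguration{
        configuration,
    })
}

Leaving the VM without a balloon device only removes one variable; as the comments below show, it did not stop the corruption.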

@wdormann

wdormann commented Oct 30, 2023

For what it's worth, I believe I've narrowed down the problem that I've noticed in utmapp/UTM#4840 to having used an external SSD drive. I've not reproduced the corruption if the VM lives on my Mac's internal storage.

@EdwardMoyse Your separate APFS volume... is it on the same storage device that your Mac runs on, or is it a separate external device?

@AkihiroSuda I've not seen disabling the Balloon device help with preventing corruption. At least, if I'm working with a QEMU-based VM that lives on my external SSD storage, it has the Balloon Device un-checked by default, and the VM's filesystem will still eventually corrupt under heavy disk load. So I believe this is a red herring.
(screenshot of the VM settings with "Balloon Device" un-checked, 2023-10-30)

@AkihiroSuda
Member

AkihiroSuda commented Oct 30, 2023

I'm working with a QEMU-based VM

Probably you are hitting a different issue with a similar symptom?

@EdwardMoyse
Author

@wdormann my APFS Volume is on the same device (SSD) as macOS. It's not an external device in my case.

@wdormann

Thanks for the input. I've been testing the disk itself, and it has yet to report errors.
Given your successful test in /tmp, these both seem to point to a problem using a non-OS volume for the underlying VM OS storage?

@AkihiroSuda
Member

AkihiroSuda commented Oct 30, 2023

I think I reproduced the issue with the default Ubuntu template:

[  299.527200] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3793: comm apport: iget: checksum invalid
[  299.527255] Aborting journal on device vda1-8.
[  299.527293] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm cp: Detected aborted journal
[  299.528985] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[  299.530464] EXT4-fs (vda1): Remounting filesystem read-only
[  299.530515] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3794: comm apport: iget: checksum invalid
[  299.535137] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3795: comm apport: iget: checksum invalid
[  299.538878] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3796: comm apport: iget: checksum invalid
[  299.543827] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3797: comm apport: iget: checksum invalid
[  299.550614] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3798: comm apport: iget: checksum invalid
[  299.551947] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3799: comm apport: iget: checksum invalid
[  299.553651] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3800: comm apport: iget: checksum invalid
[  299.821872] audit: type=1131 audit(1698675832.913:271): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
[  299.821967] BUG: Bad rss-counter state mm:0000000013fa5858 type:MM_FILEPAGES val:43
[  299.821980] BUG: Bad rss-counter state mm:0000000013fa5858 type:MM_ANONPAGES val:3
[  299.821982] BUG: non-zero pgtables_bytes on freeing mm: 4096
[  299.822551] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
[  299.822566] Mem abort info:
[  299.822566]   ESR = 0x0000000096000004
[  299.822568]   EC = 0x25: DABT (current EL), IL = 32 bits
[  299.822569]   SET = 0, FnV = 0
[  299.822570]   EA = 0, S1PTW = 0
[  299.822570]   FSC = 0x04: level 0 translation fault
[  299.822571] Data abort info:
[  299.822572]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  299.822573]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  299.822574]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  299.822575] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100970000
[  299.822576] [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
[  299.822604] Internal error: Oops: 0000000096000004 [#1] SMP
[  299.822615] Modules linked in: tls nft_chain_nat overlay xt_tcpudp xt_nat xt_multiport xt_mark xt_conntrack xt_comment xt_addrtype xt_MASQUERADE nf_tables nfnetlink ip6table_filter iptable_filter ip6table_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6_tables veth bridge stp llc tap isofs binfmt_misc nls_iso8859_1 vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock virtiofs joydev input_leds drm 
[  299.822800] Unable to handle kernel paging request at virtual address fffffffffffffff8
[  299.822805] Mem abort info:
[  299.822805]   ESR = 0x0000000096000004
[  299.822806]   EC = 0x25: DABT (current EL), IL = 32 bits
[  299.822807]   SET = 0, FnV = 0
[  299.822808]   EA = 0, S1PTW = 0
[  299.822809]   FSC = 0x04: level 0 translation fault
[  299.822810] Data abort info:
[  299.822810]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  299.822811]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  299.822812]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  299.822813] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000864e50000
[  299.822814] [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
[  361.102020] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  361.102094] rcu:     1-...0: (1 GPs behind) idle=e0b4/1/0x4000000000000000 softirq=23608/23609 fqs=6997
[  361.102102] rcu:              hardirqs   softirqs   csw/system
[  361.102103] rcu:      number:        0          0            0
[  361.102104] rcu:     cputime:        0          0            0   ==> 30000(ms)
[  361.102105] rcu:     (detected by 3, t=15002 jiffies, g=38213, q=860 ncpus=4)
[  361.102107] Task dump for CPU 1:
[  361.102108] task:systemd         state:S stack:0     pid:1     ppid:0      flags:0x00000002
[  361.102111] Call trace:
[  361.102118]  __switch_to+0xc0/0x108
[  361.102180]  seccomp_filter_release+0x40/0x78
[  361.102203]  release_task+0xf0/0x238
[  361.102216]  wait_task_zombie+0x124/0x5c8
[  361.102218]  wait_consider_task+0x244/0x3c0
[  361.102220]  do_wait+0x178/0x338
[  361.102222]  kernel_waitid+0x100/0x1e8
[  361.102224]  __do_sys_waitid+0x2bc/0x378
[  361.102226]  __arm64_sys_waitid+0x34/0x60
[  361.102228]  invoke_syscall+0x7c/0x128
[  361.102230]  el0_svc_common.constprop.0+0x5c/0x168
[  361.102231]  do_el0_svc+0x38/0x68
[  361.102232]  el0_svc+0x30/0xe0
[  361.102234]  el0t_64_sync_handler+0x148/0x158
[  361.102236]  el0t_64_sync+0x1b0/0x1b8
[  541.118359] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  541.118368] rcu:     1-...0: (1 GPs behind) idle=e0b4/1/0x4000000000000000 softirq=23608/23609 fqs=27191
[  541.118371] rcu:              hardirqs   softirqs   csw/system
[  541.118372] rcu:      number:        0          0            0
[  541.118373] rcu:     cputime:        0          0            0   ==> 210020(ms)
[  541.118375] rcu:     (detected by 3, t=60007 jiffies, g=38213, q=1790 ncpus=4)
[  541.118377] Task dump for CPU 1:
[  541.118379] task:systemd         state:S stack:0     pid:1     ppid:0      flags:0x00000002
[  541.118382] Call trace:
[  541.118383]  __switch_to+0xc0/0x108
[  541.118390]  seccomp_filter_release+0x40/0x78
[  541.118393]  release_task+0xf0/0x238
[  541.118396]  wait_task_zombie+0x124/0x5c8
[  541.118399]  wait_consider_task+0x244/0x3c0
[  541.118401]  do_wait+0x178/0x338
[  541.118403]  kernel_waitid+0x100/0x1e8
[  541.118405]  __do_sys_waitid+0x2bc/0x378
[  541.118407]  __arm64_sys_waitid+0x34/0x60
[  541.118409]  invoke_syscall+0x7c/0x128
[  541.118411]  el0_svc_common.constprop.0+0x5c/0x168
[  541.118412]  do_el0_svc+0x38/0x68
[  541.118413]  el0_svc+0x30/0xe0
[  541.118415]  el0t_64_sync_handler+0x148/0x158
[  541.118417]  el0t_64_sync+0x1b0/0x1b8

(Non-minimal, non-deterministic) repro steps:

  • Create a mac2-m2pro (32GB RAM) instance on EC2, with macOS 13.5.2 AMI, and a gp2 EBS volume
  • Install Lima v0.18.0
  • Run limactl start --vm-type=vz --cpus=4 --memory=32 --disk=100 --name=vm1
  • Run limactl start --vm-type=vz --cpus=4 --memory=32 --disk=100 --name=vm2
  • For each of the VMs, run cp -a /Users/ec2-user/some-large-directory ~.
    Some of them may fail with cp: ...: Read-only filesystem

Filesystems:

% mount
/dev/disk5s2s1 on / (apfs, sealed, local, read-only, journaled)
devfs on /dev (devfs, local, nobrowse)
/dev/disk5s5 on /System/Volumes/VM (apfs, local, noexec, journaled, noatime, nobrowse)
/dev/disk5s3 on /System/Volumes/Preboot (apfs, local, journaled, nobrowse)
/dev/disk1s2 on /System/Volumes/xarts (apfs, local, noexec, journaled, noatime, nobrowse)
/dev/disk1s1 on /System/Volumes/iSCPreboot (apfs, local, journaled, nobrowse)
/dev/disk1s3 on /System/Volumes/Hardware (apfs, local, journaled, nobrowse)
/dev/disk5s1 on /System/Volumes/Data (apfs, local, journaled, nobrowse)
map auto_home on /System/Volumes/Data/home (autofs, automounted, nobrowse)
/dev/disk3s4 on /private/tmp/tmp-mount-mDoJ7V (apfs, local, journaled, nobrowse)

% stat -f %Sd / 
disk5s1

% stat -f %Sd /Users/ec2-user/.lima         
disk5s1

The VM disk is located in the default path ~/.lima.

@AkihiroSuda
Member

Tried to remove the balloon, but the filesystem still seems to break intermittently

[ 1674.027587] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35601: comm apport: iget: checksum invalid
[ 1674.030317] Aborting journal on device vda1-8.
[ 1674.031818] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[ 1674.031896] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[ 1674.033116] EXT4-fs (vda1): Remounting filesystem read-only
[ 1674.033147] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35602: comm apport: iget: checksum invalid
[ 1674.036501] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35603: comm apport: iget: checksum invalid
[ 1674.037738] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35604: comm apport: iget: checksum invalid
[ 1674.038828] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35605: comm apport: iget: checksum invalid
[ 1674.040034] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35606: comm apport: iget: checksum invalid
[ 1674.041091] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35606: comm apport: iget: checksum invalid
[ 1674.042199] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35606: comm apport: iget: checksum invalid

@AkihiroSuda changed the title from "Apparent disk corruption with almalinux9" to "Apparent disk corruption with vz" on Oct 30, 2023
@AkihiroSuda added the "help wanted" (extra attention is needed) label on Oct 30, 2023
@EdwardMoyse
Author

Thanks for the input. I've been testing the disk itself, and it has yet to report errors. Given your successful test in /tmp, these both seem to point to a problem using a non-OS volume for the underlying VM OS storage?

I might be misunderstanding you, but I don't think I am using a "non-OS volume for the underlying VM OS storage".

For clarity, here is my setup:

  • I create a VM using the standard limactl start almalinux9.yaml --name=alma9, and the VM exists on the main macOS volume.
  • I create a separate APFS (Case-sensitive) Volume, and make it mountable from within the VM:
    - location: /Volumes/Lima
      writable: true
  • If I compile our software in /Volumes/Lima I get disk corruption; if I use /tmp for the same operation, it works fine.

So I would characterise this as rather a problem with using a non-OS volume for intensive disk operations from within the VM.

@wdormann

I'll admit I'm not familiar with Lima.
When you say "make it mountable from within the VM", what does that mean?

  • You have a virtual hard disk file that lives on that separate APFS volume, and your VM is configured to have that as a second disk drive?
  • You boot the VM, and somehow from Linux user/kernel land mount your /Volumes/Lima directory? (How?)
  • Something else?

Perhaps Lima does this all for you under the hood, but I suppose that I'd need to know exactly what it's doing to have any hope of understanding what's going on.

@EdwardMoyse
Author

I'll admit I'm not familiar with Lima. When you say "make it mountable from within the VM", what does that mean?

  • You have a virtual hard disk file that lives on that separate APFS volume, and your VM is configured to have that as a second disk drive?
  • You boot the VM, and somehow from Linux user/kernel land mount your /Volumes/Lima directory? (How?)

It's the latter (but I cannot tell you the technicalities of how it works). From within both the host and the VM I can access /Volumes/Lima. See https://lima-vm.io/docs/config/mount/

@wdormann

Do you specify a mount type in your limactl command line and/or config file?
Or, from the VM, what does the mount command report for the filesystem in question?

@EdwardMoyse
Author

EdwardMoyse commented Nov 15, 2023

My apologies for the delay in replying, but I have been looking into this. The workflow is the same - compile https://gitlab.cern.ch/atlas/atlasexternals using the attached template with various configurations of host, qemu/vz, cores and memory.

TL;DR: updating to kernel 6.5.10-1 was more stable on the M2 (even on the 'shared' volume /tmp/lima), but apparently worse on the M1 Pro (though the M1 Pro has more cores and we pushed it a lot harder). Updating to 6.6.1 was better on the M1 Pro (I have not tested the M2 yet), but I got XFS corruption at the very end.

With 6.6.1 I also disabled sleeping on the guest:

sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target

(from hint here)

| VM Type | Kernel   | Cores | RAM (GB) | Where              | Attempt 1     | Attempt 2   | Attempt 3   | Host Processor |
|---------|----------|-------|----------|--------------------|---------------|-------------|-------------|----------------|
| qemu    | 5.14     | 6     | 24       | /tmp               | Crash + xfs   | Crash + xfs | Crash + xfs | M1 Pro         |
| vz      | 5.14     | 6     | 24       | /Volumes/Lima      | Crash + xfs   |             |             | M1 Pro         |
| vz      | 5.14     | 6     | 24       | /tmp               | OK            |             |             | M1 Pro         |
| qemu    | 5.6.10.1 | 6     | 24       | /tmp               | OK (but slow) |             |             | M1 Pro         |
| vz      | 5.6.10.1 | 6     | 24       | /Volumes/Lima      | Crash + xfs   |             |             | M1 Pro         |
| vz      | 5.6.10.1 | 6     | 24       | /tmp               | Crash a       | Crash b     |             | M1 Pro         |
| vz      | 6.6.1    | 6     | 24       | /tmp               | xfs           |             |             | M1 Pro         |
| vz      | 6.6.2-1  | 4     | 12       | /home/emoyse.linux | xfs           |             |             | M1 Pro         |

Notes:

  • xfs means xfs corruption was reported.
  • Once xfs corruption has occurred, I trash the VM and restart
  • Often a crash is preceded in dmesg by e.g. "hrtimer: interrupt took 32332585ns"
  • crash a: in /var/log/messages I see:
[  978.306216] BUG: Bad rss-counter state mm:0000000076c5940f type:MM_FILEPAGES val:402
[  978.306776] BUG: Bad rss-counter state mm:0000000076c5940f type:MM_ANONPAGES val:206
[  978.307142] BUG: non-zero pgtables_bytes on freeing mm: 69632
[  +0.011695] BUG: Bad rss-counter state mm:0000000076c5940f type:MM_FILEPAGES val:402
  • crash b I see:
Nov 7 16:44:19 lima-myalma92 kernel: BUG: workqueue lockup - pool cpus=5 node=0 flags=0x0 nice=0 stuck for 2164s!
Nov 7 16:44:19 lima-myalma92 kernel: Showing busy workqueues and worker pools:
Nov 7 16:44:19 lima-myalma92 kernel: workqueue events: flags=0x0
Nov 7 16:44:19 lima-myalma92 kernel: pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
Nov 7 16:44:19 lima-myalma92 kernel: pending: drm_fb_helper_damage_work [drm_kms_helper]
Nov 7 16:44:19 lima-myalma92 kernel: workqueue mm_percpu_wq: flags=0x8
Nov 7 16:44:19 lima-myalma92 kernel: pwq 10: cpus=5 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
Nov 7 16:44:19 lima-myalma92 kernel: pending: vmstat_update
  • for the last run with 6.6.1 it all completed fine and looked great, but then I got:
[emoyse@lima-alma9661c6 tmp]$ ls
bash: /usr/bin/ls: Input/output error

And in the display I see:
(screenshot of the VM display attached)

@wdormann

FWIW, I've added some test results and comments here: utmapp/UTM#4840 (comment)

I've not ruled out that there is some issue with the macOS filesystem/hypervisor layer, but I've only seen corruption with a Linux VM, and not with macOS or Windows doing the exact same thing from the exact same VM disk backing. What is interesting to me is that if I take the exact same disk and reformat it as APFS instead of ExFAT, Linux 6.5.6 or 6.4.15 will not experience disk corruption. My theory is that given an unfortunate combination of speed/latency/something-else for the disk backing, a Linux VM might experience disk corruption.

@AkihiroSuda
Member

My theory is that given an unfortunate combination of speed/latency/something-else for disk backing, a Linux VM might experience disk corruption.

Could you submit your insight to Apple? Probably via https://www.apple.com/feedback/macos.html

@wdormann

wdormann commented Nov 17, 2023

I have, just to hedge my bets.
However, if Windows, macOS, and (I just recently tested) FreeBSD all work flawlessly under the exact same workload, using the same host disk backing, and only Linux has a problem, I'd say that this is a Linux problem, not an Apple one.
(screenshot, 2023-11-18 08:00)

@afbjorklund
Member

afbjorklund commented Nov 19, 2023

I can trigger filesystem corruption if my external disk is formatted with ExFAT

Oh, so that might be why it is mostly affecting external disks? Did people forget to (re-)format them before using them?

EDIT: no, not so simple

"I create a separate APFS (Case-sensitive) Volume,"

@EdwardMoyse
Author

I can trigger filesystem corruption if my external disk is formatted with ExFAT

Oh, so that might be why it is mostly affecting external disks? Did people forget to (re-)format them before using them?

EDIT: no, not so simple

"I create a separate APFS (Case-sensitive) Volume,"

And for me, I'm not using disks external to the VM any more - if you look at the table I posted here, you will see that in the "Where" column I'm mostly working in /tmp, i.e. completely inside the VM. Using an external disk might provoke the corruption earlier, but it's certainly not the only route to it (though later kernels seem quite a bit more stable).

@hasan4791
Contributor

hasan4791 commented Nov 20, 2023

In my case it occurs with the internal disk, and very frequently with Fedora images. Just create a Fedora VM and do dnf update; corruption happens immediately.
btrfs scrub start /

EDIT: vz in my case

@wdormann

Using an external disk might provoke the corruption earlier, but it's certainly not the only route to it (though later kernels seem quite a bit more stable).

I don't recall if I mentioned it here, but through eliminating variables I was able to pinpoint a configuration that is likely to corrupt older Linux kernels: having the VM hosted on an ExFAT-formatted partition (which just happens to be on an external disk for me). Based on how macOS/APFS works, I don't think it's even possible for me to test how ExFAT might perform on my internal disk - at least not without major reconfiguration of my system drive.

If others are able to reproduce the disk corruption without relying on ExFAT at the host level, that at least helps eliminate the ExFAT-layer possibility of where the problem lies. At least for me, I've been able to avoid the problem by reformatting my external disk to APFS, as that seems to tweak at least one of the required variables to see this bug happen. At least if the Linux kernel version is new enough.

At a conceptual level, it is indeed possible that Linux is doing nothing wrong at all. In other words, Linux may just happen to be unlucky enough to express the disk usage patterns that trigger a bug which presents as a corrupted (BTRFS in my case) file system. But I suspect that being able to positively distinguish between a rarely-seen Linux data-corruption bug and a bug at the macOS hypervisor/storage level is probably beyond my skill set.

@wdormann

OK, just to throw a wrench into the works: I did notice my FreeBSD VM eventually experiencing disk corruption, but only after about a day or so of running the stress test, as opposed to the minute or two that it takes for Linux to corrupt itself.
(screenshot of the FreeBSD VM reporting disk errors, 2023-11-20)

The same VM clone but running from an APFS filesystem seems fine:
(screenshot of the same VM clone running from APFS with no errors, 2023-11-20)

@mbentley

mbentley commented Nov 21, 2023

So it seems like there are a lot of references to issues related to external disks and non-APFS filesystems. I am using the internal disk on my M2 mini with the default APFS filesystem, and I've experienced disk corruption once. I haven't specifically been able to reproduce it (to be honest, I haven't tried very hard), but I did want to point out that external disks and other filesystems may not be the specific cause - they may just make the corruption easier to trigger than internal APFS does.

I run Debian Bookworm, and after repairing the filesystem with fsck I also upgraded my kernel from linux-image-cloud-arm64 6.1.55-1 to 6.5.3-1~bpo12+1 from backports.

@afbjorklund
Member

The above table also lists corruption when running with qemu/hvf, so it might not even be unique to vz...

@EdwardMoyse
Author

It is not unique to vz, and it is not unique to external disks.

With Almalinux 9.2 + kernel 6.6.2-1 I just got corruption with sudo yum update -y

:-(

@EdwardMoyse changed the title from "Apparent disk corruption with vz" to "VM disk corruption" on Nov 22, 2023
@EdwardMoyse
Author

Okay, I updated the title and the original comment to hopefully clarify that this is a problem with every conceivable permutation of lima.

Unfortunately for me Lima is completely unusable at the moment, so for now I'm giving up.

@wpiekutowski

I can reproduce this with two methods: stress-ng --iomix 4 (for filesystems with data checksums), and parallel cp of big files followed by sha256sum *. Details: utmapp/UTM#4840 (comment)

Are you able to reproduce this as well?
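For reference, here is a rough, self-contained Go sketch of the second method (this is not @wpiekutowski's actual script; the file count and sizes are arbitrary). It writes several large files in parallel with reproducible pseudo-random content, then re-reads them and verifies their SHA-256 sums, which is roughly what parallel cp plus sha256sum * exercises:

package main

import (
    "crypto/sha256"
    "fmt"
    "io"
    "math/rand"
    "os"
    "sync"
)

const (
    numFiles = 8       // number of files written in parallel (arbitrary)
    fileSize = 1 << 30 // 1 GiB each; adjust to the free space in the guest
    bufSize  = 1 << 20 // write in 1 MiB chunks
)

// writeFile fills path with deterministic pseudo-random data and returns its SHA-256.
func writeFile(path string, seed int64) (string, error) {
    f, err := os.Create(path)
    if err != nil {
        return "", err
    }
    defer f.Close()
    h := sha256.New()
    r := rand.New(rand.NewSource(seed))
    buf := make([]byte, bufSize)
    for written := 0; written < fileSize; written += bufSize {
        r.Read(buf) // always fills buf
        h.Write(buf)
        if _, err := f.Write(buf); err != nil {
            return "", err
        }
    }
    if err := f.Sync(); err != nil {
        return "", err
    }
    return fmt.Sprintf("%x", h.Sum(nil)), nil
}

// hashFile re-reads path from disk and returns its SHA-256.
func hashFile(path string) (string, error) {
    f, err := os.Open(path)
    if err != nil {
        return "", err
    }
    defer f.Close()
    h := sha256.New()
    if _, err := io.Copy(h, f); err != nil {
        return "", err
    }
    return fmt.Sprintf("%x", h.Sum(nil)), nil
}

func main() {
    var wg sync.WaitGroup
    for i := 0; i < numFiles; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            path := fmt.Sprintf("big-%d.bin", i)
            want, err := writeFile(path, int64(i))
            if err != nil {
                fmt.Println(path, "write error:", err) // I/O errors here match the reported symptom
                return
            }
            got, err := hashFile(path)
            switch {
            case err != nil:
                fmt.Println(path, "read error:", err)
            case got != want:
                fmt.Println(path, "CHECKSUM MISMATCH")
            default:
                fmt.Println(path, "OK")
            }
        }(i)
    }
    wg.Wait()
}

Note that the immediate re-read is likely to be served from the guest page cache, so re-running the read pass after dropping caches (or after a reboot) exercises the on-disk data more directly.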

@afbjorklund
Member

Okay, I updated the title and the original comment to hopefully clarify that this is a problem with every conceivable permutation of lima.

It still seems to be unique to one operating system and one hardware architecture, though? Maybe even Apple's issue.

@EdwardMoyse
Author

Okay, I updated the title and the original comment to hopefully clarify that this is a problem with every conceivable permutation of lima.

It still seems to be unique to one operating system and one hardware architecture, though? Maybe even Apple's issue.

Sorry, yes. I was being very single-minded in my statement above! I will rephrase the title.

@EdwardMoyse changed the title from "VM disk corruption" to "VM disk corruption with Apple Silicon" on Nov 22, 2023
@AkihiroSuda
Member

AkihiroSuda commented Nov 22, 2023

The above table also lists corruption when running with qemu/hvf, so it might not even be unique to vz...

This issue might be worth reporting to https://gitlab.com/qemu-project/qemu/-/issues too, if the issue is reproducible with bare QEMU (without using Lima)

@wdormann

At the risk of further fragmenting the discussion of this issue, but with the potential benefit of getting the right eyeballs on it, I've filed: https://gitlab.com/qemu-project/qemu/-/issues/1997

(i.e., yes this can be reproduced with QEMU, as opposed to the Apple Hypervisor Framework)

@AkihiroSuda
Member

This may fix the issue for vz:

(Thanks to @wpiekutowski utmapp/UTM#4840 (comment) and @wdormann utmapp/UTM#4840 (comment))
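For context, the direction suggested in those comments (and, as far as I understand, what the later v0.19 change for --vm-type=vz does) is to attach the disk image with an explicit caching/synchronization mode instead of the defaults. Below is a rough sketch of that idea against the Code-Hex/vz Go bindings; the constructor name and constants are my recollection rather than a verbatim copy of the actual patch, and attachRootDisk/diskPath are hypothetical names:

// Sketch only; assumes: import vz "github.com/Code-Hex/vz/v3"
// attachRootDisk builds a virtio-blk device for diskPath with fsync-level
// synchronization and installs it on vmConfig, so that guest flushes reach the host disk.
func attachRootDisk(vmConfig *vz.VirtualMachineConfiguration, diskPath string) error {
    attachment, err := vz.NewDiskImageStorageDeviceAttachmentWithCacheAndSync(
        diskPath,
        false, // not read-only
        vz.DiskImageCachingModeAutomatic,
        vz.DiskImageSynchronizationModeFsync, // flush guest writes with fsync instead of leaving them cached
    )
    if err != nil {
        return err
    }
    blockDev, err := vz.NewVirtioBlockDeviceConfiguration(attachment)
    if err != nil {
        return err
    }
    vmConfig.SetStorageDevicesVirtualMachineConfiguration([]vz.StorageDeviceConfiguration{blockDev})
    return nil
}

Trading some write throughput for fsync-level durability would also be consistent with the observation below that this is more of a workaround than a complete fix.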

@EdwardMoyse
Author

Oh wow - I've run my test twice with the patched version of lima and no corruption or crashes! From reading the ticket, it's more a workaround than a complete fix, but I'll happily take it! Thanks @AkihiroSuda
