(da0:mpt0:0:0:0): UNMAP. CDB: 42 00 00 00 00 00 00 02 68 00
(da0:mpt0:0:0:0): CAM status: SCSI Status Error
(da0:mpt0:0:0:0): SCSI status: Busy
(da0:mpt0:0:0:0): Retrying command
After reading through a bunch of Google hits (mostly from FreeNAS users) that didn't totally add up for me, I have a theory on the actual cause of the issue. In short, I think it occurs when all of the following conditions are present:
- VMware vSphere 6.5 host, which supports both VMFS5 and VMFS6.
- A FreeBSD guest, using...
- A virtual disk that is thinly-provisioned, and stored on a VMFS5 filesystem.
VMFS6 supports the use of the UNMAP command, which allows the guest operating system to inform the hypervisor that it is no longer using a block. When the virtual disk is thin-provisioned, the host can reallocate the block to the pool of available disk space. FreeBSD has included support for this SCSI command for some time.
My theory is this. I think ESXi is lying to the guests. I think that when a thin-provisioned disk is created on a VMFS5 filesystem, the UNMAP command is still exposed/permitted, even though the underlying filesystem doesn't actually support it. When the guest tries to send the UNMAP command, it gets a bogus response. In the case of FreeBSD, it retries the command perpetually, hanging the system.
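If you want to check whether a guest is hitting this pattern, a rough sketch like the one below can scan kernel log text for UNMAP commands that came back with a Busy status. It assumes one CAM message per line in the same format as the excerpt at the top of this post; the function name `count_busy_unmaps` is made up for illustration, and I have not tested it against other FreeBSD releases or controllers.

```python
import re
from collections import Counter

# Matches CAM messages such as "(da0:mpt0:0:0:0): UNMAP. CDB: 42 00 ...".
# The format is assumed from the log excerpt at the top of this post.
CAM_LINE = re.compile(r"\((?P<dev>[a-z]+\d+):[^)]*\):\s*(?P<msg>.*)")


def count_busy_unmaps(log_text):
    """Count, per device, UNMAP commands that were followed by 'SCSI status: Busy'."""
    pending = set()      # devices whose most recently reported command was an UNMAP
    counts = Counter()
    for line in log_text.splitlines():
        m = CAM_LINE.search(line)
        if not m:
            continue
        dev, msg = m.group("dev"), m.group("msg")
        if "CDB:" in msg:
            # A new command report starts here; remember whether it was an UNMAP.
            if msg.startswith("UNMAP"):
                pending.add(dev)
            else:
                pending.discard(dev)
        elif msg.startswith("SCSI status: Busy") and dev in pending:
            counts[dev] += 1
            pending.discard(dev)
    return dict(counts)
```

Feeding it the contents of the system log (for example /var/log/messages, if that is where your kernel messages end up) gives a per-device count. A steadily growing count for one disk would be consistent with the retry loop described above, though it does not prove the cause.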
The search hits I found (linked above) mention switching the virtual disk to a SATA/IDE bus as a workaround. I suspect that this works because the UNMAP command does not exist on those buses, preventing the issue from occurring. I believe that the following solutions are less hacky.
- When creating virtual disks for FreeBSD on VMFS5 filesystems, always thick-provision them (I use eager zeroing; I haven't tested lazy zeroing). This seems to be the one-size-fits-all solution. It's also worth noting that the ESXi installer uses VMFS5 on the system disk, with no apparent way to use VMFS6.
- If you must use thin-provisioning, make sure that it is on a VMFS6 filesystem. I have not tested this extensively, but it seems to work.
- Thin-provisioned guests also seem to behave normally on NFS-backed storage.
In my testing, I have not had any issues on guests provisioned per the first option above (thick provisioning on VMFS5).
I'm having this problem. The thing is, though, my system has a thinly-provisioned disk on VMFS6 and I'm still seeing it. I'm going to try to switch to SATA to see what happens...