QEMU unable to handle NVME with 4k block size #1375
Comments
I spent a couple of hours trying to find a good way to handle this; in short, there isn't one. I tried a variety of QEMU options to have it consume a 4k block device on the host and present it to the guest as a device with 512-byte sectors, but that doesn't appear to work, possibly because of direct I/O. So I ended up having to go for the big hammer and put in a check which causes image unpacks on devices without 512-byte sectors to fail.
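For anyone wanting to see which side of that check a given device falls on, here is a rough sketch that reads the logical sector size the kernel exposes in sysfs. This is not the check that was added to Incus; the function names and the optional sysfs-root argument are made up for illustration.

```shell
#!/bin/sh
# Print the logical sector size (in bytes) that the kernel reports for a block device.
# $1: kernel device name, e.g. nvme0n1 or dm-3 (illustrative names)
# $2: optional sysfs root, defaults to /sys (hypothetical parameter, handy for testing)
logical_sector_size() {
    cat "${2:-/sys}/block/$1/queue/logical_block_size"
}

# Succeed only when the device uses 512-byte logical sectors.
# Returns 2 if the size could not be read at all.
has_512_byte_sectors() {
    size=$(logical_sector_size "$@") || return 2
    [ "$size" -eq 512 ]
}
```

Something like `has_512_byte_sectors nvme0n1 || echo "VM image unpack would be refused"` is how it might be used on a host.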
Closes lxc#1375 Signed-off-by: Stéphane Graber <[email protected]>
Closes #1375 Signed-off-by: Stéphane Graber <[email protected]>
@stgraber I happen to be trying to run Incus on an NVMe drive that only offers a 4096-byte sector size; 512 isn't listed by the nvme list command shown in the forum link above, so I can't easily configure it to use 512-byte sectors. I went down several rabbit holes and considered putting my LVM physical volume in a loop device pointing to the physical partition, because I can set the sector size on a loop device to whatever I want, but I wanted a real solution. I finally discovered that if I delete /usr/bin/sgdisk from my host system, grab a fresh image directly from the Linux Containers image registry, and spin up the VM, it just magically works. Running gdisk inside the VM shows that even though the physical disks have a 4096-byte sector size, the virtual disk (virtio-scsi at least) appears in QEMU with a 512-byte sector size and all's happy:
After more digging, it appears that this line is the culprit:
I haven't tried it yet, but one solution might be to use losetup to create a temporary loop device with a 512-byte sector size, have sgdisk act on the loop device, and then delete the loop device.
Update: yep, this is definitely it. With sgdisk deleted completely, there is a warning during boot that the GPT partition table isn't quite right. If I instead move /usr/bin/sgdisk to /root/sgdisk (for testing) and put this shell script in /usr/bin/sgdisk:

#!/bin/sh
# $1: the sgdisk option passed by the caller; $2: the target device.
# Expose the device through a loop device with 512-byte logical sectors.
LOOP_DEV=$(losetup -P -b 512 -f --show "$2")
# Run the real sgdisk against the loop device, then detach it.
/root/sgdisk "$1" "$LOOP_DEV"
losetup -d "$LOOP_DEV"

and then create an instance from an image I do not currently have cached or instantiated, I get a perfect boot with no GPT errors with LVM on a 4096-byte sector disk. I have absolutely no clue how to go about patching this into Incus, though; unfortunately, as far as I can find, sgdisk has no option to forcibly override the sector size read from the disk for a single operation.
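If someone wants to experiment with the same workaround outside of a drop-in replacement, the wrapper could be written as a function that passes through any number of sgdisk options, always detaches the loop device, and preserves sgdisk's exit status. This is only a sketch: the function name `sgdisk_512` and the `REAL_SGDISK` variable are made up here, and the default path is just where the binary was moved for testing above.

```shell
#!/bin/sh
# Assumed location of the real binary (it was moved to /root/sgdisk for testing above).
REAL_SGDISK="${REAL_SGDISK:-/root/sgdisk}"

# Usage: sgdisk_512 DEVICE [sgdisk options...]
# Runs the real sgdisk against a 512-byte-sector loop view of DEVICE.
sgdisk_512() {
    device=$1
    shift

    # Attach DEVICE as a loop device with 512-byte logical sectors;
    # --show prints the name of the loop device that was allocated.
    loop_dev=$(losetup -P -b 512 -f --show "$device") || return 1

    # Run sgdisk on the loop device and remember how it exited.
    status=0
    "$REAL_SGDISK" "$@" "$loop_dev" || status=$?

    # Detach the loop device whether or not sgdisk succeeded.
    losetup -d "$loop_dev"
    return "$status"
}
```

For example, `sgdisk_512 /dev/vg_name/some_volume -e` would run `sgdisk -e` on the loop device (the volume path is a placeholder). It needs root, like the original script.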
Ah, interesting, we should investigate whether we can just force sgdisk into 512-byte mode somehow.
Issue description
I can't seem to boot any VM when the NVMe drives use an LBA format (lbaf) of 4096 bytes and I'm using the lvmcluster driver.
I changed the lbaf of my drives to 512 bytes in order to test a similar issue that another user had on the forum (link), and suddenly VMs work.
The issue seems to be that QEMU may not know how to handle NVMe drives with a block size of 4096 bytes: when booting the VM, it is unable to properly handle the image and can't find the QEMU hard drive it needs in order to boot. I have tried with secure boot turned off as well; still no go.
The problem seems to affect only VMs, not containers. Containers do not have an issue with either block size.
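To tell which LBA format a namespace is currently using, the human-readable output of `nvme id-ns -H` can be filtered. The helper below is a sketch that assumes that output contains lines of the form `LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0x1 Better (in use)`; the function name is made up and the exact output layout may differ between nvme-cli versions.

```shell
#!/bin/sh
# Read `nvme id-ns -H <namespace>` output on stdin and print the data size
# (in bytes) of the LBA format that is marked "(in use)".
# Assumes the "Data Size: N bytes ... (in use)" line layout described above.
lbaf_in_use_size() {
    sed -n 's/.*Data Size: *\([0-9][0-9]*\) *bytes.*(in use).*/\1/p'
}
```

On a host this would be run as `nvme id-ns -H /dev/nvme0n1 | lbaf_in_use_size` (the device path is a placeholder); a result of 4096 corresponds to the failing setup described here.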
Steps to reproduce
vgcreate vg_name /dev/nvmeXnX --shared
I have tried looking at the qemu.log from the Incus UI, but it is empty, as are the other logs under /var
System OS : Debian 12.8
Kernel : 6.1.27 amd64
Incus version: 6.0.2 LTS
Information to attach
(I launched with --debug; that output and the incus info output are below)
DEBUG
Incus info output