Proxmox VE: How to Replace a Failed Disk in a ZFS Boot Pool (System Disk)
I previously wrote a post on replacing a failed disk in a PVE ZFS pool, "Proxmox VE的ZFS如何更换故障盘" (How to replace a failed disk in Proxmox VE's ZFS), and a reader left this comment:
"Replacing a disk this way actually has some problems, because you are replacing an old disk's partition with a whole new disk. As you keep doing such replacements, it will eventually affect booting. For now it looks fine."
I took this seriously and looked into it. People on Reddit and other forums have indeed reported the same experience the commenter described:
If, as in my earlier article, you simply add the whole new disk as the replacement device while the pool was originally built with partitions as members (e.g. p3), the partition layout and GUIDs/partition table can end up inconsistent. Accumulated over several replacements, this can break booting, especially when the pool contains the system disk (boot pool / rpool). Others also point out that unless you explicitly copy the partition layout to the new disk, recreate the boot partitions (EFI/BIOS), and install the boot loader (via grub or proxmox-boot-tool), the system may not be able to boot from the new disk.
As you can see, these issues only apply when ZFS is used as the system/boot pool. If your ZFS pool is a pure data pool, the procedure in my original article is fine. But if ZFS hosts the system disk, I recommend the procedure below for replacing a failed disk, to avoid these risks. Thanks again to the commenter for pointing this out!
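As a quick sanity check before doing anything (a minimal example; adjust the pool name to yours), you can print the full device paths of the pool members. Paths ending in a partition suffix such as -part3 or p3 mean the pool is built on partitions and the procedure below applies:
zpool status -P rpool    # -P prints full device paths; members ending in p3/-part3 are partitions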
This article uses /dev/nvme2n1 as the failed disk. Strictly speaking it has not failed: this machine cannot reliably drive three PCIe 4.0 U.2 drives, so one of them tends to drop out. Although lsscsi still lists the disk, it is effectively unusable, and zpool status shows one device in the REMOVED state.
root@usa-zfs-amd-23:~# lsscsi
[16:0:0:0] cd/dvd AMI Virtual CDROM0 1.00 /dev/sr0
[N:0:0:1] disk INTEL SSDPF2KX038XZ__1 /dev/nvme0n1
[N:1:0:1] disk INTEL SSDPF2KX038XZ__1 /dev/nvme1n1
[N:2:0:1] disk INTEL SSDPF2KX038XZ__1 /dev/nvme2n1
root@usa-zfs-amd-23:~#
root@usa-zfs-amd-23:~# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Jun 12 11:05 pci-0000:04:00.3-usb-0:2.1:1.0-scsi-0:0:0:0 -> ../../sr0
lrwxrwxrwx 1 root root 13 Jun 12 11:05 pci-0000:41:00.0-nvme-1 -> ../../nvme2n1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:41:00.0-nvme-1-part1 -> ../../nvme2n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:41:00.0-nvme-1-part2 -> ../../nvme2n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:41:00.0-nvme-1-part3 -> ../../nvme2n1p3
lrwxrwxrwx 1 root root 13 Jun 12 11:05 pci-0000:82:00.0-nvme-1 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:82:00.0-nvme-1-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:82:00.0-nvme-1-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:82:00.0-nvme-1-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 13 Jun 12 11:05 pci-0000:83:00.0-nvme-1 -> ../../nvme1n1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:83:00.0-nvme-1-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:83:00.0-nvme-1-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 pci-0000:83:00.0-nvme-1-part3 -> ../../nvme1n1p3
root@usa-zfs-amd-23:~#
root@usa-zfs-amd-23:~#
root@usa-zfs-amd-23:~# zpool status rpool
pool: rpool
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: resilvered 495G in 00:29:49 with 0 errors on Thu Jun 12 11:05:33 2025
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
nvme-eui.01000000000000005cd2e462c4645651-part3 ONLINE 0 0 0
nvme-eui.01000000000000005cd2e4e4c3645651-part3 ONLINE 0 0 0
nvme-eui.01000000000000005cd2e47dd4695651-part3 REMOVED 0 0 0
errors: No known data errors
root@usa-zfs-amd-23:~#
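Before physically swapping the drive, it can help to note the model and serial numbers so you pull the right one. If the nvme-cli package is installed (an optional check, not part of the original session), for example:
nvme list    # lists NVMe devices with model and serial number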
After installing the new disk, check that it is recognized:
root@usa-zfs-amd-23:~# ls -la /dev/disk/by-id
total 0
drwxr-xr-x 2 root root 600 Dec 6 07:41 .
drwxr-xr-x 9 root root 180 Jun 12 10:41 ..
lrwxrwxrwx 1 root root 13 Dec 6 07:41 nvme-eui.000000000000000100a0750127b4e0b2 -> ../../nvme2n1
lrwxrwxrwx 1 root root 13 Jun 12 11:05 nvme-eui.01000000000000005cd2e462c4645651 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-eui.01000000000000005cd2e462c4645651-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-eui.01000000000000005cd2e462c4645651-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-eui.01000000000000005cd2e462c4645651-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 13 Jun 12 11:05 nvme-eui.01000000000000005cd2e4e4c3645651 -> ../../nvme1n1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-eui.01000000000000005cd2e4e4c3645651-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-eui.01000000000000005cd2e4e4c3645651-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-eui.01000000000000005cd2e4e4c3645651-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root 13 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN -> ../../nvme1n1
lrwxrwxrwx 1 root root 13 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN_1 -> ../../nvme1n1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN_1-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN_1-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN_1-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO3482049K3P8UGN-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root 13 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN -> ../../nvme0n1
lrwxrwxrwx 1 root root 13 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN_1 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN_1-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN_1-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN_1-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 15 Jun 12 11:05 nvme-INTEL_SSDPF2KX038XZ_PHAO348204D93P8UGN-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 13 Dec 6 07:41 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2 -> ../../nvme2n1
lrwxrwxrwx 1 root root 13 Dec 6 07:41 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2_1 -> ../../nvme2n1
lrwxrwxrwx 1 root root 9 Jun 12 11:05 usb-AMI_Virtual_CDROM0_AAAABBBBCCCC1-0:0 -> ../../sr0
root@usa-zfs-amd-23:~#
root@usa-zfs-amd-23:~# lsscsi
[16:0:0:0] cd/dvd AMI Virtual CDROM0 1.00 /dev/sr0
[N:0:0:1] disk INTEL SSDPF2KX038XZ__1 /dev/nvme0n1
[N:1:0:1] disk INTEL SSDPF2KX038XZ__1 /dev/nvme1n1
[N:2:1:1] disk Micron_9300_MTFDHAL3T8TDP__1 /dev/nvme2n1
The new disk is recognized. Next, look at the partitions: the new disk /dev/nvme2n1 has no partitions yet, while a healthy disk such as /dev/nvme1n1 does.
root@usa-zfs-amd-23:~# lsblk /dev/nvme2n1
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme2n1 259:0 0 3.5T 0 disk
root@usa-zfs-amd-23:~#
root@usa-zfs-amd-23:~# lsblk /dev/nvme1n1
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme1n1 259:5 0 3.5T 0 disk
├─nvme1n1p1 259:6 0 1007K 0 part
├─nvme1n1p2 259:7 0 1G 0 part
└─nvme1n1p3 259:8 0 3.5T 0 part
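If the replacement disk had been used before and still carried old partition tables or ZFS labels, it would be safer to clear them before copying the partition table (not needed here, since the disk shows up blank). For example:
wipefs -a /dev/nvme2n1    # wipes old filesystem/RAID/partition-table signatures; double-check the device name first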
Now let's replace the disk following the procedure recommended by Proxmox.
① Copy the partition table from an existing disk to the new disk
First, replicate the partition table from a healthy disk (e.g. /dev/nvme0n1) to the new disk. Note the sgdisk argument order, which is easy to get backwards: the target (new) disk goes in --replicate=<target> and the source disk is the positional argument.
root@usa-zfs-amd-23:~# sgdisk --replicate=/dev/nvme2n1 /dev/nvme0n1
The operation has completed successfully.
Then randomize the GUIDs so the new disk does not share identifiers with the source disk:
root@usa-zfs-amd-23:~# sgdisk --randomize-guids /dev/nvme2n1
The operation has completed successfully.
Check the result; the layout should now match /dev/nvme0n1:
root@usa-zfs-amd-23:~# lsblk /dev/nvme2n1
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme2n1 259:0 0 3.5T 0 disk
├─nvme2n1p1 259:1 0 1007K 0 part
├─nvme2n1p2 259:2 0 1G 0 part
└─nvme2n1p3 259:3 0 3.5T 0 part
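Optionally, you can also confirm which partition is the EFI System Partition by its partition type GUID (c12a7328-f81f-11d2-ba4b-00a0c93ec93b) instead of inferring it from the size, which matters in the next step:
lsblk -o NAME,SIZE,PARTTYPE /dev/nvme2n1    # the ESP carries the c12a7328-... partition type GUID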
② Initialize the boot partitions (p1/p2) so the new disk is bootable
Proxmox uses proxmox-boot-tool to manage the ESP.
Format the ESP (partition p2 is normally the EFI system partition; its 1G size also suggests this):
root@usa-zfs-amd-23:~# proxmox-boot-tool format /dev/nvme2n1p2
UUID="" SIZE="1073741824" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="nvme2n1" MOUNTPOINT=""
Formatting '/dev/nvme2n1p2' as vfat..
mkfs.fat 4.2 (2021-01-31)
Done.
Initialize the bootloader:
root@usa-zfs-amd-23:~# proxmox-boot-tool init /dev/nvme2n1p2
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="096E-9317" SIZE="1073741824" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="nvme2n1" MOUNTPOINT=""
Mounting '/dev/nvme2n1p2' on '/var/tmp/espmounts/096E-9317'.
Installing systemd-boot..
Created "/var/tmp/espmounts/096E-9317/EFI/systemd".
Created "/var/tmp/espmounts/096E-9317/EFI/BOOT".
Created "/var/tmp/espmounts/096E-9317/loader".
Created "/var/tmp/espmounts/096E-9317/loader/entries".
Created "/var/tmp/espmounts/096E-9317/EFI/Linux".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/096E-9317/EFI/systemd/systemd-bootx64.efi".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/096E-9317/EFI/BOOT/BOOTX64.EFI".
Random seed file /var/tmp/espmounts/096E-9317/loader/random-seed successfully written (32 bytes).
Created EFI boot entry "Linux Boot Manager".
Configuring systemd-boot..
Unmounting '/dev/nvme2n1p2'.
Adding '/dev/nvme2n1p2' to list of synced ESPs..
Refreshing kernels and initrds..
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Copying and configuring kernels on /dev/disk/by-uuid/096E-9317
Copying kernel and creating boot-entry for 6.8.12-4-pve
Copying kernel and creating boot-entry for 6.8.12-9-pve
Copying and configuring kernels on /dev/disk/by-uuid/59AD-8721
Copying kernel and creating boot-entry for 6.8.12-4-pve
Copying kernel and creating boot-entry for 6.8.12-9-pve
Copying and configuring kernels on /dev/disk/by-uuid/59F7-7114
Copying kernel and creating boot-entry for 6.8.12-4-pve
Copying kernel and creating boot-entry for 6.8.12-9-pve
WARN: /dev/disk/by-uuid/59F7-DB10 does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
Confirm:
root@usa-zfs-amd-23:~# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
096E-9317 is configured with: uefi (versions: 6.8.12-4-pve, 6.8.12-9-pve)
59AD-8721 is configured with: uefi (versions: 6.8.12-4-pve, 6.8.12-9-pve)
59F7-7114 is configured with: uefi (versions: 6.8.12-4-pve, 6.8.12-9-pve)
WARN: /dev/disk/by-uuid/59F7-DB10 does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
This is expected. The single WARN only means that an old UUID (59F7-DB10) is still listed in /etc/kernel/proxmox-boot-uuids but no longer has a matching device. Cleaning it up is optional but recommended:
root@usa-zfs-amd-23:~# cat /etc/kernel/proxmox-boot-uuids
096E-9317
59AD-8721
59F7-7114
59F7-DB10
Normally there is one UUID per disk, so there is one entry too many. Remove it and refresh the configuration:
root@usa-zfs-amd-23:~# grep -v '59F7-DB10' /etc/kernel/proxmox-boot-uuids > /tmp/uuids && mv /tmp/uuids /etc/kernel/proxmox-boot-uuids
root@usa-zfs-amd-23:~# proxmox-boot-tool refresh
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
Copying and configuring kernels on /dev/disk/by-uuid/096E-9317
Copying kernel and creating boot-entry for 6.8.12-4-pve
Copying kernel and creating boot-entry for 6.8.12-9-pve
Copying and configuring kernels on /dev/disk/by-uuid/59AD-8721
Copying kernel and creating boot-entry for 6.8.12-4-pve
Copying kernel and creating boot-entry for 6.8.12-9-pve
Copying and configuring kernels on /dev/disk/by-uuid/59F7-7114
Copying kernel and creating boot-entry for 6.8.12-4-pve
Copying kernel and creating boot-entry for 6.8.12-9-pve
Check the status again; the warning is gone:
root@usa-zfs-amd-23:~# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
096E-9317 is configured with: uefi (versions: 6.8.12-4-pve, 6.8.12-9-pve)
59AD-8721 is configured with: uefi (versions: 6.8.12-4-pve, 6.8.12-9-pve)
59F7-7114 is configured with: uefi (versions: 6.8.12-4-pve, 6.8.12-9-pve)
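As an alternative to editing /etc/kernel/proxmox-boot-uuids by hand as above, proxmox-boot-tool also provides a clean subcommand that removes UUIDs whose ESPs no longer exist (check that your version supports it before relying on it):
proxmox-boot-tool clean    # removes UUIDs of ESPs that no longer exist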
③ Have ZFS replace the old disk with the new one (using the partition, not the whole disk)
Find the p3 partition identifiers for the old and new disks, paying particular attention to the new disk's p3 entry:
root@usa-zfs-amd-23:~# ls -la /dev/disk/by-id | grep nvme2n1
lrwxrwxrwx 1 root root 13 Dec 6 13:34 nvme-eui.000000000000000100a0750127b4e0b2 -> ../../nvme2n1
lrwxrwxrwx 1 root root 15 Dec 6 13:34 nvme-eui.000000000000000100a0750127b4e0b2-part1 -> ../../nvme2n1p1
lrwxrwxrwx 1 root root 15 Dec 6 13:35 nvme-eui.000000000000000100a0750127b4e0b2-part2 -> ../../nvme2n1p2
lrwxrwxrwx 1 root root 15 Dec 6 13:34 nvme-eui.000000000000000100a0750127b4e0b2-part3 -> ../../nvme2n1p3
lrwxrwxrwx 1 root root 13 Dec 6 13:34 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2 -> ../../nvme2n1
lrwxrwxrwx 1 root root 13 Dec 6 13:34 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2_1 -> ../../nvme2n1
lrwxrwxrwx 1 root root 15 Dec 6 13:34 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2_1-part1 -> ../../nvme2n1p1
lrwxrwxrwx 1 root root 15 Dec 6 13:35 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2_1-part2 -> ../../nvme2n1p2
lrwxrwxrwx 1 root root 15 Dec 6 13:34 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2_1-part3 -> ../../nvme2n1p3
lrwxrwxrwx 1 root root 15 Dec 6 13:34 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2-part1 -> ../../nvme2n1p1
lrwxrwxrwx 1 root root 15 Dec 6 13:35 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2-part2 -> ../../nvme2n1p2
lrwxrwxrwx 1 root root 15 Dec 6 13:34 nvme-Micron_9300_MTFDHAL3T8TDP_201627B4E0B2-part3 -> ../../nvme2n1p3
root@usa-zfs-amd-23:~#
root@usa-zfs-amd-23:~# zpool status rpool
pool: rpool
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: resilvered 495G in 00:29:49 with 0 errors on Thu Jun 12 11:05:33 2025
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
nvme-eui.01000000000000005cd2e462c4645651-part3 ONLINE 0 0 0
nvme-eui.01000000000000005cd2e4e4c3645651-part3 ONLINE 0 0 0
nvme-eui.01000000000000005cd2e47dd4695651-part3 REMOVED 0 0 0
errors: No known data errors
Now replace the old disk's p3 partition with the new disk's p3 partition:
root@usa-zfs-amd-23:~# zpool replace rpool nvme-eui.01000000000000005cd2e47dd4695651-part3 /dev/disk/by-id/nvme-eui.000000000000000100a0750127b4e0b2-part3
Check the replacement status:
root@usa-zfs-amd-23:~# zpool status rpool
pool: rpool
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Sat Dec 6 13:46:10 2025
186G / 3.56T scanned at 5.18G/s, 0B / 3.55T issued
0B resilvered, 0.00% done, no estimated completion time
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
nvme-eui.01000000000000005cd2e462c4645651-part3 ONLINE 0 0 0
nvme-eui.01000000000000005cd2e4e4c3645651-part3 ONLINE 0 0 0
replacing-2 DEGRADED 0 0 0
nvme-eui.01000000000000005cd2e47dd4695651-part3 REMOVED 0 0 0
nvme-eui.000000000000000100a0750127b4e0b2-part3 ONLINE 0 0 0
errors: No known data errors
Now all that is left is to wait; once the resilver finishes, the device will show as ONLINE.
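To watch the progress and do a final check once the resilver completes (a minimal sketch; the interval and pool name are up to you):
watch -n 60 zpool status rpool    # re-check the pool every 60 seconds until the resilver finishes
zpool status rpool                # afterwards, raidz1-0 and all three -part3 members should be ONLINE
proxmox-boot-tool status          # and all three ESP UUIDs should still be reported as uefi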