mirror of
https://github.com/MacRimi/ProxMenux.git
synced 2026-06-03 13:54:41 +00:00
health_persistence: stop leaking obs counts across NVMe device renames
`get_disks_observation_counts` maps each serial's count to that serial's "most recent" device_name (so renames like ata8 -> sdh keep the badge attached). When several physical disks have passed through the same kernel name across reboots — common with NVMe, the kernel probes in a different order depending on which slots are populated — disk_registry keeps a row per (device_name, serial) seen and the "most recent" device_name for a serial can now be in use by an entirely different disk. Concrete case from the wild: serial 211716800490 was nvme0n1 during the previous boot and earned a real I/O observation. After removing four of five NVMes, the surviving disk (serial 243332800236) booted into nvme0n1. The badge layer mirrored 211716800490's count onto nvme0n1 — which is now a different physical disk — and showed "1 obs." on the wrong drive, while the modal (which scopes by the current (device_name, serial) registry row) found nothing and rendered an empty history. Only mirror a serial's count onto its device_name when that device_name is currently owned by the same serial, determined from the freshest disk_registry row. The serial-keyed entry stays unconditional so observations remain reachable when the disk is re-plugged under another device name. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
@@ -2592,12 +2592,34 @@ class HealthPersistence:
|
||||
if serial not in serial_to_device:
|
||||
serial_to_device[serial] = device_name
|
||||
|
||||
# Resolve which serial currently OWNS each device_name. The
|
||||
# kernel reuses NVMe / SD device names across reboots
|
||||
# (e.g. the disk that was nvme4n1 with 5 NVMes plugged in
|
||||
# comes back as nvme0n1 once 4 are removed), and
|
||||
# disk_registry keeps a row for every (device_name, serial)
|
||||
# combination it has ever seen. Without this check we would
|
||||
# mirror an observation's count onto its serial's
|
||||
# "most-recent" device_name even when that name is now in
|
||||
# use by a DIFFERENT serial — surfacing a "1 obs." badge on
|
||||
# a disk that has no observations of its own and a clean
|
||||
# modal, since the modal correctly scopes by current
|
||||
# (device_name, serial) pair.
|
||||
cursor.execute('''
|
||||
SELECT device_name, serial FROM disk_registry
|
||||
WHERE device_name IS NOT NULL AND device_name != ''
|
||||
ORDER BY last_seen DESC
|
||||
''')
|
||||
current_owner = {}
|
||||
for device_name, dev_serial in cursor.fetchall():
|
||||
if device_name not in current_owner:
|
||||
current_owner[device_name] = dev_serial
|
||||
|
||||
# Build result
|
||||
result = {}
|
||||
for serial, cnt in serial_counts.items():
|
||||
result[f'serial:{serial}'] = cnt
|
||||
device_name = serial_to_device.get(serial)
|
||||
if device_name:
|
||||
if device_name and current_owner.get(device_name) == serial:
|
||||
result[device_name] = max(result.get(device_name, 0), cnt)
|
||||
|
||||
# For disks WITHOUT serial: group by device_name
|
||||
|
||||
Reference in New Issue
Block a user