Two regressions surfaced after the 1.2.2 release merge to main, both
in workflows that auto-trigger on push to main:
* Deploy to GitHub Pages — build failed with `pagefind: not found`
(exit 127) after Next.js prerendered all 241 routes. pagefind was
not declared in web/package.json; the local build only worked
because the project root had its own package.json with pagefind
as a devDep (the one we just gitignored in the previous commit).
Add `pagefind: ^1.5.2` to web/package.json devDependencies and
regenerate web/package-lock.json so `npm ci` in CI puts the
binary at web/node_modules/.bin/pagefind.
* Build ProxMenux Monitor AppImage — failed at the first step with
`mkdir: cannot create directory '/var/cache/proxmenux-build':
Permission denied`. The cache path was hardcoded to /var/cache/,
which is writable when the script runs as root (the .50 host
manual build) but not as the unprivileged GitHub Actions runner.
Switch to `${XDG_CACHE_HOME:-$HOME/.cache}/proxmenux-build/` —
works identically in both environments.
Verified locally: `cd web && npm ci && npm run build` produces 2804
files in out/, 231 pages indexed by pagefind, root redirect intact.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Promote the v1.2.1.x beta cycle to stable: version markers bumped
from 1.2.1.4-beta to 1.2.2 across version.txt, AppImage/package.json,
flask_server.py (3 places) and the four UI labels in login,
proxmox-dashboard, storage-overview and release-notes-modal.
Replace AppImage/ProxMenux-1.2.1.4-beta.AppImage with
ProxMenux-1.2.2.AppImage and regenerate the .sha256 sidecar
(097e2344675d4b21f1dd18c531c956c299a6507fbc3d0c9695418063581ba2b0).
The new binary is verified on all 4 lab hosts (.50 / .55 / .89 /
1.10) — same sha, all services active, runtime version markers
report 1.2.2.
CHANGELOG["1.2.2"] in release-notes-modal.tsx consolidates every beta
in the 1.2.1.x line (12 added / 13 changed / 18 fixed), and
CURRENT_VERSION_FEATURES is rewritten with the four stable highlights:
Health Monitor Thresholds, granular dismiss control (per-event
duration + Active Suppressions panel), Apprise notification channel
parity, and LXC update detection.
Pedro Rico, 19/05: after reinstalling the Monitor from GitHub a real
SSH/web login failure went unnotified. Root cause was the auth_fail
cooldown surviving across the service restart — install_proxmenux_beta
extracts the new AppImage but leaves the notification_last_sent SQLite
table intact (desirable: we don't want to lose legitimate cooldowns
on every update). On startup `_load_cooldowns_from_db()` then loaded
the stale auth_fail row from the previous run into the in-memory
cache, and `_passes_cooldown` blocked the new event.
This extends the existing reset-on-start mechanism (already in place
for update_summary, proxmenux_update, post_install_update, …) to also
clear auth_fail rows. A security-relevant event shouldn't be silenced
because the same source IP happened to fail to log in yesterday.
- Rename `_UPDATE_EVENT_TYPES_RESET_ON_START` → `_EVENT_TYPES_RESET_ON_START`
(the list no longer covers only update-status reports).
- Rename `_reset_update_cooldowns_on_start()` → `_reset_cooldowns_on_start()`
for the same reason.
- Add `'auth_fail'` to the curated list.
High-frequency sources (log_critical_*, disk SMART errors, …) are
deliberately NOT on this list — they keep their 24h cooldown across
restarts to prevent inbox floods if the user toggles the service.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the user reinstalls or restarts the Monitor (deploy of a new
beta AppImage), they expect to see a fresh "what's available now"
summary in Telegram/Gotify/etc. instead of silence — even if the
24h anti-spam cooldown for `update_summary` etc. hasn't expired yet.
Without this, the operator had to wait up to 24h after every
deploy before the next `update_summary`, `proxmenux_update`,
`post_install_update`, `pve_update`, `update_available`,
`nvidia_driver_update_available` or `secure_gateway_update_available`
notification fired. The 24h cooldown is the right default for steady
state (don't pester the user every poll cycle with the same "177
packages pending" reminder), but a service restart is an explicit
signal that the user wants a fresh status report.
- New _UPDATE_EVENT_TYPES_RESET_ON_START tuple lists the event types
to clear (everything in the "*_update*" + "update_*" family).
- New _reset_update_cooldowns_on_start() runs at start() right after
the running flag flips, before watchers/dispatcher come up.
- Patterns match both fingerprint shapes:
"<host>:<entity>:<event_type>:" trailing-colon form
"<host>:<entity>:<event_type>" no-suffix form (managed installs)
- In-memory `_cooldowns` cache is also pruned so the live dispatcher
picks up the reset immediately, without waiting for the next
`_load_cooldowns_from_db()` cycle.
Non-update cooldowns (auth_fail, log_critical_*, disk errors, …) are
preserved so a restart doesn't unleash a backlog of stale alerts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_dispatch_to_channels` does NOT receive the NotificationEvent object —
only the rendered primitives (title, body, severity, event_type, …).
The Quiet Hours + Daily Digest merge introduced two references to
`event.severity` / `event` inside this function, which raised
`NameError: name 'event' is not defined` for every event passing
through dispatch.
The dispatch loop swallows the exception with a broad `except`, so the
visible symptom was "the Test button works but no real event ever
arrives" — both for community beta users (multiple reports on
Telegram, 9-18 May) and verified live on a test host (id 905 in
notification_history confirms the pipeline post-fix).
- _dispatch_to_channels: read `severity` / `event_type` directly
instead of `event.severity` / `event.event_type`.
- _should_buffer_for_digest: take (ch_name, severity, event_type)
primitives instead of a NotificationEvent.
- _buffer_digest_event: same — take (ch_name, event_type,
event_group, severity, title, body).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>