Classify exit-1 errors & guard telemetry

Analyze logs for generic exit code 1 and export an ERROR_CATEGORY_OVERRIDE so telemetry receives a more accurate error category (apt, oom, network, storage, dependency). Preserve any existing TELEMETRY_TYPE when posting updates. Add defense-in-depth by disabling strict error traps before running grep/sed log analysis to avoid spurious error_handler invocations. Mark successful installs with INSTALL_COMPLETE and update the error handler to only report a successful "done" telemetry state when INSTALL_COMPLETE is explicitly set, preventing false-positive success reports from early zero-exit exits.
This commit is contained in:
CanbiZ (MickLesk)
2026-03-24 09:57:43 +01:00
parent 9aa0390553
commit 86c658909a
3 changed files with 104 additions and 54 deletions

View File

@@ -507,14 +507,23 @@ _stop_container_if_installing() {
on_exit() {
local exit_code=$?
# Report orphaned "installing" records to telemetry API
# Catches ALL exit paths: errors, signals, AND clean exits where
# post_to_api was called but post_update_to_api was never called
if [[ "${POST_TO_API_DONE:-}" == "true" && "${POST_UPDATE_DONE:-}" != "true" ]]; then
if [[ $exit_code -ne 0 ]]; then
_send_abort_telemetry "$exit_code"
elif declare -f post_update_to_api >/dev/null 2>&1; then
post_update_to_api "done" "0" 2>/dev/null || true
# Report orphaned telemetry records
# Two scenarios handled:
# 1. POST_TO_API_DONE=true but POST_UPDATE_DONE=false: Record was created but
# never got a final status update → send abort/done now.
# 2. POST_TO_API_DONE=false but DIAGNOSTICS=yes: Initial post failed (server
# unreachable/timeout), but the server has fallback create-on-update logic,
# so a status update can still create the record. Worth one last try.
if [[ "${POST_UPDATE_DONE:-}" != "true" ]]; then
if [[ "${POST_TO_API_DONE:-}" == "true" || "${DIAGNOSTICS:-no}" == "yes" ]]; then
if [[ $exit_code -ne 0 ]]; then
_send_abort_telemetry "$exit_code"
elif [[ "${INSTALL_COMPLETE:-}" == "true" ]] && declare -f post_update_to_api >/dev/null 2>&1; then
# Only report success if the install was explicitly marked complete.
# Without this guard, early bailouts (e.g. user cancelled) with exit 0
# would be falsely reported as successful installations.
post_update_to_api "done" "0" 2>/dev/null || true
fi
fi
fi