Flexera One - ITAM - EU - WebUI Access/Login Issues

Incident Report for Flexera System Status Dashboard

Postmortem

Description: Flexera One - ITAM - EU - WebUI Access/Login Issues

Timeframe: December 9, 2025, 12:57 AM PST to December 9, 2025, 1:08 AM PST

Incident Summary

On December 9, 2025, at approximately 12:57 AM PST, monitoring systems detected a disruption impacting access to the Web UI for IT Asset Management in the EU region. During the incident window, affected users encountered login error messages when attempting to access the service.

Initial triage confirmed that the issue was isolated to the login layer, while all downstream and backend services remained fully operational. Further investigation identified repeated failures of liveness and readiness probes associated with the authentication service. These probe failures caused the relevant service components to restart continuously, preventing successful login requests.

The disruption lasted approximately 11 minutes and auto-resolved once the restarts stabilized and the services returned to a healthy state. Although the incident self-recovered, our teams continued root cause analysis to understand the underlying load conditions that triggered the probe failures and to prevent recurrence.
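The restart loop described above can be illustrated with a minimal sketch. It assumes Kubernetes-style liveness semantics, where a container is restarted after a configured number of consecutive probe failures. The function name, the threshold of 3, and the probe history are hypothetical and are not taken from the incident data.

```python
from typing import Iterable


def count_restarts(probe_results: Iterable[bool], failure_threshold: int = 3) -> int:
    """Count restarts triggered by runs of consecutive failed liveness probes."""
    restarts = 0
    consecutive_failures = 0
    for healthy in probe_results:
        if healthy:
            # Any successful probe resets the failure streak.
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            restarts += 1
            # The restarted container begins with a clean failure count.
            consecutive_failures = 0
    return restarts


# Hypothetical probe history: sustained load makes six probes in a row fail.
history = [True, False, False, False, False, False, False, True, True]
print(count_restarts(history))  # 2
```

While load stays high, each restarted container fails its probes again, so restarts repeat until the load subsides.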

Root Cause

Initial diagnostics indicated that service components supporting authentication repeatedly failed liveness and readiness health checks due to sustained high CPU utilization and load. This resulted in continuous pod restarts, preventing successful login processing during the incident window.

Contributing Factors:

High Load Bursts

  • A recurring high-load pattern was observed coinciding with alert timestamps.
  • Load was likely driven by dependent services making frequent calls.

Insufficient Resource Buffer

  • CPU utilization exceeded safe operating thresholds.
  • Service components continued to crash until load subsided and services stabilized.
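As a rough illustration of the contributing factors above, the sketch below flags metric samples where CPU usage exceeds a chosen fraction of the configured limit, so the flagged timestamps can be compared against alert timestamps. The sample values, the 2-core limit, and the 80% threshold are hypothetical. The report does not state the actual thresholds or measurements.

```python
from typing import List, Tuple


def samples_over_threshold(
    samples: List[Tuple[str, float]],
    cpu_limit_cores: float,
    threshold: float = 0.8,
) -> List[str]:
    """Return timestamps whose CPU usage exceeds threshold * cpu_limit_cores."""
    return [ts for ts, used in samples if used / cpu_limit_cores > threshold]


# Hypothetical samples as (timestamp, CPU cores used); not incident data.
samples = [("00:55", 0.9), ("00:57", 1.9), ("01:02", 1.8), ("01:08", 1.0)]
print(samples_over_threshold(samples, cpu_limit_cores=2.0))  # ['00:57', '01:02']
```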

Remediation Actions

  • Automatic Recovery - The platform’s self-healing mechanisms resolved the issue without the need for manual intervention.
  • Service Stabilization - Affected service components stabilized once system load subsided, and restart activity ceased.
  • Authentication Health Restored - Authentication services returned to a healthy and stable operating state.
  • Full Service Availability Confirmed - Login functionality was fully restored and verified by 1:08 AM.

Future Preventative Measures

  • Proactive Load & Failure Analysis - Conduct detailed analysis of service logs and metrics to identify recurring load patterns and conditions that lead to health-check probe failures, enabling earlier detection of risk scenarios.
  • Health Check Optimization - Review and tune liveness and readiness probe configurations, including timeout and threshold values, to ensure services can tolerate short-lived load spikes without unnecessary restarts.
  • Capacity & Resource Buffer Enhancement - Increase CPU and memory allocations for authentication-related services to provide sufficient headroom during sudden traffic surges and prevent resource exhaustion.
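To make the health check tuning concrete, the sketch below estimates how long a container can keep failing liveness probes before it is restarted, and checks whether a load spike of a given length would be tolerated. It assumes Kubernetes-style probe settings (a probe period and a consecutive-failure threshold) and approximates the tolerance as period multiplied by threshold. All values are hypothetical and are not the service's actual configuration.

```python
def seconds_until_restart(period_seconds: int, failure_threshold: int) -> int:
    """Rough estimate of how long probes may fail before a restart is triggered."""
    return period_seconds * failure_threshold


def tolerates_spike(spike_seconds: int, period_seconds: int, failure_threshold: int) -> bool:
    """True if a load spike is shorter than the estimated restart tolerance."""
    return spike_seconds < seconds_until_restart(period_seconds, failure_threshold)


# Hypothetical settings, for illustration only.
current = {"period_seconds": 10, "failure_threshold": 3}  # about 30 s tolerance
tuned = {"period_seconds": 10, "failure_threshold": 6}    # about 60 s tolerance
spike = 45  # seconds of sustained high load

print(tolerates_spike(spike, **current))  # False
print(tolerates_spike(spike, **tuned))    # True
```

A higher threshold tolerates longer spikes, but it also delays the restart of a container that is genuinely hung, so any tuning is a trade-off between the two.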

Posted Jan 20, 2026 - 05:12 PST

Resolved

Incident Description: Earlier today, our teams detected an issue impacting access to the Web UI for IT Asset Management in the EU region. Affected users may have encountered login error messages when attempting to access the service.

Priority: P2

Restoration Activity: The impact lasted approximately 11 minutes before the issue auto-resolved. Our technical teams continue to investigate the underlying cause. Initial findings indicate that all backend services remained fully operational, and the disruption was isolated to the login servers.
Posted Dec 09, 2025 - 01:43 PST
This incident affected: Flexera One - IT Asset Management - Europe (IT Asset Management - EU Login Page).