Flexera One - IT Asset Management - EU - Service Disruption

Incident Report for Flexera System Status Dashboard

Postmortem

Description: Flexera One – IT Asset Management – EU – Service Disruption

Timeframe: July 23, 2025, 1:07 PM to 1:38 PM PDT

Incident Summary

On July 23, 2025, at 1:07 PM PDT, Flexera One IT Asset Management in the Europe region experienced a service disruption. During this period, customers may have been unable to log in or access key workflows within IT Asset Management.

The disruption began shortly after a release window. Initial access was confirmed as available after the release, but the underlying service components then encountered readiness issues, and both servers were marked unavailable. Automated recovery mechanisms provisioned replacement servers, restoring access by 1:38 PM PDT.

Comprehensive checks confirmed full functionality following restoration, and no direct customer reports were received during the incident.

Root Cause

Primary Root Cause:

The disruption was triggered by a readiness failure in the service components responsible for handling customer access. The affected components did not transition to a fully operational state, which caused temporary unavailability until recovery mechanisms provisioned new healthy servers.

Contributing Factors:

• Timing of Health Readiness: The service health checks may not have allowed sufficient time for new servers to become fully operational after the release window.
• Startup Process Completion: A multi-step initialization process may not have fully completed, preventing servers from passing readiness checks.
• Simultaneous Failures: Both servers failed around the same time, amplifying the impact and leaving no fallback available until recovery mechanisms were triggered.
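
The interaction between these factors can be illustrated with a minimal sketch. It is not Flexera's health-check implementation, which this report does not describe. The server names, startup times, and grace periods below are hypothetical values chosen only to show how a grace period shorter than the startup time can mark every server unavailable at once.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    startup_seconds: int  # hypothetical time the multi-step initialization needs


def is_marked_unavailable(server: Server, grace_period_seconds: int) -> bool:
    """A server is marked unavailable if it is still initializing
    when the readiness grace period ends."""
    return server.startup_seconds > grace_period_seconds


def available_servers(servers: List[Server], grace_period_seconds: int) -> List[str]:
    """Return the names of servers that pass readiness within the grace period."""
    return [
        s.name for s in servers
        if not is_marked_unavailable(s, grace_period_seconds)
    ]


# Assumed values: both servers need slightly longer than the short grace period.
servers = [Server("server-a", 150), Server("server-b", 160)]

with_short_grace = available_servers(servers, grace_period_seconds=120)
with_long_grace = available_servers(servers, grace_period_seconds=300)

print(with_short_grace)  # no server passes, so there is no fallback
print(with_long_grace)   # both servers pass
```

In this sketch, both servers start after the same release and need a similar amount of time, so a grace period that is too short fails them together rather than one at a time.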

Remediation Actions

  1. Automated Recovery: Replacement servers were automatically provisioned, restoring service availability.
  2. Post-Recovery Validation: Technical teams conducted thorough checks to ensure IT Asset Management functionality was fully restored.
  3. Readiness Timing Adjustments: As an immediate measure, the readiness grace period is being increased to reduce the risk of servers being marked unavailable prematurely.

Future Preventative Measures

Following this incident, a detailed root cause analysis and internal retrospective were completed to identify areas for long-term improvement. The following workstreams were initiated under a platform reliability initiative aimed at strengthening performance and minimizing the risk of recurrence. While the underlying cause of the service disruption remains under investigation, the actions already implemented and those underway are expected to significantly improve the platform’s ability to detect, respond to, and recover from similar events in the future.

  1. Readiness Window Adjustments: Increasing the grace period for readiness checks to ensure new servers have sufficient time to fully initialize.
  2. Enhanced Startup Visibility: Work is underway to introduce step-by-step tracking of the initialization process, allowing teams to better identify and address bottlenecks or delays.
  3. System Readiness Optimization: As part of ongoing reliability initiatives, options are being explored to use automation tooling for better visibility into and control of the startup process, reducing the likelihood of similar readiness issues.
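
As an illustration of what step-by-step startup tracking could look like, the following is a minimal sketch under stated assumptions. The function `run_startup` and the step names are hypothetical and are not taken from the Flexera platform. The idea is to record the duration and outcome of each initialization step so that a stalled or failed step can be identified directly.

```python
import time
from typing import Callable, List, Tuple

StepRecord = Tuple[str, float, bool]  # (step name, seconds taken, succeeded)


def run_startup(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepRecord]:
    """Run initialization steps in order, recording each step's duration
    and outcome. Stops at the first failing step."""
    records: List[StepRecord] = []
    for name, step in steps:
        started = time.monotonic()
        try:
            step()
            ok = True
        except Exception:
            ok = False
        records.append((name, time.monotonic() - started, ok))
        if not ok:
            break  # later steps depend on earlier ones, so do not continue
    return records


# Hypothetical steps, for illustration only.
def load_config() -> None:
    pass


def connect_database() -> None:
    raise RuntimeError("simulated failure")


def warm_cache() -> None:
    pass


records = run_startup([
    ("load_config", load_config),
    ("connect_database", connect_database),
    ("warm_cache", warm_cache),
])

for name, seconds, ok in records:
    print(f"{name}: {'ok' if ok else 'FAILED'} in {seconds:.3f}s")
```

With a record like this, a server that never becomes ready shows which step it stopped at, rather than only failing the overall readiness check.
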
Posted Aug 01, 2025 - 16:08 PDT

Resolved

Incident Description: Our teams identified an issue that caused a disruption impacting IT Asset Management in the EU region. As a result, IT Asset Management may have been inaccessible for the duration of the incident.

Priority: P1

Impact Start: July 23, 2025, 1:07 PM PDT
Impact End: July 23, 2025, 1:38 PM PDT
Impact Duration: 31 minutes

Restoration Activity: The disruption was caused by a failure in the underlying service supporting IT Asset Management in the EU region. This resulted in temporary inaccessibility of the application for affected users. Recovery mechanisms were triggered, and normal access was restored shortly after. This incident has been resolved.

We will be conducting a full retrospective and sharing a post-mortem report in the coming days.
Posted Jul 23, 2025 - 15:54 PDT
This incident affected: Flexera One - IT Asset Management - Europe (IT Asset Management - EU Beacon Communication, IT Asset Management - EU Inventory Upload, IT Asset Management - EU Login Page, IT Asset Management - EU Batch Processing System, IT Asset Management - EU Business Reporting).