Description: Flexera One - IT Asset Management - NAM - Service Disruption
Timeframe: July 29, 2025, 9:01 PM PDT to July 30, 2025, 12:48 AM PDT
Incident Summary
On Tuesday, July 29, 2025, at 9:01 PM PDT, we experienced a service disruption impacting access to Flexera One – IT Asset Management (ITAM) in the North America Production environment. The disruption occurred during an upgrade to version 2025R1.1, which required an extended maintenance window. During the incident, customers were unable to access the ITAM section of the platform, and all ITAM features were unavailable via the web UI.
The issue began when a long-running import task had to be manually terminated before the upgrade could proceed, which delayed the start of the upgrade. Additional deployment issues followed, possibly linked to that initial delay. When we attempted to bring services back online, the initial provisioning attempt failed for reasons that have not yet been identified. Ultimately, our teams terminated the affected fleet and redeployed the infrastructure, which restored service availability at 12:48 AM PDT on July 30, 2025.
Root Cause
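The service disruption was caused by a combination of factors:
· A long-running import task that had to be manually terminated before the upgrade to version 2025R1.1 could proceed, which delayed the maintenance work.
· Additional deployment issues that followed the delay and were possibly linked to it.
· A failed initial provisioning attempt while services were being brought back online; its cause has not yet been identified.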
Remediation Actions
· Manually terminated the long-running import task to allow the upgrade to proceed.
· Initiated a manual refresh of instances to restore partial service availability.
· Terminated the affected fleet and redeployed the infrastructure to restore full service availability.
· Verified that all web servers were back online and passing health checks before concluding the incident.
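The final verification step above can be pictured as a simple gate: the incident is only closed once every web server reports healthy. The following is a minimal sketch of that idea, not the tooling actually used during the incident; the function name `all_healthy`, the host names, and the `check` callable are hypothetical placeholders.

```python
from typing import Callable, Iterable, List, Tuple


def all_healthy(
    servers: Iterable[str],
    check: Callable[[str], bool],
) -> Tuple[bool, List[str]]:
    """Run a health check against every server.

    `check` is a hypothetical callable that returns True when the given
    server passes its health check. Returns (ok, failing), where `ok` is
    True only if no server failed and `failing` lists the servers that did.
    """
    failing = [server for server in servers if not check(server)]
    return (not failing, failing)


if __name__ == "__main__":
    # Hypothetical fleet state, for illustration only.
    status = {"web-1": True, "web-2": True, "web-3": False}
    ok, failing = all_healthy(status, lambda name: status[name])
    if ok:
        print("All web servers healthy; incident can be closed.")
    else:
        print("Still failing health checks:", ", ".join(failing))
```

Returning the list of failing servers, rather than a bare boolean, keeps the gate useful while recovery is still in progress.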
Future Preventative Measures
· Root Cause Investigation – Conduct a deeper analysis of the provisioning failure to identify and address the underlying cause.
· Deployment Resilience – Enhance deployment processes to handle provisioning failures gracefully without requiring a full redeployment. A process for this already exists, but it encountered an issue during this incident.
· Improved Maintenance Window Planning – Incorporate contingency buffers into maintenance schedules to account for unexpected delays without impacting service availability. We will also consider scheduling longer windows for batch tasks.
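The maintenance window planning measure above can be illustrated with a small calculation: pad the expected upgrade duration with a contingency buffer, and check in advance whether a batch task is expected to finish before the window opens. This is only a sketch under assumed values; the function names, the 50% default buffer, and the example times are hypothetical and do not reflect our actual scheduling policy.

```python
from datetime import datetime, timedelta
from typing import Tuple


def plan_window(
    start: datetime,
    expected: timedelta,
    buffer_ratio: float = 0.5,  # assumed default, for illustration only
) -> Tuple[datetime, datetime]:
    """Return (start, end) of a window that includes a contingency buffer.

    The buffer is a fraction of the expected duration, so a 2-hour upgrade
    with buffer_ratio=0.5 is scheduled as a 3-hour window.
    """
    if buffer_ratio < 0:
        raise ValueError("buffer_ratio must be non-negative")
    return start, start + expected + expected * buffer_ratio


def task_fits_before(
    task_start: datetime,
    task_estimate: timedelta,
    window_start: datetime,
) -> bool:
    """True if a batch task is expected to finish before the window opens."""
    return task_start + task_estimate <= window_start


if __name__ == "__main__":
    # Hypothetical schedule: a 2-hour upgrade starting at 9:00 PM.
    window_start = datetime(2025, 1, 1, 21, 0)
    start, end = plan_window(window_start, timedelta(hours=2))
    print("Window:", start, "to", end)

    # Hypothetical import started at 6:00 PM with a 4-hour estimate.
    fits = task_fits_before(
        datetime(2025, 1, 1, 18, 0), timedelta(hours=4), window_start
    )
    print("Import finishes before window:", fits)
```

A check like `task_fits_before` could be run before the window opens, so that a long-running task is rescheduled or the window is moved instead of the task being terminated manually.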