Flexera One - IT Asset Management - EU - UI Was Inaccessible

Incident Report for Flexera System Status Dashboard

Postmortem

Description: Flexera One - IT Asset Management - EU - UI Was Inaccessible

Timeframe: September 25th, 4:38 AM to September 25th, 5:36 AM PDT

Incident Summary

On September 25th, at 4:38 AM PDT, we experienced a service disruption that may have prevented customers from accessing the IT Asset Management UI. The incident affected customers in the EU region; customers in other regions were unaffected.

At 5:14 AM PDT, our technical team traced the issue to an error in the deployment process: an unintentional configuration change had affected a specific subset of servers within our environment.

To restore service, new servers were introduced at 5:36 AM PDT. Our technical team subsequently implemented system enhancements to improve the reliability and accuracy of server configurations, reducing the risk of similar incidents in the future.

Following this change, our technical team conducted additional health checks and continuous monitoring, confirming the successful resolution of the incident.

Root Cause

The root cause of this incident was an unintentional configuration change during the deployment process. This configuration error impacted a specific subset of servers within our environment, resulting in service disruption.

Remediation Actions

  1. Issue Identification: Promptly identified the issue at 5:14 AM PDT as an error during the deployment process.
  2. Server Introduction: Introduced new servers at 5:36 AM PDT to restore service and end the immediate disruption.
  3. Health Checks and Continuous Monitoring: Conducted additional health checks and continuous monitoring to confirm the successful resolution of the incident and maintain ongoing system stability.

Future Preventative Measures

  1. System Enhancements: Implemented system enhancements to ensure the reliability and accuracy of server configurations to mitigate future risks.
  2. Enhanced Change Management Procedures: Implement more robust change management procedures with an emphasis on validation steps to prevent inadvertent configuration errors during deployments.
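
As an illustration of the kind of validation step described above, the following is a minimal sketch of a pre-rollout configuration check. It is not Flexera's actual tooling: the function names (`config_drift`, `validate_rollout`), the server names, and the configuration keys are all hypothetical, and the sketch assumes each server's effective configuration can be read as a flat set of key/value pairs.

```python
from __future__ import annotations

from typing import Mapping, Optional

# Hypothetical type: maps a config key to its (expected, actual) values.
Drift = dict[str, tuple[Optional[str], Optional[str]]]


def config_drift(expected: Mapping[str, str], actual: Mapping[str, str]) -> Drift:
    """Return every key whose value differs between expected and actual."""
    keys = expected.keys() | actual.keys()
    return {
        key: (expected.get(key), actual.get(key))
        for key in sorted(keys)
        if expected.get(key) != actual.get(key)
    }


def validate_rollout(
    expected: Mapping[str, str],
    servers: Mapping[str, Mapping[str, str]],
) -> dict[str, Drift]:
    """Return the servers whose configuration drifts from the expected one.

    An empty result means no drift was detected and the rollout may proceed.
    """
    failures: dict[str, Drift] = {}
    for name, actual in servers.items():
        drift = config_drift(expected, actual)
        if drift:
            failures[name] = drift
    return failures


if __name__ == "__main__":
    # Illustrative values only; not taken from the incident.
    expected = {"ui_port": "443", "session_store": "redis"}
    servers = {
        "ui-eu-1": {"ui_port": "443", "session_store": "redis"},
        "ui-eu-2": {"ui_port": "8443", "session_store": "redis"},
    }
    for server, drift in validate_rollout(expected, servers).items():
        print(f"{server}: configuration drift {drift}")
```

A check of this shape could run as a gate before new servers receive traffic, so that a server with an unintended configuration change is flagged instead of being placed into service.
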
Posted Oct 27, 2023 - 11:55 PDT

Resolved

This incident has been resolved.
Posted Sep 25, 2023 - 06:56 PDT

Update

The UI servers had undergone scaling, but they were not in a healthy state. We pinpointed the issue, applied the required fixes, and introduced new servers to address the situation.
Posted Sep 25, 2023 - 06:55 PDT

Investigating

Incident Description: We encountered a service disruption earlier today that may have left customers unable to access the IT Asset Management UI. The incident affected customers in the EU region only; customers in other regions were unaffected.

Priority: P1

Impact Start: September 25th, 2023, 4:44 AM PDT
Impact End: September 25th, 2023, 5:36 AM PDT

Restoration Activity: Our technical team promptly identified and resolved the issue, restoring the service to its normal state. Comprehensive health checks have since verified the system's stability.
Posted Sep 25, 2023 - 06:39 PDT
This incident affected: Flexera One - IT Asset Management - Europe (IT Asset Management - EU Beacon Communication, IT Asset Management - EU Inventory Upload, IT Asset Management - EU Login Page, IT Asset Management - EU Batch Processing System, IT Asset Management - EU Business Reporting).