Flexera One - IT Asset Management - NA - Inventory Data Delayed

Incident Report for Flexera System Status Dashboard

Postmortem

Description: Flexera One - IT Asset Management - NA - Inventory Data Delayed

Timeframe: March 25, 2025, 12:00 PM PDT to March 28, 2025, 5:30 AM PDT

Incident Summary

On Tuesday, March 25, 2025, at 12:00 PM PDT, our teams identified an issue affecting inventory data uploads in the North America (NA) region. The disruption created a backlog in inventory processing for IT Asset Management (ITAM) services, which delayed customers' access to up-to-date inventory data on the platform. One customer formally reported outdated inventory data through a support case, which initiated an active investigation by our engineering and support teams.

While a similar situation was noted in the EU environment, the backlog levels there remained minimal and within acceptable operating thresholds. Continuous monitoring confirmed that EU operations were stable, with no reported impact on customers in that region.

The investigation indicated that five customers were primarily affected, with two of them significantly contributing to the backlog. The backlog arose from a surge in inventory data processing demand. The processing infrastructure in the NA region struggled to manage the increased load efficiently, resulting in delays in data updates within the platform. Our teams performed several infrastructure enhancements that increased processing throughput and reduced the backlog for all customers. The backlog was cleared for all customers except one. Service was declared restored on March 28, 2025, at 5:30 AM PDT, while our support teams engaged the remaining customer directly to continue the investigation. Once the system reached sufficient stability, we reverted the infrastructure changes, and the affected customers are no longer hosted on their individual servers.

Root Cause

The backlog arose from a surge in inventory data processing demand, primarily driven by several high-volume customers. The shared processing infrastructure in the NA region struggled to efficiently manage the increased load, resulting in delays in data updates within the platform. Processing queues became overwhelmed, and the current prioritization mechanisms proved inadequate to alleviate the congestion during peak load periods.
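The dynamic described above can be illustrated with a minimal sketch. It assumes a single shared queue with a fixed processing capacity per interval. The function name and all numbers are hypothetical and are not taken from the platform.

```python
def simulate_backlog(arrivals, capacity_per_interval, initial_backlog=0):
    """Return the backlog remaining at the end of each interval.

    arrivals: items arriving in each interval (hypothetical figures).
    capacity_per_interval: items the shared queue can process per interval.
    """
    backlog = initial_backlog
    history = []
    for arrived in arrivals:
        # The backlog grows whenever arrivals exceed capacity and
        # drains (never below zero) when capacity exceeds arrivals.
        backlog = max(0, backlog + arrived - capacity_per_interval)
        history.append(backlog)
    return history


# Hypothetical steady load: demand stays at or below capacity.
normal = simulate_backlog([80, 90, 100, 90], capacity_per_interval=100)

# Hypothetical surge: demand exceeds capacity for three intervals.
surge = simulate_backlog([80, 150, 160, 150, 90], capacity_per_interval=100)

print(normal)  # [0, 0, 0, 0]
print(surge)   # [0, 50, 110, 160, 150]
```

In this sketch the backlog keeps accumulating for as long as demand exceeds capacity, and it drains only slowly once demand drops back. That is consistent with the multi-day recovery described in this report.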

Remediation Actions

·        Infrastructure Segregation: The largest contributing customer was migrated to a dedicated infrastructure with additional servers configured for redundancy. This change resulted in approximately double the previous inventory processing throughput for that customer.

·        Expansion of Dedicated Resources: Dedicated infrastructure groups were provisioned for the remaining three high-volume customers to isolate their workloads from the shared environment.

·        Configuration Tuning: Processing prioritization rules were adjusted to focus resources on the top three customers with the most significant backlogs.

·        Capacity Scaling: An additional processing server was added to the largest contributor's dedicated infrastructure group after backlog growth persisted despite prior changes.

·        Product Enhancement Implementation: Our team implemented a product enhancement to reduce processing times and deployed it as a hotfix, which significantly improved processing performance.

·        Monitoring and Validation: Ongoing monitoring confirmed a steady decline in backlog volumes across all five major impacted customers. The backlog in NA was eventually cleared for all customers except one, whose processing continued to be monitored closely.

·        Customer Communication: Support engaged with the remaining affected customer directly and provided timely updates.
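The prioritization adjustment listed above (focusing resources on the customers with the largest backlogs) can be sketched as follows. This is an illustration only. The tenant names, backlog figures, and the `top_backlogged` helper are hypothetical and do not reflect the platform's actual configuration.

```python
import heapq


def top_backlogged(backlogs, n=3):
    """Return the n tenants with the largest backlogs, largest first.

    backlogs: mapping of tenant name to pending item count (hypothetical).
    """
    return heapq.nlargest(n, backlogs, key=backlogs.get)


# Hypothetical backlog snapshot for five tenants.
backlogs = {
    "tenant-a": 1200,
    "tenant-b": 300,
    "tenant-c": 4500,
    "tenant-d": 50,
    "tenant-e": 900,
}

prioritized = top_backlogged(backlogs, n=3)
print(prioritized)  # ['tenant-c', 'tenant-a', 'tenant-e']
```

A static top-N rule like this is simple to reason about during an incident, but it has to be re-evaluated as backlogs change.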

Future Preventative Measures

·        Infrastructure Scaling Strategy: Evaluate and implement a scalable infrastructure model that allows dynamic allocation of dedicated resources to high-volume tenants.

·        Enhanced Processing Prioritization: Develop a more adaptive prioritization framework that responds to real-time backlog growth patterns and tenant processing needs.

·        Capacity Planning Improvements: Introduce proactive capacity alerts and automatic workload redistribution to avoid overloading shared infrastructure.
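As a rough sketch of what the measures above could look like in practice, the example below splits a pool of workers across tenants in proportion to backlog size and flags backlog growth that exceeds a threshold. It assumes a fixed worker pool and periodic backlog samples. The function names, tenant names, and thresholds are hypothetical and do not describe a committed design.

```python
def allocate_workers(backlogs, total_workers):
    """Split workers across tenants in proportion to backlog size.

    Uses largest-remainder rounding so the shares always sum to
    total_workers. All inputs are hypothetical illustration values.
    """
    total = sum(backlogs.values())
    if total == 0:
        return {tenant: 0 for tenant in backlogs}
    exact = {t: total_workers * b / total for t, b in backlogs.items()}
    shares = {t: int(x) for t, x in exact.items()}
    leftover = total_workers - sum(shares.values())
    # Hand the remaining workers to the tenants with the largest
    # fractional remainders.
    by_remainder = sorted(backlogs, key=lambda t: exact[t] - shares[t], reverse=True)
    for tenant in by_remainder[:leftover]:
        shares[tenant] += 1
    return shares


def backlog_growing(samples, threshold):
    """Return True if the backlog grew by more than threshold
    between the first and last sample."""
    return len(samples) >= 2 and samples[-1] - samples[0] > threshold


shares = allocate_workers({"tenant-a": 600, "tenant-b": 300, "tenant-c": 100}, 8)
print(shares)                                          # {'tenant-a': 5, 'tenant-b': 2, 'tenant-c': 1}
print(backlog_growing([100, 140, 190], threshold=50))  # True
print(backlog_growing([100, 110, 120], threshold=50))  # False
```

Proportional allocation adapts as backlogs shift, unlike a fixed top-N rule, while the growth check is the kind of signal a proactive capacity alert could be built on.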

Posted Apr 30, 2025 - 01:08 PDT

Resolved

Inventory processing performance has been restored across the platform. Additional measures have been taken to ensure continued processing efficiency where needed.

This incident is now considered resolved, and we will continue to monitor the environment to ensure ongoing stability.
Posted Mar 28, 2025 - 09:36 PDT

Update

Inventory processing performance continues to trend in a positive direction following the changes implemented yesterday. We plan to reassess progress in the morning after the overnight imports have completed and will provide further updates as needed.
Posted Mar 27, 2025 - 21:02 PDT

Update

Inventory processing performance continues to improve, with overall progress trending in the right direction. Additional infrastructure adjustments were made earlier to support processing throughput. Monitoring remains in place as we continue working to ensure consistent performance across the platform.
Posted Mar 27, 2025 - 12:11 PDT

Monitoring

A significant portion of the inventory data backlog has now been cleared, and the platform is progressing steadily toward full resolution. Our teams continue to monitor performance and will provide further updates as we move closer to full recovery.
Posted Mar 26, 2025 - 20:24 PDT

Update

We have continued tuning infrastructure allocations to improve processing throughput for tenants with higher inventory volumes. Recent adjustments have contributed to a meaningful reduction in backlog across several environments. In some cases, additional capacity has been added to help stabilize processing rates and support further progress.

Our teams are closely monitoring platform performance and will continue to implement optimizations as needed to ensure timely inventory data updates.
Posted Mar 26, 2025 - 13:02 PDT

Update

We have made adjustments within the platform to optimize processing for tenants with higher inventory volumes. Early indicators suggest that these changes are beginning to reduce the backlog. Our teams are continuing to monitor performance closely and will evaluate further tuning as needed to ensure continued progress.
Posted Mar 25, 2025 - 22:40 PDT

Update

We are preparing additional enhancements to optimize how inventory processing is prioritized across the platform. These improvements are intended to dynamically allocate resources based on backlog size, helping to improve responsiveness for tenants with higher volumes. Implementation planning is underway, and we will continue to provide updates as progress continues.
Posted Mar 25, 2025 - 20:21 PDT

Update

We have implemented additional infrastructure changes to improve inventory processing throughput. These changes build on previous efforts and are designed to accelerate progress in reducing the existing backlog. Monitoring remains ongoing, and we continue to evaluate further adjustments as needed to ensure consistent and timely inventory data updates across the platform.
Posted Mar 25, 2025 - 15:34 PDT

Update

We have implemented infrastructure optimizations to improve inventory processing throughput. Early indicators show increased processing rates, and our teams are actively monitoring the impact of these changes. Additional measures are being evaluated to further reduce the backlog and ensure timely reflection of inventory data across the platform.
Posted Mar 25, 2025 - 14:30 PDT

Investigating

Incident Description: We are currently investigating an inventory processing backlog affecting IT Asset Management (ITAM) services in the North America region. As a result, customers may experience delays in the reflection of the most recent inventory data within the platform.

Priority: P2

Restoration Activity: Our teams are actively reviewing processing behavior and evaluating targeted remediation steps to reduce the backlog and ensure timely inventory updates. We are continuing to assess options to mitigate impact and will share further updates as progress is made.
Posted Mar 25, 2025 - 13:21 PDT
This incident affected: Flexera One - IT Asset Management - North America (IT Asset Management - US Inventory Upload).