Description: Flexera One - IT Visibility - NAM - Inventory Data Out of Date
Timeframe: July 22, 2025, 1:05 AM PDT to July 26, 2025, 5:27 PM PDT
Incident Summary
On Tuesday, July 22, 2025, at 1:09 AM PDT, an issue was detected that delayed the processing of inventory data for IT Visibility services in the North America region. Although the Flexera One platform and IT Visibility remained accessible, uploaded inventory data was not processed as expected, so the system did not reflect the latest inventory information and customers temporarily lost visibility into their most recent inventory data.
Although upload jobs appeared to complete successfully, they either timed out or were recorded as empty. As a result, the data shown in the user interface and reports was outdated, lagging by approximately 30 minutes to several hours depending on the time of access. Technopedia services may also have been affected during this disruption. No customer reports were received during the incident window.
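For illustration only, the sketch below shows the kind of post-upload validation that would have flagged these silently failing jobs: the result fields, field names, and thresholds are hypothetical assumptions, not Flexera APIs.

    # Hypothetical post-upload validation (illustrative names, not Flexera APIs):
    # the job's status flag alone proved misleading, so the output itself is checked.
    MAX_PROCESSING_SECONDS = 30 * 60  # flag jobs that ran longer than 30 minutes

    def validate_upload(result: dict) -> list[str]:
        """Return a list of problems for a job that reported success.

        result: dict with "status", "records_processed", and unix timestamps
        "started_at" / "finished_at" (all assumed fields for this sketch).
        """
        problems = []
        if result["status"] == "success" and result["records_processed"] == 0:
            problems.append("job reported success but produced zero records")
        elapsed = result["finished_at"] - result["started_at"]
        if elapsed > MAX_PROCESSING_SECONDS:
            problems.append(f"processing took {elapsed:.0f}s (possible timeout)")
        return problems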
In response, our teams executed a full data reload and restarted the inventory processing service to initiate recovery efforts. By July 22, 2025, at 12:36 PM PDT, the affected services were restored, and our teams began addressing the backlog that had accumulated during the outage. Our teams continued to monitor the backlog closely and officially declared the issue resolved after the backlog was fully processed on July 26, 2025, at 5:27 PM PDT.
Root Cause
Investigations by our technical teams revealed that the disruption was caused by automated patch updates from our cloud service provider. These updates unexpectedly impacted the core components responsible for inventory data ingestion and other critical services that process uploaded inventory data into the platform.
The automated patches led to unexpected service behavior: the data ingestion services did not fail visibly, but downstream processing was silently interrupted.
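Failures of this shape are typically caught by monitoring data freshness rather than service status. Below is a minimal sketch of such a staleness check, assuming a hypothetical threshold and timestamp source rather than Flexera internals.

    from datetime import datetime, timedelta, timezone

    # Hypothetical staleness check: a service can report "up" while data
    # silently stops flowing, so the age of the newest processed record is
    # what gets alerted on. The 30-minute threshold is an assumption.
    STALENESS_THRESHOLD = timedelta(minutes=30)

    def check_freshness(last_processed_at: datetime) -> None:
        age = datetime.now(timezone.utc) - last_processed_at
        if age > STALENESS_THRESHOLD:
            # The pipeline is up but data is not moving: page the on-call team.
            raise RuntimeError(f"inventory data is stale by {age}")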
Remediation Actions
· System Recovery Initiated: Our teams performed a full data reload and restarted the inventory processing service to begin recovery efforts.
· Core Services Restored: The key services responsible for handling inventory data were successfully brought back online, allowing delayed data to resume processing.
· Data Accuracy Ensured: To ensure no loss of critical information, our teams initiated a replay of agent data, covering the duration of the disruption.
· Progressive Backlog Reduction: Over the following days, our teams closely monitored processing progress and reduced the backlog from roughly two days to under ten hours (a sketch of this drain-rate tracking follows this list).
· Full Restoration Achieved: The backlog was fully processed on July 26, 2025, at 5:27 PM PDT, and normal operations resumed. The incident was formally closed on July 27, 2025, at 12:59 PM PDT, after confirming all systems were stable and up to date.
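As referenced in the backlog-reduction item above, the following is a minimal sketch of how drain progress can be tracked during recovery; the queue-depth figures and sampling approach are illustrative assumptions, not recorded incident data.

    # Hypothetical backlog-drain tracker: given periodic queue-depth samples,
    # estimate the drain rate and the time remaining until the backlog clears.
    def estimate_clear_eta(samples: list[tuple[float, int]]) -> float | None:
        """samples: (unix_timestamp, pending_jobs) pairs, oldest first.
        Returns estimated seconds until empty, or None if not draining."""
        (t0, q0), (t1, q1) = samples[0], samples[-1]
        drain_rate = (q0 - q1) / (t1 - t0)  # jobs cleared per second
        if drain_rate <= 0:
            return None  # backlog flat or growing: escalate
        return q1 / drain_rate

    # Example: 200k pending jobs fall to 150k over six hours, giving an
    # estimated ~18 hours until the backlog is clear at the observed rate.
    eta_seconds = estimate_clear_eta([(0.0, 200_000), (6 * 3600.0, 150_000)])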
Future Preventative Measures
· Enhanced Monitoring: Deploy additional monitoring metrics and alerts specifically for normalization steps.
· Automated Health Checks: Introduce periodic end-to-end health checks of the service to proactively detect processing stalls (see the probe sketch after this list).
· Resilience Improvements: Explore service-level redundancy and retry mechanisms so that ingestion and other critical services can recover autonomously from transient patch-induced disruptions (see the retry sketch after this list).
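A minimal sketch of the end-to-end health check referenced above: a synthetic payload is pushed through the pipeline and the probe alerts if it does not emerge within a deadline. The entry and exit hooks are hypothetical stand-ins, not Flexera APIs.

    import time
    import uuid

    # Hypothetical end-to-end probe: submit a small synthetic inventory payload
    # and verify that it appears in processed output within a deadline. The
    # submit/lookup callables stand in for the real pipeline entry and exit.
    PROBE_DEADLINE_S = 15 * 60   # treat anything slower as a processing stall
    POLL_INTERVAL_S = 30

    def run_probe(submit_synthetic_upload, processed_record_exists) -> bool:
        marker = str(uuid.uuid4())        # unique tracer for this probe run
        submit_synthetic_upload(marker)   # enters through the same path as agents
        deadline = time.monotonic() + PROBE_DEADLINE_S
        while time.monotonic() < deadline:
            if processed_record_exists(marker):
                return True               # data is flowing end to end
            time.sleep(POLL_INTERVAL_S)
        return False                      # stall: alert even if services look "up"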
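And a minimal sketch of the retry mechanism referenced above, assuming exponential backoff with jitter as the recovery strategy; the error type and limits are illustrative.

    import random
    import time

    # Hypothetical retry wrapper: transient, patch-induced failures are retried
    # with exponential backoff and jitter instead of dropping the work item.
    def call_with_backoff(operation, max_attempts: int = 5):
        for attempt in range(1, max_attempts + 1):
            try:
                return operation()
            except ConnectionError:
                if attempt == max_attempts:
                    raise                 # surface real outages loudly
                delay = min(60, 2 ** attempt) + random.uniform(0, 1)
                time.sleep(delay)         # back off before retrying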