Description: Flexera One - Technology Intelligence Platform/IT Visibility - EU Region – Service Disruption
Timeframe: February 6, 2025, 4:15 AM PST – February 6, 2025, 6:15 AM PST
Incident Summary
On February 6, 2025, at 4:15 AM PST, we experienced a service disruption affecting Flexera One IT Visibility Dashboards, Reports, and Export APIs (including Technology Intelligence) in the EU region. While the platform remained accessible, customers may have encountered difficulties accessing dashboards, generating reports, using API services, and using Power BI out-of-the-box (OOTB) reports.
The issue was traced to a failure to connect to an external service provider’s infrastructure, which caused critical services to fail during initialization. As new nodes were deployed, they could not establish a connection to the provider’s services because of an issue on the provider’s end. As a result, the affected services could not come online, and several key services were intermittently unavailable.
Upon detecting the issue at 4:32 AM PST, our technical team responded by rotating the affected nodes and manually intervening to restore the impacted services. These actions restored service intermittently at 4:56 AM PST. At 5:23 AM PST, the service provider confirmed they had identified the issue on their end, related to difficulties in managing their infrastructure.
By 6:00 AM PST, our technical team had taken direct action to alter the service configuration for IT Visibility, scaling out a number of services to prevent further requests to the external service handling proxy and to optimize resource usage. At 6:15 AM PST, core services, including critical components, were manually scaled in to restore full functionality, which ended the customer-impacting portion of the incident.
Shortly after, the service provider confirmed full resolution: error rates returned to normal after the provider deployed a configuration change to address the connectivity issue.
Root Cause
Primary Root Cause:
External Service Provider Outage: The issue arose due to a problem at the service provider's end, which prevented successful connections to their infrastructure. This disruption impacted the initialization of critical services required for normal platform operation.
Contributing Factors:
Remediation Actions
Future Preventative Measures