Flexera One - IT Visibility - NAM - Error in loading PowerBI reports
Incident Report for Flexera System Status Dashboard
Postmortem

Description: Flexera One - IT Visibility - NAM - Error in loading PowerBI reports

Timeframe: October 29, 2024, at 12:30 PM PDT to October 30, 2024, at 12:32 AM PDT

Incident Summary

On Tuesday, October 29, 2024, at 12:30 PM PDT, we detected a service interruption that may have prevented PowerBI reports from loading in the Flexera One UI for all ITV customers on the TI Platform in the NAM region. Affected users received an error stating that compute capacity limits had been exceeded, which prevented them from accessing essential reports and real-time insights. Our technical team promptly began investigating and confirmed that only ITV customers in the NAM region were affected. The investigation found that PowerBI capacity utilization had consistently reached or exceeded the maximum limit during peak usage times, which blocked report access. To address the overload, the team upgraded the PowerBI instance size in the NAM region, increasing capacity to support higher demand. After the scale-up, testing confirmed that customers could load PowerBI reports again, and the issue was fully resolved by 12:32 AM PDT on October 30.

Root Cause

Upon investigating the PowerBI metrics for the NAM region, we identified a significant spike in capacity utilization, with occasional surges exceeding normal limits. Further analysis revealed that recent growth in platform usage had added new datasets and increased the number of users accessing reports. This growth placed additional demand on the existing capacity, ultimately overloading it and causing the outage.

Remediation Actions

· Scaled Up Instance Size: Increased the PowerBI instance size in the NAM region to accommodate higher demand, reducing the likelihood of hitting capacity limits.

· Continuous Monitoring: After the scale-up, our technical team actively monitored the PowerBI metrics so that any further disruption would be detected promptly.

· Thorough Testing: Conducted comprehensive testing to confirm that the resolution was successful and stable and did not introduce new issues.

Future Preventative Measures

· Remove Unused Datasets: A ticket has been created to remove unused datasets from PowerBI, helping to free up capacity.

· Auto-Scale Capacity: Discussions are scheduled to explore implementing an auto-scaling feature for PowerBI capacity so that it adjusts dynamically to load.

· Incremental Dataset Refresh: Long-term improvements include evaluating the Incremental Dataset Refresh feature, which could replace the current full dataset refresh and help optimize capacity usage.

· Enhanced Monitoring Process: A new, more rigorous process has been established to monitor PowerBI metrics closely and address potential capacity issues preemptively.
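
As an illustration only, the kind of check an enhanced monitoring process might run can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of the process Flexera established: the function name, the 80% threshold, the three-sample window, and the sample values are all hypothetical, and the input is assumed to be a series of capacity utilization percentages collected at regular intervals.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class CapacityAlert:
    start_index: int  # index of the first sample in the sustained run
    length: int       # number of consecutive samples at or above the threshold


def find_sustained_overload(
    utilization_pct: Sequence[float],
    threshold_pct: float = 80.0,  # hypothetical warning threshold
    min_consecutive: int = 3,     # hypothetical number of samples that counts as sustained
) -> List[CapacityAlert]:
    """Return one alert per run of consecutive samples at or above the threshold."""
    alerts: List[CapacityAlert] = []
    run_start: Optional[int] = None
    for i, value in enumerate(utilization_pct):
        if value >= threshold_pct:
            if run_start is None:
                run_start = i  # a new run begins here
        else:
            # The run (if any) ended at the previous sample.
            if run_start is not None and i - run_start >= min_consecutive:
                alerts.append(CapacityAlert(run_start, i - run_start))
            run_start = None
    # Close a run that is still open at the end of the series.
    if run_start is not None and len(utilization_pct) - run_start >= min_consecutive:
        alerts.append(CapacityAlert(run_start, len(utilization_pct) - run_start))
    return alerts


# Made-up utilization samples, in percent of capacity.
samples = [55, 82, 85, 91, 60, 95, 70, 88, 90, 97, 101]
alerts = find_sustained_overload(samples)
for alert in alerts:
    print(f"Sustained high utilization: {alert.length} samples starting at index {alert.start_index}")
```

Alerting on sustained runs rather than single samples is one way to warn before utilization reaches the limit while ignoring brief spikes; the threshold and window would need tuning against real capacity metrics.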

Posted Nov 14, 2024 - 02:25 PST

Resolved
Our teams have increased the capacity, which has resolved the issue. The service is now fully restored.
Posted Oct 30, 2024 - 00:43 PDT
Update
We are actively investigating this issue. In the meantime, our team is working to increase capacity to help mitigate the impact on our customers.
Posted Oct 29, 2024 - 22:51 PDT
Investigating
Incident Description: Our teams detected a service disruption impacting the IT Visibility Platform, which might be affecting some customers in the NAM region. As a result, the impacted customers may encounter an error message when attempting to load PowerBI reports.

Priority: P2

Restoration activity:
Our technical teams are actively investigating this issue. They are working to restore the services and identify the underlying root cause.
Posted Oct 29, 2024 - 21:45 PDT
This incident affected: Flexera One - IT Visibility - North America (IT Visibility US).