Flexera One – SaaS Manager – NA – Managed SaaS Applications Not Loading
Incident Report for Flexera System Status Dashboard
Postmortem

Description: Flexera One – SaaS Manager – NA – Managed Applications Not Loading

Timeframe: November 4th, 2:38 AM PDT to November 4th, 9:54 AM PDT

Incident Summary:

On Friday, November 4th, at 2:38 AM PDT, we received reports of performance degradation in the SaaS Manager application in the NA region. Customers could open the Managed SaaS Applications menu, but any application launched from the menu remained stuck on a loading screen and never returned results.

On investigation, technical staff observed resource contention. To alleviate it, one of the web app servers was rebooted at 4:37 AM PDT. The reboot provided some relief, but some services remained in an unhealthy state and continued to cause performance issues in the application.

Continuing the investigation, technical staff found that one of the worker nodes was in an unhealthy state. Any request routed to a pod hosted on this node failed. At 9:54 AM PDT, the impacted node was drained, and its pods were moved to healthy nodes. Health checks confirmed that the impacted services were restored and functional.

Staff continued to monitor the services for the next few hours to ensure stability. After further health checks and monitoring, the incident was declared resolved.

Root Cause:

Investigation revealed that one of the worker nodes was in an unhealthy state. Any requests going into pods hosted on this node resulted in failure.
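
The root cause above can be illustrated with a minimal sketch. It assumes a Kubernetes-style cluster, which the report implies with "worker nodes" and "pods" but does not state, and it works on plain dictionaries shaped like Node and Pod objects rather than calling any cluster API. The node names, pod names, and the helper functions is_ready and pods_on_unhealthy_nodes are hypothetical and are not taken from the incident.

```python
def is_ready(node):
    """Return True only if the node reports a Ready condition with status "True"."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False  # no Ready condition reported: treat the node as unhealthy


def pods_on_unhealthy_nodes(nodes, pods):
    """Map each unhealthy node name to the names of the pods scheduled on it."""
    unhealthy = {n["metadata"]["name"] for n in nodes if not is_ready(n)}
    affected = {name: [] for name in sorted(unhealthy)}
    for pod in pods:
        node_name = pod.get("spec", {}).get("nodeName")
        if node_name in affected:
            affected[node_name].append(pod["metadata"]["name"])
    return affected


# Hypothetical sample data (assumption): two worker nodes, one of them not Ready.
nodes = [
    {"metadata": {"name": "worker-1"},
     "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "worker-2"},
     "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
]
pods = [
    {"metadata": {"name": "web-a"}, "spec": {"nodeName": "worker-1"}},
    {"metadata": {"name": "web-b"}, "spec": {"nodeName": "worker-2"}},
]

print(pods_on_unhealthy_nodes(nodes, pods))
```

In this sketch, every pod listed under an unhealthy node is a candidate for the failures described above, and the node itself is the one that would be drained.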

Corrective Action:

  1. To remediate the issue, the impacted node was drained, and pods were moved to healthy nodes
  2. As a long-term solution, technical staff has resized the nodes to larger instances
  3. We have also applied CPU and memory limits to the pods so that no pod can starve others of CPU
  4. We now have more nodes running in the environment due to autoscaling
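
The pod limits mentioned in the corrective actions can be sketched as a simple audit. This is an illustration only: it assumes Kubernetes-style container specs with resources.requests and resources.limits, and the container names, CPU values, and the helpers parse_cpu and audit_cpu are hypothetical. The actual limits applied in the environment are not given in this report.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes-style CPU quantity ("500m" or "2") to cores."""
    quantity = str(quantity)
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores to cores
    return float(quantity)


def audit_cpu(containers):
    """Return (names of containers without a CPU limit, total CPU requested in cores)."""
    missing = []
    total_requested = 0.0
    for c in containers:
        resources = c.get("resources", {})
        if "cpu" not in resources.get("limits", {}):
            missing.append(c["name"])
        total_requested += parse_cpu(resources.get("requests", {}).get("cpu", "0"))
    return missing, total_requested


# Hypothetical container specs (assumption): only the first one has a CPU limit.
containers = [
    {"name": "web",
     "resources": {"requests": {"cpu": "250m"}, "limits": {"cpu": "500m"}}},
    {"name": "worker",
     "resources": {"requests": {"cpu": "500m"}}},
    {"name": "sidecar"},
]

missing, total = audit_cpu(containers)
print("containers without a CPU limit:", missing)
print("total CPU requested (cores):", total)
```

A container without a CPU limit can consume whatever CPU is free on its node, which is the starvation pattern the limits are meant to prevent.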
Posted Dec 22, 2022 - 09:42 PST

Resolved
This incident has been resolved.
Posted Nov 04, 2022 - 11:32 PDT
Monitoring
Technical staff found that one of the nodes was in an unhealthy state. Any requests going into pods hosted on this node were resulting in a failure. To remediate the issue, the impacted node was drained, and pods were moved to working nodes. Health checks have been successful, and pages are loading without any errors or slow response times. We will continue to monitor for the next hour.
Posted Nov 04, 2022 - 10:24 PDT
Update
We are continuing to investigate the issue.
Posted Nov 04, 2022 - 08:49 PDT
Update
The investigation is still in progress. Technical teams are still observing intermittent performance issues. We have engaged additional SMEs to assist with the investigation.
Posted Nov 04, 2022 - 06:44 PDT
Update
Technical staff rebooted one of the App Servers, and Managed applications are accessible now. We are monitoring further.
Posted Nov 04, 2022 - 05:27 PDT
Investigating
Incident Description: We are currently experiencing issues with the SaaS Manager application in the NA region. Customers can access the Managed SaaS Applications menu, but when they attempt to launch any application from the menu, it stays stuck on loading and does not load the results.

Priority: 2

Restoration Activity: Technical teams have been engaged and are investigating.
Posted Nov 04, 2022 - 04:19 PDT
This incident affected: Flexera One - IT Asset Management - North America (IT Asset Management - US SaaS Manager).