Cloud Management Platform (CMP) - NA - Self-Service - Shard 4 - Potential Errors Launching New Cloud Apps with Plugins
Incident Report for Flexera System Status Dashboard
Postmortem

Description: Cloud Management Platform (CMP) - NA - Self-Service - Shard 4 - Potential Errors Launching New Cloud Apps with Plugins

Timeframe:  9 September 2024, 10:30 AM PDT to 9 September 2024, 12:24 PM PDT

Incident Summary

On Monday, 9 September 2024, at 10:30 AM PDT, following a scheduled weekend maintenance release, our team identified an issue affecting customers using the Cloud Management Platform (CMP) on Self-Service Shard 4. Customers attempting to launch new Cloud Apps that used Plugins encountered errors. The issue was isolated to Shard 4 and was caused by a feature update deployed during the maintenance window.

To prevent further disruptions, an emergency rollback was initiated immediately after the issue was identified. During the rollback process, some customers experienced brief slowness and delays in launching Cloud Apps and performing other operations. The rollback was successfully completed by 12:24 PM PDT on 9 September 2024, restoring full functionality to Shard 4.

Root Cause

The issue was caused by an update applied during the scheduled maintenance window. This update introduced a feature that broke Plugin functionality when customers attempted to launch new Cloud Apps. The change has been rolled back, and the erroneous code has since been reviewed and fixed.

Remediation Actions 

  • Emergency Rollback: Upon identification of the issue, the team immediately implemented a rollback of the changes deployed during the scheduled maintenance.

  • Restoration of Services: After the rollback was completed, normal operations on Shard 4 were restored, and customers could launch Cloud Apps with Plugins without encountering errors.

Future Preventative Measures 

Pre-release Testing Improvements:

  • Enhance testing protocols for systems operating on legacy platforms, ensuring new features are tested for compatibility before deployment.
  • Expand regression testing to cover more scenarios specific to Shard 4.

Compatibility Checks:

  • Implement a compatibility validation process as part of the deployment process, so that any new feature or update is validated against all shards before release.
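
As an illustration only, a compatibility gate of this kind could be sketched as follows. This is a minimal sketch under stated assumptions, not a description of the actual deployment tooling: the shard names, the function names, and the stub check are all hypothetical, and a real check would exercise an actual Cloud App launch with a Plugin on each shard.

```python
from typing import Callable, Dict, List


def validate_release(shards: List[str], check: Callable[[str], bool]) -> Dict[str, bool]:
    """Run a compatibility check against every shard and record pass/fail."""
    results: Dict[str, bool] = {}
    for shard in shards:
        try:
            results[shard] = bool(check(shard))
        except Exception:
            # A check that crashes counts as a failure, not as a skipped shard.
            results[shard] = False
    return results


def failing_shards(results: Dict[str, bool]) -> List[str]:
    """Return the shards whose check did not pass, in sorted order."""
    return sorted(shard for shard, ok in results.items() if not ok)


def release_allowed(results: Dict[str, bool]) -> bool:
    """Allow the release only if at least one shard was checked and all passed."""
    return bool(results) and all(results.values())


def stub_plugin_launch_check(shard: str) -> bool:
    """Hypothetical stand-in for a smoke test that launches a Cloud App with a Plugin."""
    return shard != "shard-4"


if __name__ == "__main__":
    # Hypothetical shard list; only Shard 4 is named in this report.
    shards = ["shard-a", "shard-b", "shard-4"]
    results = validate_release(shards, stub_plugin_launch_check)
    if release_allowed(results):
        print("Release may proceed")
    else:
        print("Release blocked; failing shards:", failing_shards(results))
```

The design choice worth noting is that the gate fails closed: an empty result set or a crashing check blocks the release rather than letting it through.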

Monitoring Enhancements:

  • Increase monitoring capabilities, specifically focusing on plugin functionality and key operations such as Cloud App launches. This will enable faster detection of issues, allowing the team to respond before customers are significantly impacted.
Posted Sep 20, 2024 - 05:03 PDT

Resolved
This incident has been resolved.
Posted Sep 10, 2024 - 12:25 PDT
Update
The emergency rollback has been successfully completed, and normal operations have been fully restored. This incident is now resolved, and the platform is functioning as expected.
Posted Sep 10, 2024 - 12:25 PDT
Update
We are continuing to work on a fix for this issue.
Posted Sep 10, 2024 - 11:22 PDT
Identified
Incident Description: Following the scheduled release over the weekend, our team identified a potential issue that could impact customers using the Cloud Management Platform (CMP) on Self-Service Shard 4. As a result, customers may experience errors when launching new Cloud Apps with Plugins.

Priority: P2

Restoration Activity: In response, our team is taking immediate action by implementing an emergency rollback to prevent any further disruptions. While this change is underway, customers may notice brief periods of slowness and temporary delays in launching Cloud Apps and other operations. Our priority is to ensure a smooth experience and minimize any impact, and we are closely monitoring the situation to restore full service as quickly as possible.
Posted Sep 10, 2024 - 11:08 PDT
This incident affected: Legacy Cloud Management (Self-Service - Shard 4).