What happened?
On Thursday, February 22, 2018, at approximately 9:50 AM PST, the xMatters monitoring systems alerted the Client Assistance team to an issue with the xMatters On-Demand service for some clients located in North America. Some users may have experienced delays in notification delivery after injecting an event into xMatters via the Integration Builder.
Why did it happen?
This issue was caused by a previously unknown defect within an Integration Builder service that prevented the service from automatically reconnecting after a related component in the data center was restarted. Without the connection, the service failed to process new events and left them in a backlog.
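To illustrate the failure mode described above, the following is a minimal sketch of a consumer that re-establishes its connection after an upstream restart and keeps unprocessed events queued in the meantime. It is not the xMatters implementation: the `ReconnectingConsumer` class, the `connect` callable, and the retry parameters are all hypothetical names chosen for illustration, and the backoff values are arbitrary.

```python
import time
from collections import deque


class ReconnectingConsumer:
    """Hypothetical event consumer that reconnects when its upstream drops."""

    def __init__(self, connect, max_attempts=5, base_delay=0.01, sleep=time.sleep):
        # `connect` is an assumed callable that returns an object with a
        # send(event) method and raises ConnectionError when upstream is down.
        self._connect = connect
        self._max_attempts = max_attempts
        self._base_delay = base_delay
        self._sleep = sleep
        self._conn = None
        self.backlog = deque()  # events accepted but not yet delivered

    def _ensure_connected(self):
        """Open a connection if none is held, retrying with exponential backoff."""
        if self._conn is not None:
            return
        delay = self._base_delay
        for attempt in range(self._max_attempts):
            try:
                self._conn = self._connect()
                return
            except ConnectionError:
                if attempt == self._max_attempts - 1:
                    raise  # give up for now; the caller keeps the backlog
                self._sleep(delay)
                delay *= 2

    def submit(self, event):
        """Queue an event and try to deliver everything that is pending."""
        self.backlog.append(event)
        return self.drain()

    def drain(self):
        """Deliver queued events in order; return False if upstream is unreachable."""
        while self.backlog:
            event = self.backlog[0]
            # One try on the current connection, one more after reconnecting.
            for _ in range(2):
                try:
                    self._ensure_connected()
                except ConnectionError:
                    return False
                try:
                    self._conn.send(event)
                    break
                except ConnectionError:
                    self._conn = None  # stale connection: discard and reconnect
            else:
                return False
            self.backlog.popleft()  # remove only after a successful send
        return True
```

In this sketch an event leaves the backlog only after a successful send, so a restart of the upstream component delays delivery rather than losing events, and a later call to `drain()` clears anything still queued once the connection can be re-established.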
How did we respond?
After identifying an issue with Integration Builder event creation, the xMatters Client Assistance and Operations teams initiated the internal Severity-1 process. The incident response teams simultaneously investigated the underlying cause and worked to restore services for clients. They quickly identified a failed back-end service that was delaying notifications for some clients who injected events into xMatters via the Integration Builder, isolated the component responsible, and implemented a solution. Once the solution was in place, notification delivery returned to normal levels, and clients confirmed that all services had been restored.
What are we doing to prevent it from happening again?
The xMatters Engineering team has identified and isolated the defect within the Integration Builder and is currently developing and implementing a permanent solution. (BUG-11673 - In Progress)
Timeline
2018-02-22 09:52 - xMatters monitoring tools alert the Client Assistance team to an issue with On-Demand services in the North American region
2018-02-22 10:40 - Client reports an issue with injecting events via the Integration Builder
2018-02-22 11:05 - Internal Severity-1 process initiated
2018-02-22 11:08 - Bulletin posted to xMatters status page: http://status.xmatters.com/incidents/ytyrn1gk1pb3
2018-02-22 11:20 - Issue is isolated to a specific data center Integration Builder component
2018-02-22 11:33 - Solution is implemented
2018-02-22 11:35 - Services are restored
If you have any questions, please visit http://support.xmatters.com