On October 3, 2018, at approximately 3:05 AM PDT, the xMatters monitoring systems alerted Client Assistance to a potential issue with one of the data centers located in North America. No customers reported any issues, though some users may have experienced a very brief interruption when attempting to access the On-Demand web user interface. No alerts or events were lost during this incident, and all notifications were delivered promptly.
This issue was caused by a connectivity problem with the Internet service provider for one of our North American data centers. The connection issue occurred beyond the xMatters environments, outside our firewalls.
As soon as the monitoring tools raised the alert, Client Assistance began checking client environments for connection issues. The monitoring tools continued to show fluctuations in connectivity, although the first checks showed that the client environments reported as down were recovering within one minute. Client Assistance initiated the major incident management process and engaged the Operations and Engineering teams to assist in identifying any possible issues. The incident response teams isolated the fluctuations as occurring beyond the xMatters firewalls and identified the root cause as an issue with the Internet provider for the data center. Within minutes of the initial alarm, the Internet connection stabilized, and the teams confirmed that all services were operating normally.
Although the xMatters monitoring tools indicated intermittent connectivity between 3:04 and 3:11 AM, the Internet service provider could not confirm the issue and stated that it had not received any reports of maintenance or outages on its network at that time. While it is difficult, if not impossible, to predict connection issues with Internet service providers, we are taking steps to address these types of problems through the hosting service improvements described here: https://support.xmatters.com/hc/en-us/articles/115005269506-Improving-our-hosting-services
The robustness of this new infrastructure should help avoid similar issues by reducing dependence on any individual service provider. In the short term, we will continue to work with our existing carrier to identify ways to prevent customer impact should a similar issue occur in the future.
October 3, 2018 (all times PDT)
3:05 AM: xMatters monitoring tools alert Client Assistance to a potential issue with client environments being down
3:14 AM: Major incident management process initiated; incident response teams begin investigation
3:16 AM: Root cause identified as connectivity fluctuations that have since ceased; all customer environments reported up
3:22 AM: All services confirmed restored
If you have any questions, please visit http://support.xmatters.com