At 2:56pm (PST) on Tuesday, January 8, 2018, some clients in the Asia-Pacific region reported to xMatters Client Assistance that some of their users could not access the xMatters mobile app on either the iOS or Android platform.
Why did it happen?
This issue occurred when the mobile API services were unable to communicate with certain xMatters components because re-used network addresses had not been configured appropriately. As a result, firewalls rejected traffic from the mobile API services.
How did we respond?
As soon as the xMatters Client Assistance team was alerted to a potential issue with the mobile app, they began to investigate the root cause. Once the team replicated the issue, they immediately initiated the internal Severity-1 process and engaged the incident response teams to begin simultaneously investigating the underlying cause and working to restore services for clients. The Client Assistance team posted a bulletin to the Support site informing clients about the issue and updated it throughout the incident. During the investigation, the Operations team identified the underlying issue as a networking problem caused by the re-use of certain network addresses that were not cleaned up after previous software decommissioning, and immediately applied a fix to the configuration. The Client Assistance team then confirmed that all services had been restored and were functioning as expected.
What are we doing to prevent it from happening again?
To prevent this issue from happening again, xMatters will conduct a complete audit of the configuration rules to ensure that they are accurate and up to date. The audit will also be added to the software decommissioning process so that it is completed on a regular, ongoing basis. (Currently in progress; xMatters internal reference: COREL-5566 - Deprecated NAT Rules Still Exist)
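As an illustration only, an audit of this kind can be sketched as a comparison between the configured rules and the current address assignments. The sketch below is not the xMatters tooling: the `NatRule` type, the `find_stale_rules` function, the service names, and the addresses are all hypothetical, and it assumes that each rule records the service it was created for.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class NatRule:
    """Hypothetical representation of one configured NAT rule."""
    name: str
    target: str  # internal address the rule forwards traffic to
    owner: str   # service the rule was originally created for


def find_stale_rules(rules, active_assignments):
    """Return rules whose target address is unassigned or has been
    re-used by a service other than the one the rule was created for."""
    stale = []
    for rule in rules:
        current_owner = active_assignments.get(ip_address(rule.target))
        if current_owner != rule.owner:
            stale.append(rule)
    return stale


# Hypothetical data: 10.0.0.15 once belonged to a decommissioned service
# and has since been re-used, but the old rule was never cleaned up.
rules = [
    NatRule("legacy-gateway", "10.0.0.15", "legacy-gateway"),
    NatRule("mobile-api", "10.0.0.20", "mobile-api"),
]
active_assignments = {
    ip_address("10.0.0.15"): "mobile-api-2",
    ip_address("10.0.0.20"): "mobile-api",
}

stale_rules = find_stale_rules(rules, active_assignments)
for rule in stale_rules:
    print(f"Stale rule: {rule.name} -> {rule.target}")
```

Running a check like this as a step in the decommissioning process would flag a rule as soon as its address is released or reassigned, rather than when traffic is later rejected.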
Timeline
2018-01-08 02:56 - xMatters alerted that some clients cannot access the mobile app on iOS and Android
2018-01-08 02:58 - Client Assistance begins troubleshooting the issue
2018-01-08 03:10 - Severity-1 process initiated
2018-01-08 03:12 - Support Bulletin posted: http://status.xmatters.com/incidents/d1hgv64fqbkw
2018-01-08 03:13 - Incident response teams formed; investigation continues with additional resources
2018-01-08 03:38 - Operations team identifies the issue and begins working on a solution
2018-01-08 03:58 - Operations team applies the fix
2018-01-08 04:02 - Services confirmed as restored

If you have any questions, please visit http://support.xmatters.com