Issue Discovered - Service disruption in European Region - Multiple Services

Incident Report for xMatters

Postmortem

What happened?

On May 13, 2025, at approximately 10:45 AM Pacific, xMatters internal monitoring tools identified an issue in which customers in the EU region were experiencing intermittent web UI and API timeouts.

Why did it happen?

The issue occurred because a backend queueing service experienced network timeouts during an unpredictable, rapid increase in usage. The resulting rise in resource consumption caused service timeouts and restarts, along with higher latency that further delayed responses to backend requests.

How did we respond?

xMatters internal monitoring tools alerted the xMatters Incident Response Team to the issue, and the team launched the internal SEV-1 process. Because the issue was detected early, Engineering teams were able to scale up the queueing services and prevent further service degradation and availability issues. The network timeouts were resolved once resources had been scaled up to accommodate the increase in usage.

What are we doing to prevent it from happening again?

The Engineering teams have adjusted resources to better accommodate sudden usage increases and to keep them from affecting backend services. This improved resource allocation and adaptability should prevent similar issues from occurring in the future.

Posted May 27, 2025 - 13:50 PDT

Resolved

The issue has been addressed, and all services have been restored. Thank you for your patience while we resolved this matter.

Posted May 13, 2025 - 13:46 PDT

Monitoring

The xMatters Incident Response team has deployed a fix for the issue. We are currently monitoring the situation to ensure the implementation is stable and that all services are restored.

Posted May 13, 2025 - 11:03 PDT

Update

We are continuing to work on a fix for this issue.

Posted May 13, 2025 - 10:54 PDT

Identified

The xMatters Incident Response team has identified the source of the issue and is working on a fix. We will provide an update once a solution has been found and implemented.

Posted May 13, 2025 - 10:53 PDT

Investigating

xMatters monitoring tools have identified a potential issue with xMatters On-Demand for some clients located in the Europe region. We are currently investigating the issue and will provide updates as information becomes available.

Please see incident details for specific services impacted.

If you are also experiencing issues, or if you're not sure whether this issue impacts your service, please contact xMatters Client Assistance at https://support.xmatters.com/hc/en-us/requests/new - our support agents are waiting to help.

Posted May 13, 2025 - 10:45 PDT

This incident affected: Europe, Middle East, and Africa (Web Interface, Email Notifications, SMS Notifications, Voice Notifications, Conferencing, Integration Platform, API, Mobile App).