On August 5th, 2025, at approximately 7:53 PM Pacific, the xMatters internal monitoring tools alerted the xMatters Support team to a notification delivery issue in the North America region. Some xMatters customers may have experienced delayed notifications or seen failed alerts in the system.
The issue occurred because a network problem disrupted communication between parts of the queuing system. As a result, some components became temporarily out of sync, leading to timeouts, internal connectivity failures, and a small number of messages not being processed during the disruption. As the system recovered, performance remained partially degraded until normal operation was restored.
As soon as the xMatters monitoring tools reported the issue, the xMatters Support team initiated the internal Major Incident Management process and engaged the Engineering and incident response teams. The teams quickly mitigated the issue and restored performance by redirecting traffic from the message queuing system in the affected region (us-central) to the message queuing system in a nearby region (us-east). After the issue was resolved, the teams directed traffic back to the restored region.
The Engineering team has determined that the best way to prevent this issue from recurring is to replace the current message queuing system. Work on the replacement system is well underway, and the teams will retire the current system as soon as the replacement is complete.