What happened?
On October 10, 2025, at approximately 4:14 PM Pacific, xMatters internal monitoring tools alerted Customer Support to service degradation in the Integration Platform in the North American region. The degradation was related to an issue that was already being monitored internally. While the issue was being investigated and mitigated, customers may have experienced intermittent request failures and occasional slowdowns.
Why did it happen?
The issue was caused by a security update that conflicted with the underlying runtime environment, specifically its memory management routine. Slow performance of the routine was frequently tripping health checks for a request processing service, which in turn caused automatic restarts of that service.
How did we respond?
As soon as the internal monitoring tools alerted Engineering to a potential issue, the team began performing manual rolling restarts and deployed more forgiving liveness checks to keep error rates from increasing and to minimize any potential impact to customers. They also increased resources for the impacted service to improve the responsiveness of the underlying routine, and deployed a configuration fix to the HTTP Client Cache to try to stabilize the system. While mitigating the potential impact, they increased monitoring levels and continued to investigate the root cause. On October 13, they deployed an update to the service and environment configuration that resolved the issue. They continued monitoring and confirmed that the system was stable and all services were operational.
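To illustrate why a more forgiving liveness check can reduce automatic restarts, here is a minimal sketch in Python. It is not the xMatters implementation: the function name, the latency samples, the timeout, and the failure thresholds are all hypothetical values chosen for illustration. It assumes a checker that restarts a service after a set number of consecutive slow probe responses.

```python
from typing import Iterable


def restarts_triggered(
    latencies_ms: Iterable[float],
    timeout_ms: float,
    failure_threshold: int,
) -> int:
    """Count the restarts a simple liveness checker would trigger.

    A probe fails when its latency exceeds timeout_ms. A restart is
    triggered after failure_threshold consecutive failed probes.
    """
    restarts = 0
    consecutive_failures = 0
    for latency in latencies_ms:
        if latency > timeout_ms:
            consecutive_failures += 1
        else:
            consecutive_failures = 0
        if consecutive_failures >= failure_threshold:
            restarts += 1
            consecutive_failures = 0  # the restarted instance starts clean
    return restarts


# Hypothetical probe latencies in milliseconds. The spikes stand in for
# pauses caused by a slow memory management routine.
samples = [40, 1200, 50, 1500, 1400, 45, 60, 1300, 55, 48]

# Strict check: a single slow probe restarts the service.
strict = restarts_triggered(samples, timeout_ms=1000, failure_threshold=1)

# Forgiving check: only three slow probes in a row restart the service.
forgiving = restarts_triggered(samples, timeout_ms=1000, failure_threshold=3)

print(f"strict: {strict} restarts, forgiving: {forgiving} restarts")
```

With these made-up samples, the strict check restarts the service on every latency spike, while the forgiving check tolerates short pauses. The trade-off is that a forgiving check takes longer to detect a service that is genuinely unresponsive.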
What are we doing to prevent it from happening again?
Engineering has updated the backend service and deployed additional system configuration updates and monitoring to improve the overall stability of the environment and prevent this issue from recurring.