Degraded Performance of Phrase Orchestrator (EU & US) Workflow Engine Component between October 10, 10:10 AM CEST and October 10, 01:00 PM CEST

Incident Report for Phrase

Postmortem

Root Cause Analysis

October 10, 2025

Introduction

We would like to share more details about the events that occurred with Phrase Orchestrator (EU & US) between 10:05 AM CEST and 01:00 PM CEST on October 10, 2025, which led to delayed executions of non-scheduled workflows, and about what Phrase engineers are doing to prevent these issues from recurring.

Timeline

10:05 AM CEST: Our logging system showed repeated errors after introducing a change to Orchestrator’s system dependencies.

11:54 AM CEST: Through our monitoring system, Orchestrator engineers observed that workflows were not being processed by the Orchestrator workflow engine component. They began investigating and quickly identified the root cause: a recent system dependency change.

12:09 PM CEST: Our engineers prepared a rollback of the problematic change.

01:00 PM CEST: The rollback was successful. The workflow engine component began processing queued workflows again.

03:15 PM CEST: Orchestrator engineers confirmed the queue had cleared.

Root Cause

A dependency that checks whether the Workflow Engine component is under undue load was updated. Due to a change in the behavior of this dependency, the Workflow Builder component was unable to receive accurate load data from the Workflow Engine component. As a consequence, Orchestrator’s fail-safe mechanism activated to prevent data loss, which stopped the Workflow Engine from executing any further workflows.
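
For readers unfamiliar with this kind of safeguard, the fail-safe behavior described above can be illustrated with a minimal sketch. This is not Orchestrator's actual implementation. The names `LoadReading` and `should_dispatch` and the 0.8 threshold are hypothetical and chosen only for illustration. The sketch shows a "fail-closed" gate: when load data is missing or implausible, no new work is dispatched, so workflows stay queued rather than being lost.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoadReading:
    """Hypothetical load report; utilization is a fraction in [0.0, 1.0]."""
    utilization: float


def should_dispatch(
    read_load: Callable[[], Optional[LoadReading]],
    max_utilization: float = 0.8,  # illustrative threshold, not a real setting
) -> bool:
    """Return True only when a trustworthy reading shows spare capacity."""
    try:
        reading = read_load()
    except Exception:
        # The load check itself failed: fail closed and keep work queued.
        return False
    if reading is None or not (0.0 <= reading.utilization <= 1.0):
        # Missing or implausible data is treated like an overload.
        return False
    return reading.utilization <= max_utilization
```

With a gate like this, a dependency change that makes every load reading unavailable or invalid would pause all dispatching until the change is reverted, which matches the symptom of workflows queuing up without being executed.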

Actions to Prevent Recurrence

To prevent similar incidents in the future, we will improve our alerting around error logs. Additionally, we have identified further improvements to our alerting system that will notify us earlier when workflows are not being processed. We will also speed up our release rollback process so that changes can be reverted faster.
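
As a rough illustration of what an alert for workflows not being processed could check, here is a minimal sketch. It is an assumption for illustration only, not the alerting rule Phrase uses. The function name `is_stalled` and the 300-second idle limit are hypothetical. The idea is to alert when work is waiting in the queue but nothing has completed for longer than an allowed idle period.

```python
def is_stalled(
    queue_depth: int,
    seconds_since_last_completion: float,
    max_idle_seconds: float = 300.0,  # illustrative limit, not a real setting
) -> bool:
    """Return True when work is queued but nothing has completed recently."""
    has_backlog = queue_depth > 0
    idle_too_long = seconds_since_last_completion > max_idle_seconds
    return has_backlog and idle_too_long
```

Requiring a backlog as well as an idle period keeps the check from firing during quiet periods when no workflows are waiting.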

Posted Oct 14, 2025 - 13:57 CEST

Resolved

This incident has been resolved.
Posted Oct 10, 2025 - 15:34 CEST

Monitoring

Recovery is in progress across both EU and US regions, and the queues are currently being processed.
Customers might still experience delayed workflow executions while the backlog is cleared. We are closely monitoring the situation to ensure full stability.
Posted Oct 10, 2025 - 13:21 CEST

Investigating

Phrase Orchestrator (EU & US) has been experiencing degraded performance. The engineering team is investigating the issue. We apologize for any inconvenience caused.
Posted Oct 10, 2025 - 12:52 CEST
This incident affected: Phrase Orchestrator (EU) (Legacy Workflow Engine) and Phrase Orchestrator (US) (Legacy Workflow Engine).