October 10, 2025
We would like to share more details about the incident that affected Phrase Orchestrator (EU & US) between 10:05 AM CEST and 01:00 PM CEST on October 10, 2025, which delayed executions of non-scheduled workflows, and about what Phrase engineers are doing to prevent it from recurring.
10:05 AM CEST: Our logging system began showing repeated errors after a change to Orchestrator’s system dependencies was introduced.
11:54 AM CEST: Through our monitoring system, Orchestrator engineers observed that workflows were not being processed by the Orchestrator workflow engine component. They began investigating and quickly identified the root cause: a recent system dependency change.
12:09 PM CEST: Our engineers prepared a rollback of the problematic change.
01:00 PM CEST: The rollback was successful. The workflow engine component began processing queued workflows again.
03:15 PM CEST: Orchestrator engineers confirmed the queue had cleared.
An update was made to a dependency that checks whether the Workflow Engine component is under undue load. Because the behavior of this dependency changed, the Workflow Builder component was unable to receive accurate load data from the Workflow Engine component. As a consequence, Orchestrator’s fail-safe mechanism activated to prevent data loss, which stopped any further workflows from executing in the Workflow Engine.
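The fail-safe behavior described above can be illustrated with a minimal sketch of a "fail closed" load gate. This is not Orchestrator's actual implementation: the names `LoadReport`, `may_dispatch`, and `read_load`, as well as the 0.8 threshold, are hypothetical and chosen only for illustration. The point it shows is that when the load reading is missing or unusable, the gate blocks dispatch, so workflows remain queued instead of being lost.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoadReport:
    """Hypothetical load reading; utilization is a fraction in [0, 1]."""
    utilization: float


def may_dispatch(read_load: Callable[[], Optional[LoadReport]],
                 max_utilization: float = 0.8) -> bool:
    """Return True only when a valid load reading shows spare capacity.

    Fails closed: a failed, missing, or out-of-range reading blocks dispatch,
    so work stays queued rather than going to an engine in an unknown state.
    """
    try:
        report = read_load()
    except Exception:
        return False  # the load check itself failed
    if report is None or not (0.0 <= report.utilization <= 1.0):
        return False  # the reading is missing or implausible
    return report.utilization < max_utilization
```

With a gate like this, a dependency change that breaks `read_load` has the same effect as a permanently overloaded engine: nothing is dispatched until the reading becomes trustworthy again, which matches the queue build-up seen during the incident.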
To prevent similar incidents in the future, we will improve our alerting around error logs. Additionally, we have identified further improvements for our alerting system that will notify us earlier if workflows are not being processed. We will also speed up the process of rolling back releases so that we can revert changes faster.
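As an illustration of the kind of check that can flag unprocessed workflows earlier, the sketch below treats a queue as stalled when work is waiting but nothing has completed recently. It is a simplified example under stated assumptions, not a description of our alerting system: the function name `queue_is_stalled`, its parameters, and the five-minute default are all hypothetical.

```python
from datetime import datetime, timedelta


def queue_is_stalled(queued: int,
                     last_completed_at: datetime,
                     now: datetime,
                     max_idle: timedelta = timedelta(minutes=5)) -> bool:
    """Return True when workflows are waiting but none finished recently.

    An empty queue is never considered stalled, so quiet periods with no
    submitted workflows do not raise an alert.
    """
    if queued == 0:
        return False
    return now - last_completed_at > max_idle
```

A check of this shape alerts on the symptom (no progress while work is queued) rather than on a specific error message, so it does not depend on knowing in advance which component will fail.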