Degraded Performance of Phrase TMS (EU) Term Base Component

Incident Report for Phrase

Postmortem

Introduction

We would like to share more details about a performance degradation that affected the Term Base service between December 9 and December 15, 2025. During this period, the Term Base service was intermittently unavailable or significantly delayed in processing requests, which disrupted workflows that rely on Term Bases. The engineering team has resolved the issue and is implementing further improvements to prevent similar events in the future.

Timeline

Dec 9, 2025 @ 04:10 CET – Term Base service began experiencing intermittent unavailability, with requests failing or timing out. On-call engineers were alerted and began investigation.

Dec 10, 2025 @ 08:30 CET to Dec 10, 2025 @ 11:40 CET – Degraded performance was observed with the Term Base service. The team initially suspected an internal processing job in the database and paused it. During this time, database load was found to be elevated, and the database was scaled out to support the investigation.

Dec 11, 2025 @ 05:20 CET to Dec 11, 2025 @ 07:40 CET – A partial outage occurred, during which approximately 350 Term Base API requests were not processed. The team identified that a specific pattern of search requests for large Term Bases was responsible for the disruption. A mitigation was deployed to time out slow requests while work on a permanent solution continued.

Dec 12, 2025 @ 15:45 CET – A fix was validated in the test environment and deployed to production.

Dec 15, 2025 @ 19:47 CET – After continued monitoring confirmed stability, the incident was officially marked as resolved.

Root Cause

The performance degradation was caused by a combination of factors:

  • A very high volume of incoming requests targeting the Term Base Search API.
  • These requests included unusually long search texts and targeted large Term Bases.
  • The resulting processing demands led to resource exhaustion on backend servers, affecting availability and response times.

These request patterns overwhelmed the system’s ability to respond efficiently, leading to cascading delays and failed API calls during peak periods.
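For readers who want to watch for a similar pattern in their own integrations, the combination of factors above can be illustrated with a small sliding-window counter. This is only a sketch under stated assumptions, not a description of how Phrase detects such traffic: the `SearchRequest` fields, the `is_high_cost` thresholds, and the `HighCostRateMonitor` class are all hypothetical names and values chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchRequest:
    """Hypothetical view of one Term Base search request."""
    timestamp_s: float       # arrival time in seconds (assumed non-decreasing)
    text_length: int         # length of the search text in characters
    term_base_entries: int   # size of the targeted Term Base


def is_high_cost(req: SearchRequest,
                 max_text_length: int = 2_000,
                 max_entries: int = 100_000) -> bool:
    """Flag requests that combine a long search text with a large Term Base.

    Both thresholds are illustrative assumptions, not values from the report.
    """
    return req.text_length > max_text_length and req.term_base_entries > max_entries


class HighCostRateMonitor:
    """Counts high-cost requests inside a sliding time window."""

    def __init__(self, window_s: float, alert_threshold: int) -> None:
        self._window_s = window_s
        self._alert_threshold = alert_threshold
        self._hits: deque = deque()  # timestamps of recent high-cost requests

    def observe(self, req: SearchRequest) -> bool:
        """Record a request; return True while the window holds too many hits."""
        # Drop hits that have fallen out of the window.
        cutoff = req.timestamp_s - self._window_s
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()
        if is_high_cost(req):
            self._hits.append(req.timestamp_s)
        return len(self._hits) >= self._alert_threshold
```

A caller would feed every incoming request to `observe` and raise an alert when it returns `True`. Requiring both conditions in `is_high_cost` mirrors the report's description of a combination of factors; a real system might instead weight them separately.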

Actions to Prevent Recurrence

  • The logic for handling complex Term Base search requests has been optimized to reduce processing load and improve performance.
  • The engineering team is working on architectural improvements to isolate the impact of expensive requests and improve overall system resilience.
  • Monitoring and alerting thresholds for high-frequency, high-cost request patterns are being refined to allow earlier detection and mitigation.
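
As an illustration of what isolating expensive requests can look like in general, the sketch below combines two common techniques: a cap on concurrent expensive searches (often called a bulkhead) and a per-request timeout like the mitigation mentioned in the timeline. It is a minimal example under assumptions, not Phrase's actual implementation; `SearchBulkhead`, `CapacityExceeded`, and the parameter names are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class CapacityExceeded(Exception):
    """Raised when every slot reserved for expensive searches is busy."""


class SearchBulkhead:
    """Caps concurrent expensive searches and times out slow ones.

    Hypothetical helper: the limits are supplied by the caller and do not
    reflect any values used in production.
    """

    def __init__(self, max_concurrent: int, timeout_s: float) -> None:
        self._max_concurrent = max_concurrent
        self._timeout_s = timeout_s
        self.in_flight = 0
        self.rejected = 0
        self.timed_out = 0

    async def run(self, make_search: Callable[[], Awaitable[T]]) -> T:
        # Fail fast instead of queueing, so expensive requests cannot
        # pile up and starve cheaper traffic.
        if self.in_flight >= self._max_concurrent:
            self.rejected += 1
            raise CapacityExceeded("expensive-search capacity exhausted")
        self.in_flight += 1
        try:
            # Cancel the search if it exceeds the time budget.
            return await asyncio.wait_for(make_search(), self._timeout_s)
        except asyncio.TimeoutError:
            self.timed_out += 1
            raise
        finally:
            self.in_flight -= 1
```

A handler would wrap only the requests it classifies as expensive, for example `await bulkhead.run(lambda: search(term_base_id, text))`, where `search` stands in for whatever coroutine performs the lookup. Rejecting immediately rather than waiting for a free slot is a design choice here: it keeps latency bounded at the cost of returning more errors under load.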
Posted Dec 19, 2025 - 16:29 CET

Resolved

Performance has stabilized following the deployment of the fix, and the incident has been resolved.
Posted Dec 15, 2025 - 11:06 CET

Monitoring

The fix has been deployed, and system metrics will continue to be observed.
Posted Dec 12, 2025 - 15:18 CET

Update

The fix is still in progress.
Posted Dec 12, 2025 - 09:38 CET

Update

Testing and preparation of the mitigation fix for release are still underway.
Posted Dec 11, 2025 - 17:28 CET

Update

A fix is still under active development by our engineering team.
Posted Dec 11, 2025 - 10:50 CET

Identified

The root cause of the issue has been identified, and work on a fix is currently underway.
Posted Dec 10, 2025 - 17:21 CET

Investigating

Some users might still be experiencing issues with term bases in Phrase TMS.
Posted Dec 10, 2025 - 12:42 CET
This incident affected: Phrase TMS (EU) (Term base).