We would like to share more details about the events that occurred on January 27, 2026, between approximately 10:00 AM and 5:30 PM UTC (6:30 PM CET), which led to a performance disruption of the Phrase TMS (EU) Project Management component.
During this time, some users experienced slow or unresponsive behavior when creating or editing projects. The issue occurred because the LQA service, which the Project Management component uses internally, ran out of available database connections.
Jan 27, 2026 @ ~10:00 AM
First customer reports indicate slow or unresponsive project creation and editing in the EU region.
10:54 AM
Automated alert triggered due to a high number of active backend sessions in the EU production environment.
11:00–11:30 AM
Engineering investigation begins. Logs indicate that the LQA service is unable to obtain database connections, with repeated timeouts when requesting connections from the application’s connection pool.
11:40 AM
Application logs show “Too many connections” errors from the underlying database. Some service instances restart as a result.
12:00–3:00 PM
Initial mitigation steps are taken. The issue improves, but intermittent errors persist.
3:30 PM
Decision made to scale the underlying production database instance to a larger size and adjust connection limits accordingly.
~5:30 PM
Database scaling completed. Error rates drop and no further “Too many connections” errors are observed. System performance stabilizes.
Incident marked as resolved after continued monitoring confirmed stable behavior.
The disruption was caused by exhaustion of available database connections in the LQA service’s underlying production database.
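The LQA service uses a connection pool to communicate with the database. Under increased load, the configured connection limits on both the application side and the database side were insufficient. As more requests were processed, all available database connections were consumed. Once the limit was reached, requests for a connection from the application’s pool timed out, the database rejected further connections with “Too many connections” errors, and some service instances restarted.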
Although the database itself was operational, the maximum number of allowed concurrent connections was too low for the actual usage patterns in production. Additionally, the size of the database instance limited how many connections could be supported safely.
This combination led to a bottleneck in the LQA service, which in turn affected the Project Management component in the EU region.
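The failure mode described above can be illustrated with a small simulation. This is a minimal sketch and not the LQA service's actual code: `BoundedPool`, `PoolTimeout`, and `simulate` are hypothetical names, and the pool size, hold time, and timeout are made-up values chosen only to show how requests start timing out once every connection is in use.

```python
import queue
import threading
import time
from typing import Dict


class PoolTimeout(Exception):
    """Raised when no connection becomes free within the timeout."""


class BoundedPool:
    """Toy connection pool (hypothetical): at most `max_size` connections in use."""

    def __init__(self, max_size: int) -> None:
        self._free: "queue.Queue[int]" = queue.Queue()
        for conn_id in range(max_size):
            self._free.put(conn_id)

    def acquire(self, timeout: float) -> int:
        # Block until a connection is free, or give up after `timeout` seconds.
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            raise PoolTimeout("no free connection within timeout") from None

    def release(self, conn_id: int) -> None:
        self._free.put(conn_id)


def simulate(pool_size: int, requests: int, hold_s: float, timeout_s: float) -> Dict[str, int]:
    """Run `requests` concurrent workers against one pool and count the outcomes."""
    pool = BoundedPool(pool_size)
    counts = {"ok": 0, "timed_out": 0}
    lock = threading.Lock()

    def worker() -> None:
        try:
            conn = pool.acquire(timeout_s)
        except PoolTimeout:
            with lock:
                counts["timed_out"] += 1
            return
        try:
            time.sleep(hold_s)  # stand-in for a database query
        finally:
            pool.release(conn)
        with lock:
            counts["ok"] += 1

    threads = [threading.Thread(target=worker) for _ in range(requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts
```

With these illustrative numbers, `simulate(pool_size=2, requests=6, hold_s=0.5, timeout_s=0.1)` leaves four of the six requests timed out, because the two connections are held longer than the waiters are willing to wait. Raising `pool_size` to 6 lets all six succeed, which mirrors why raising the connection limits relieved the bottleneck.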
To reduce the likelihood of similar incidents in the future, we are implementing the following measures:
* Active connections
* Slow queries
* Resource utilization
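As an illustration of how signals such as these might feed an alert, the following is a minimal sketch under stated assumptions. The names `DbSnapshot`, `Thresholds`, and `evaluate`, and every threshold value, are hypothetical and do not describe the monitoring configuration actually in use.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DbSnapshot:
    """One point-in-time reading of the database (hypothetical shape)."""
    active_connections: int
    max_connections: int
    slow_queries_per_min: int
    cpu_utilization: float  # fraction between 0.0 and 1.0


@dataclass(frozen=True)
class Thresholds:
    """Illustrative alert thresholds; real values would need tuning."""
    connection_ratio: float = 0.8
    slow_queries_per_min: int = 10
    cpu_utilization: float = 0.85


def evaluate(snapshot: DbSnapshot, thresholds: Thresholds = Thresholds()) -> List[str]:
    """Return a human-readable alert for each threshold the snapshot reaches."""
    alerts: List[str] = []
    ratio = snapshot.active_connections / snapshot.max_connections
    if ratio >= thresholds.connection_ratio:
        alerts.append(f"active connections at {ratio:.0%} of limit")
    if snapshot.slow_queries_per_min >= thresholds.slow_queries_per_min:
        alerts.append(f"{snapshot.slow_queries_per_min} slow queries per minute")
    if snapshot.cpu_utilization >= thresholds.cpu_utilization:
        alerts.append(f"CPU utilization at {snapshot.cpu_utilization:.0%}")
    return alerts
```

Alerting on the ratio of active connections to the configured limit, rather than on an absolute count, keeps the check meaningful after the limit itself changes, for example when the database instance is resized.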