We would like to share details about an incident that made the RawMT UI portal unavailable between 4:15 PM and 6:03 PM UTC on January 15, 2026, a period of 1 hour and 48 minutes. During this time, users were unable to access the portal. The disruption was caused by a capacity limit being exceeded in our routing infrastructure. This post-mortem outlines what occurred and the steps we are taking to prevent similar issues in the future.
Jan 15, 2026 @ 4:15 PM UTC – A high-severity alert was triggered as the RawMT UI portal became inaccessible to users.
Jan 15, 2026 @ 4:20–5:00 PM UTC – Engineering teams traced the issue to the routing layer reaching a fixed platform limit, which prevented service traffic from being directed correctly.
Jan 15, 2026 @ 5:03 PM UTC – A mitigation was applied, re-routing affected services through alternate infrastructure.
Jan 15, 2026 @ 6:03 PM UTC – Service was confirmed fully restored and the incident was marked as stable.
The issue occurred when the platform’s traffic routing infrastructure reached a fixed limit on the number of active service associations it can manage. Once this limit was exceeded, further configuration updates to the routing layer failed, so traffic could not be routed correctly to certain services.
Although application services remained operational, they became unreachable to users because there were no valid routing paths available. This resulted in a complete outage of the RawMT UI portal.
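The failure mode described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the names `RoutingLayer`, `associate`, `resolve`, and `should_alert`, the cap of 3, and the 80% alert threshold are chosen for illustration only. The actual platform limit and its API are not stated in this report. The sketch shows how a healthy service becomes unreachable when its route cannot be registered, and what one possible headroom check could look like.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class AssociationLimitExceeded(Exception):
    """Raised when a new association would exceed the fixed cap."""


@dataclass
class RoutingLayer:
    # Hypothetical cap; the real platform limit is not given in the report.
    max_associations: int
    associations: Dict[str, str] = field(default_factory=dict)

    def associate(self, service: str, backend: str) -> None:
        """Register or update a route. New services are rejected at the cap."""
        is_new = service not in self.associations
        if is_new and len(self.associations) >= self.max_associations:
            raise AssociationLimitExceeded(service)
        self.associations[service] = backend

    def resolve(self, service: str) -> Optional[str]:
        """Return the backend for a service, or None if no route exists."""
        return self.associations.get(service)

    def utilization(self) -> float:
        """Fraction of the fixed cap currently in use."""
        return len(self.associations) / self.max_associations


def should_alert(layer: RoutingLayer, threshold: float = 0.8) -> bool:
    """Assumed early-warning rule: alert once usage reaches the threshold."""
    return layer.utilization() >= threshold


# Fill the routing layer to its (illustrative) cap of 3 associations.
layer = RoutingLayer(max_associations=3)
for name in ("svc-a", "svc-b", "svc-c"):
    layer.associate(name, name + "-backend")

# The portal service itself may be healthy, but its route cannot be added.
try:
    layer.associate("portal-ui", "portal-ui-backend")
except AssociationLimitExceeded:
    print("update rejected: association limit reached")

print(layer.resolve("portal-ui"))  # no route, so the service is unreachable
print(should_alert(layer))         # usage is at 100% of the cap
```

In this model the backend behind `portal-ui` is never touched, which mirrors the observation that application services stayed operational while users could not reach them. A utilization check like `should_alert` is one way such a limit could be surfaced before it is hit; it is shown as an assumption, not as a measure confirmed by this report.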