A subset of Platform customers in the US region experienced issues logging in to their tenants. Symptoms included login timeouts and login failures with a 504 error. Customers in other regions were not affected.
Start of Impact: May 13, 2025, 7:50 AM ET
End of Impact: May 13, 2025, 8:23 AM ET
A routine update meant to improve backend systems unexpectedly caused a slowdown in one of our login-related caching services. During the update, a process designed to refresh user permissions ran slower than expected, putting extra strain on the caching service. This, in turn, delayed the caching service's responses to the other services that rely on it. As a result, some users experienced login timeouts or failures during this period.
To prevent a recurrence of this issue, we are taking the following actions:
Improving how we manage system resources during updates to avoid similar disruptions.
Introducing test cases that specifically cover this edge case, allowing us to identify potential issues before deployment.
Enhancing our monitoring to detect login problems sooner and respond more quickly.