Delinea Cloud Suite Pod34 Service Disruption

Incident Report for Delinea

Postmortem

Incident Overview

On August 7, 2025, customers hosted on Pod 34 experienced a service outage. The pod became intermittently unresponsive, leading to delays and request failures for some customers. Other pods and services were unaffected. Service availability was restored by 9:10 AM EDT on August 7, 2025.

Timeline of Events

  • 04:19 AM EDT – Operations team received alerts of degraded performance on Pod 34; on-call engineer began investigation.
  • 04:43 AM EDT – Initiated rolling reimage of Pod 34 web servers.
  • 07:55 AM EDT – Restarted Pod 34 database instance.
  • 09:10 AM EDT – Pod 34 stabilized and service availability returned to normal levels.

Root Cause

The disruption occurred when the database on Pod 34 reached its maximum allowed connections, preventing new connections and causing request timeouts. A high volume of requests from one tenant triggered this condition, resulting in performance degradation for other tenants on the same pod (a “noisy neighbor” scenario).
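The mechanism described above can be illustrated with a toy model. This is a minimal sketch, not Delinea's implementation: `ConnectionLimit` and `simulate` are hypothetical names, and a semaphore stands in for the database's maximum-connections setting. It shows one tenant holding every connection slot, a second tenant timing out, and the second tenant succeeding once a held connection is terminated.

```python
import threading


class ConnectionLimit:
    """Toy stand-in for a database's maximum-connections setting (hypothetical)."""

    def __init__(self, max_connections):
        self._slots = threading.BoundedSemaphore(max_connections)

    def acquire(self, timeout):
        """Return True if a connection slot was obtained before the timeout."""
        return self._slots.acquire(timeout=timeout)

    def release(self):
        """Free one connection slot, as when a connection is terminated."""
        self._slots.release()


def simulate(max_connections=3):
    db = ConnectionLimit(max_connections)

    # Tenant A opens connections until the limit is reached, and keeps them.
    held_by_a = 0
    while db.acquire(timeout=0.01):
        held_by_a += 1

    # Tenant B now cannot obtain a connection; its request times out.
    b_connected_during_exhaustion = db.acquire(timeout=0.05)

    # Terminating one of tenant A's connections frees a slot for tenant B.
    db.release()
    b_connected_after_termination = db.acquire(timeout=0.05)

    return held_by_a, b_connected_during_exhaustion, b_connected_after_termination
```

In this model, tenant B's failure has nothing to do with its own traffic: the shared limit is what couples the two tenants.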

Resolution

Active database connections holding transaction locks were terminated, allowing new connections to be established and restoring normal service operation.

Preventive Actions

  • Delinea’s team is engaging directly with the tenant whose activity triggered the high connection volume.
  • We are working on migrating select tenants on Pod 34 to dedicated pods to reduce resource contention.
  • We are testing two permanent mitigation strategies:

    • Server-side: Cache negative UID lookups to reduce the need for new database connections.
    • Agent-side: Debounce redundant and concurrent lookup attempts.
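The server-side item can be sketched as follows. This is an illustration under stated assumptions rather than the mitigation being tested: `NegativeLookupCache`, `lookup_uid`, `query_database`, and the TTL value are all hypothetical. The idea shown is that a UID that was recently not found is answered from memory for a short time, so repeated lookups for it do not open a new database connection.

```python
import threading
import time


class NegativeLookupCache:
    """Remembers UIDs that were recently not found (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._misses = {}  # uid -> time at which the cached miss expires
        self._lock = threading.Lock()

    def is_known_missing(self, uid):
        with self._lock:
            expiry = self._misses.get(uid)
            if expiry is None:
                return False
            if self._clock() >= expiry:
                # The cached miss is stale; forget it so the database is asked again.
                del self._misses[uid]
                return False
            return True

    def remember_missing(self, uid):
        with self._lock:
            self._misses[uid] = self._clock() + self._ttl


def lookup_uid(uid, cache, query_database):
    """Return the record for uid, or None if it does not exist.

    query_database is an assumed callable that returns a record or None.
    It is skipped entirely when the UID is a recent known miss.
    """
    if cache.is_known_missing(uid):
        return None
    record = query_database(uid)
    if record is None:
        cache.remember_missing(uid)
    return record
```

The TTL bounds how long a newly created UID could be reported as missing, so it trades freshness against database load.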
Posted Aug 25, 2025 - 23:59 EDT

Resolved

This incident has been resolved.
Posted Aug 07, 2025 - 11:08 EDT

Monitoring

We have applied mitigation steps to restore service availability and are monitoring the results.
Posted Aug 07, 2025 - 09:32 EDT

Update

We are continuing to investigate this issue.
Posted Aug 07, 2025 - 08:54 EDT

Investigating

We are currently experiencing a service disruption on Cloud Suite Pod34.

We are seeing elevated connection counts and locking occurring on the database. As a result, we are going to restart the database within the next few minutes. This will result in a brief complete outage.

We apologize for any inconvenience this may cause and appreciate your patience as we work to restore normal service.

For any questions or concerns, please reach out to our support team at https://support.delinea.com.
Posted Aug 07, 2025 - 08:39 EDT
This incident affected: US (Privileged Access Service / Cloud Suite).