Platform: US - Slow pages with intermittent HTTP failures

Incident Report for Delinea

Postmortem

Incident Overview

On July 18, 2025, between 12:40 PM and 2:37 PM CT, customers in the US region experienced degraded Platform performance, including slow-loading pages and intermittent HTTP failures. The issue was confined to the US region, with services in other regions operating normally.

The degradation was caused by CPU resource starvation, which impacted some Platform services and contributed to the observed latency and availability issues.

Root Cause

A background service responsible for processing audit session data unexpectedly scaled up and consumed a disproportionate share of cluster resources. This led to resource contention within the cluster, affecting the performance of other services that rely on shared capacity. As a result, some Platform services experienced slow responses and intermittent failures.

Resolution

The issue was mitigated by reducing the resources allocated to the audit workload, which released significant CPU capacity back to the cluster. Service performance improved shortly afterward and returned to normal by 2:37 PM CT.

Preventive Actions

To prevent recurrence, the following actions are in progress:

  • The workload’s resource limits will be reviewed and adjusted to meet operational needs.
  • Resource quotas are being tested to ensure that no single workload can consume excessive resources, safeguarding overall cluster health.
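
As an illustration of the quota idea in the list above, the following is a minimal sketch of how a per-workload CPU quota could cap a scale-up request. It is not the configuration used on the Platform: the workload name, the `Workload` type, the function names, and all numbers (cluster size, quota fraction, CPU per replica) are hypothetical and chosen only to show the principle.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a scalable background workload."""
    name: str
    cpu_per_replica: float  # CPU cores requested by each replica (illustrative)
    replicas: int


def max_replicas_under_quota(workload: Workload, cluster_cpu: float,
                             quota_fraction: float) -> int:
    """Largest replica count whose total CPU stays within the workload's quota."""
    quota_cpu = cluster_cpu * quota_fraction
    return int(quota_cpu // workload.cpu_per_replica)


def admit_scale_up(workload: Workload, requested_replicas: int,
                   cluster_cpu: float, quota_fraction: float) -> int:
    """Return the replica count actually granted: the request, capped by the quota."""
    cap = max_replicas_under_quota(workload, cluster_cpu, quota_fraction)
    return min(requested_replicas, cap)


# Assumed numbers: a 64-core cluster where one workload may use at most 25%.
audit = Workload(name="audit-session-processor", cpu_per_replica=2.0, replicas=3)
granted = admit_scale_up(audit, requested_replicas=30,
                         cluster_cpu=64.0, quota_fraction=0.25)
print(granted)  # 8: the request for 30 replicas is capped at 16 cores / 2 cores each
```

With a cap of this kind, an unexpected scale-up is bounded at the workload's own share of capacity, so other services sharing the cluster keep the CPU they need.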
Posted Jul 29, 2025 - 12:28 EDT

Resolved

This incident has been resolved.
Posted Jul 18, 2025 - 15:37 EDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Jul 18, 2025 - 14:58 EDT

Investigating

A subset of Platform customers in the US region are experiencing elevated latency and intermittent HTTP failures. Customers in other regions are not impacted. We are investigating this issue and will share the next update within an hour or as events warrant.
If you have any questions or require further assistance, please contact our support team at https://support.delinea.com.
Posted Jul 18, 2025 - 14:47 EDT
This incident affected: US (Platform).