Engineering the Delinea SaaS platform for near-perfect uptime

Tony Goulding
Every second you can’t access your critical systems can equate to lost revenue, diminished customer trust, and cybersecurity risks.
Achieving a cloud platform uptime of 99.995% isn't merely about setting ambitious goals. It requires architectural decisions and operational practices that make such reliability possible.
Because we know how important uptime is to our customers, we’ve engineered the Delinea Platform to meet these demanding criteria. We believe a true cloud-native, global-scale application must be bulletproof for mission-critical workloads.
What's at stake?
For security leaders, high uptime means one less worry: your critical identity security controls remain up even during cyber crises or peak demand, reducing risk windows. IT teams won’t be paged at midnight for system outages, freeing them to focus on strategic projects instead of recovery efforts. For CIOs and business stakeholders, consistent uptime helps ensure that your revenue-generating services and users’ productivity are not interrupted by platform downtime.
Even brief downtime can lead to significant financial losses and reputational damage. If your customers can’t count on your online services, they may choose to take their business elsewhere. If your employees or partners grow frustrated with the availability of critical systems, they may take their skills elsewhere, leaving you unable to run your business effectively.
High-profile service failures in recent years highlight the critical importance of robust uptime commitments for IT infrastructure. For example, early in 2025, a large cloud data center experienced significant disruption due to a network configuration error within a zone. This incident affected many businesses, leading to operational disruptions and financial losses.
Importantly, not all downtime originates from infrastructure outages. Failures at the identity and access layer can be equally disruptive.
For example, Tesla experienced an insider threat when a disgruntled employee used privileged access to make unauthorized changes to the company's manufacturing systems. The result was delays, operational chaos, and financial loss.
In addition, Singapore's SingHealth suffered a breach that exposed the personal data of 1.5 million patients. Investigations pointed to delayed patching and weak access controls as contributing factors. These events highlight the risk of insufficient oversight over privileged access, particularly when sensitive systems are exposed.
Given the sensitive nature of identity and access services, Service Level Agreements (SLAs) regarding uptime are critical.
Other vendors in this space offer 99.95% or 99.99% uptime, but these are not necessarily contractual SLAs. Even if they are contractual agreements, many vendors exclude scheduled maintenance or planned downtime for upgrades or customer-side misconfigurations. In practice, your service could be offline for over 170 minutes per year and still be considered "within SLA."
We believe that’s not good enough.
Delinea is setting a new bar for cloud-native identity security. We are committed to achieving and maintaining 99.995% uptime for the Delinea Platform, equivalent to about 26 minutes of allowable downtime per year. That SLA covers situations like cloud provider outages and activities like upgrades and patching, which Delinea performs without scheduled maintenance windows. That commitment is nearly on par with the highest Tier IV data center standards (fully fault-tolerant systems).
Even small increases in SLA percentages drastically reduce allowable downtime
Uptime SLA | Monthly Downtime (30d) | Yearly Downtime (365d) |
99.95% | 21.60 minutes | 262.80 minutes (4h 22.8m) |
99.99% | 4.32 minutes | 52.56 minutes |
99.995% | 2.16 minutes | 26.28 minutes |
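The figures in the table follow from simple arithmetic: the downtime budget is the length of the period multiplied by the fraction of time the SLA allows the service to be unavailable. The short sketch below reproduces that calculation; the function name is illustrative and not part of any Delinea tooling.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(sla_percent: float, days: int) -> float:
    """Return the downtime budget, in minutes, for an uptime SLA over a period."""
    unavailable_fraction = 1 - sla_percent / 100
    return days * MINUTES_PER_DAY * unavailable_fraction


# Recompute the rows of the table for a 30-day month and a 365-day year.
for sla in (99.95, 99.99, 99.995):
    monthly = allowed_downtime_minutes(sla, 30)
    yearly = allowed_downtime_minutes(sla, 365)
    print(f"{sla}% | {monthly:.2f} min/month | {yearly:.2f} min/year")
```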
How does Delinea achieve 99.995% uptime for Delinea Platform?
Delinea achieves this level of reliability and dependable service through a combination of advanced architectural designs, proactive maintenance strategies, and continuous monitoring.
Delinea has been at the forefront of delivering Privileged Access Management (PAM) solutions in the cloud since 2015, when it introduced the industry's first SaaS-based privileged access service vault. More than a decade of SaaS design and operational experience has allowed us to evolve and mature our design and management practices. As a result, our customers can rely on secure and reliable identity security services that few competitors match.
Engineered for resilience with a containerized, microservices architecture
Delinea's architectural choices compound to support extreme reliability. The Delinea Platform is built using a containerized, distributed, microservices architecture (e.g., redundant servers, multiple data centers, automatic failover), meaning there’s no single point of failure. If one component or site fails, others instantly take over to keep services running.
To calculate Delinea’s end-to-end SLA, we combine the uptime values of all Delinea Platform components.
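The article does not spell out how the component values are combined. As a rough sketch, a common reliability model multiplies the availabilities of components that must all be up (a serial chain), and treats redundant replicas as unavailable only when every replica is down at once, assuming failures are independent. The component figures below are hypothetical and are not Delinea's published numbers.

```python
from math import prod


def serial_availability(components):
    """Availability of a chain in which every component must be up."""
    return prod(components)


def redundant_availability(replicas):
    """Availability of a group that is up if at least one replica is up.

    Assumes replica failures are independent of one another.
    """
    return 1 - prod(1 - a for a in replicas)


# Hypothetical component availabilities, for illustration only.
single_region = serial_availability([0.9999, 0.9999, 0.9995])
two_regions = redundant_availability([single_region, single_region])

print(f"single region:      {single_region:.6f}")
print(f"two active regions: {two_regions:.8f}")
```

Under this model a serial chain is always less available than its weakest component, which is why redundancy across sites matters so much for a high end-to-end figure.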
To achieve these commitments, Delinea has engineered the identity security platform using several best-in-class approaches, including:
Containerized microservices
With microservices, faults in one service (e.g., a malfunction in a reporting engine or audit log collector) don't cascade across the Delinea Platform.
Combined with container orchestration, this isolation leads to higher availability and better uptime, critical for identity security tools that must operate 24/7 with minimal service interruption.
Plus, containerized services scale horizontally. When usage spikes, such as when you need to onboard hundreds of users, sync entitlements from cloud directories, or run bulk session recordings, the platform can scale up relevant services without affecting others. This delivers consistent performance even under load.
Containers are immutable and ephemeral by design. That aligns well with zero trust principles: the platform can auto-heal, rotate services, and avoid persistent, long-lived attack surfaces. Delinea can also apply granular RBAC and network segmentation at the service level, reducing lateral movement risks within the platform itself. The architecture that enables faster feature delivery and innovation also allows security issues to be addressed in hours, not weeks or months.
Delinea's architecture benefits customers deploying workloads in Kubernetes, using DevOps pipelines, or building internal tools via APIs. Its services are modular and exposed via well-documented APIs, making it easier to plug identity security controls directly into CI/CD, cloud environments, and automation systems.
Active-active configuration
By adopting an active-active configuration, the Delinea Platform maintains multiple instances of our services operating concurrently across various locations. This setup supports immediate failover; if one instance encounters an issue, others seamlessly take over, maintaining continuous service availability. Such redundancy is critical in preventing disruptions that could impact users and operations.
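To make the idea concrete, here is a minimal sketch of active-active routing: every instance serves traffic in normal operation, and an instance marked unhealthy is simply skipped. The class names and region names are hypothetical, and this is not a description of Delinea's actual traffic management.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Instance:
    name: str
    healthy: bool = True


class ActiveActiveRouter:
    """Round-robin across all instances, skipping any marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._ring = cycle(self.instances)

    def route(self) -> Instance:
        # Try each instance at most once per request.
        for _ in range(len(self.instances)):
            candidate = next(self._ring)
            if candidate.healthy:
                return candidate
        raise RuntimeError("no healthy instance available")


# Hypothetical region names, for illustration only.
east = Instance("region-east")
west = Instance("region-west")
router = ActiveActiveRouter([east, west])

print([router.route().name for _ in range(4)])  # traffic alternates between both
east.healthy = False                            # simulate an outage in one region
print([router.route().name for _ in range(3)])  # all traffic now goes to region-west
```

Because both instances already carry production traffic, failover needs no cold start; the router just stops selecting the failed instance.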
Geo-replication
Geo-replication further enhances this resilience by distributing data across multiple geographic regions. It's worth noting that service providers can't reasonably quote higher than 99.95% uptime unless they explicitly engineer their service to support multiple regions.
This approach safeguards against regional outages and brings data closer to users, reducing latency and improving access speeds. For example, Delinea's deployment of the Delinea Platform spans seven geographies, with multiple regions and clusters within each region. This extensive distribution allows the system to reroute traffic to another operational region if an entire region faces an unexpected event, thereby upholding the stringent uptime SLA.
Zero-downtime upgrades and maintenance
In a rapidly evolving industry like cybersecurity, system updates happen frequently to incorporate new features and functionality, address vulnerabilities, and enhance system performance.
However, deploying updates can be risky for service continuity. If you must take your PAM systems offline to make updates, your workforce may not be able to access systems they need, service accounts may have trouble authenticating, or integrations may break.
Addressing this concern requires sophisticated deployment pipelines that support zero-downtime upgrades, so that customers experience no service interruptions during the update process.
Microservices allow the Delinea Platform's development teams to deploy updates independently to different parts of the Delinea Platform. Critical new features—like updated policy controls, risk scoring, or integrations—can be pushed to production without waiting for massive version upgrades.
Our release methodology incorporates canary releases with rollback capabilities. In this approach, new updates are initially deployed to a small subset of microservices or users, allowing the Delinea Operations team to monitor the performance and stability of the changes in a controlled environment before releasing them to customers. If any issues are detected, the system can swiftly roll back to the previous stable version, mitigating potential impact.
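The decision at the heart of a canary release can be sketched in a few lines: compare the canary's error rate with the stable baseline, then promote, roll back, or keep waiting for more traffic. The thresholds, names, and traffic figures below are assumptions chosen for illustration, not Delinea's actual release criteria.

```python
from dataclasses import dataclass


@dataclass
class TrafficSample:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def decide(canary: TrafficSample, baseline: TrafficSample,
           tolerance: float = 0.001, min_requests: int = 1000) -> str:
    """Return 'promote', 'rollback', or 'wait' for a canary deployment."""
    if canary.requests < min_requests:
        return "wait"  # not enough canary traffic observed to judge
    if canary.error_rate > baseline.error_rate + tolerance:
        return "rollback"  # canary is measurably worse than the stable version
    return "promote"


# Hypothetical traffic figures.
stable = TrafficSample(requests=95_000, errors=40)
print(decide(TrafficSample(requests=200, errors=0), stable))
print(decide(TrafficSample(requests=5_000, errors=3), stable))
print(decide(TrafficSample(requests=5_000, errors=50), stable))
```

A real pipeline would typically look at more signals than error rate (latency, saturation, business metrics), but the promote-or-rollback gate has the same shape.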
This cautious deployment strategy enables Delinea to introduce continuous micro-releases and system-level upgrades for the Delinea Platform without disrupting service or requiring professional services support.
Proactive system management via monitoring, observability, and response
Attaining high availability also includes continuous system monitoring and rapid incident response.
Delinea uses advanced observability tools that provide real-time insights into system performance and health. These tools continuously evaluate and monitor top use cases to detect anomalies or potential issues before they escalate into significant problems.
Key elements of Delinea's proactive approach to system management for the Delinea Platform include:
- Real-time system health monitoring for key service endpoints, workloads, and infrastructure.
- Use case–driven observability, continuously evaluating top workflows to detect anomalies.
- Automated alerting with thresholds tuned to detect performance regressions and failure patterns early.
- Root cause analysis and feedback loops to reduce recurrence and optimize platform reliability.
Each customer has a dedicated, logically isolated tenant. We monitor each tenant individually, enabling us to report uptime metrics per tenant.
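As an illustration of per-tenant reporting, the sketch below aggregates synthetic health-check results by tenant and flags any tenant whose measured uptime falls below the 99.995% target. The probe format, tenant names, and function name are hypothetical; the article does not describe how Delinea's monitoring stores or computes these metrics.

```python
from collections import defaultdict

SLA_TARGET = 99.995  # percent, the uptime commitment discussed in this article


def uptime_by_tenant(probes):
    """Compute uptime percentage per tenant.

    probes: iterable of (tenant_id, succeeded) tuples from health checks.
    """
    totals = defaultdict(lambda: [0, 0])  # tenant -> [successes, attempts]
    for tenant_id, succeeded in probes:
        totals[tenant_id][1] += 1
        if succeeded:
            totals[tenant_id][0] += 1
    return {tenant: 100 * ok / n for tenant, (ok, n) in totals.items()}


# Hypothetical probe data: tenant-b has one failed check out of 10,000.
probes = (
    [("tenant-a", True)] * 10_000
    + [("tenant-b", True)] * 9_999
    + [("tenant-b", False)]
)
report = uptime_by_tenant(probes)
below_target = [tenant for tenant, pct in report.items() if pct < SLA_TARGET]

print(report)
print("below target:", below_target)
```

Measuring each tenant separately matters because a platform-wide average can hide an outage that affects only a handful of customers.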
Delinea's dedicated Site Reliability Engineering (SRE) team complements these technologies. Their mission is to reduce the average time to detect and mitigate issues. They operate around the clock to monitor system performance and proactively address any emerging issues before they cause problems for customers.
Compare the Delinea Platform architecture to legacy SaaS
Metric | Legacy SaaS architecture | Delinea Platform architecture |
Uptime | <= 99.95% | 99.995% |
Downtime during upgrades | 15+ minutes | None |
Scalability | Vertical with downtime; horizontal with configuration | Auto-scale vertically and horizontally |
Code to production | 4+ weeks | 30 minutes average |
Microsoft Azure contributes to Delinea’s high availability
Thus far, we've discussed how the architecture and capabilities built into the Delinea Platform, along with the practices used to operate it, maintain our high uptime commitment.
Delinea Platform customers also benefit from the strategic use of Microsoft Azure's cloud platform to host the software and related components.
We leverage several essential capabilities that Azure provides to enhance the resilience and availability of the Delinea Platform—concepts that many web applications don’t use.
Availability zones and redundancy
Availability zones are physically separate locations within an Azure region, each with one or more data centers equipped with independent power, cooling, and networking. By deploying services across multiple availability zones, the Delinea Platform can withstand data center failures, enhance fault tolerance, and maintain high availability. This zonal deployment supports automatic failover and load distribution, both of which are critical to sustaining uptime.
Additionally, each Delinea Platform instance is deployed in two Azure regions, both operating in an active/active capacity.
Geo-redundant storage (GRS)
To safeguard against regional outages, the Delinea Platform also benefits from Azure GRS, which replicates data asynchronously to a secondary region hundreds of miles away from the primary location. This geo-replication allows data to remain accessible and intact in the event of a regional disruption, supporting our commitment to data durability and availability.
Active geo-replication for databases
Azure provides Active Geo-Replication for database services to create readable secondary databases in different regions. In the event of a primary database failure, the system can quickly fail over to a secondary database, minimizing downtime and providing continuous service availability.
Regional hosting flexibility
With Azure's global infrastructure, Delinea can rapidly expand its Delinea Platform hosting capabilities to new regions. This allows our customers to experience low-latency, reliable access to services, regardless of geographic location. In addition to helping us meet uptime goals, the ability to deploy services closer to your end users enhances system performance and provides a seamless user experience.
Not all uptime claims are equal
While many identity and access security vendors promote high availability, it's critical to dig into the fine print.
Questions to ask any vendor about their uptime commitments
1. Is your uptime included in your MSLA?
Delinea includes 99.995% uptime for the Delinea Platform in our MSLA. Other vendors do not.
2. What’s your uptime track record?
Not all vendors publish their historical uptime. Delinea believes that transparency drives accountability and a commitment to excellence.
In 2024, Delinea executed all upgrades and patches for the Delinea Platform without requiring planned downtime, updated 28 global Kubernetes clusters within one hour after a critical vulnerability was disclosed, and deployed security updates almost daily—all without disrupting customer access.
3. What scenarios are excluded from the SLA?
Delinea holds itself accountable for the Delinea Platform even when the underlying cloud infrastructure fails. Additionally, Delinea does not schedule planned downtime for updates to the Delinea Platform, unlike other vendors who regularly schedule downtime to perform updates.
4. How do you isolate tenants?
The Delinea Platform offers per-tenant isolation with individual encryption keys, uptime metrics, and observability at the tenant level.
5. What maintenance scheduling requirements do you have?
Unlike others who often push upgrades into vendor-controlled maintenance windows, Delinea imposes no maintenance scheduling or rescheduling requirements for the Delinea Platform.
6. How do you back up the PAM vault for resiliency?
Delinea is the only PAM SaaS vendor offering near real-time vault backups to customer infrastructure, what we call “Resilient Secrets.”
7. What are your compensation models for downtime?
Some vendors offer minimal credits (for example, just a 10% credit for 88 hours of downtime per year) while excluding broad categories of disruption, and often do not provide contractual service credits tied to their SLA. Delinea supports our commitment to uptime with contractual service credits defined in our MSLA. Ask your PAM vendor for their MSLA commitments.
Platform architecture, operational rigor, and transparency matter
Before signing with any vendor, scrutinize the uptime claims and the real-world implications and track records behind them.
Delinea's commitment to a 99.995% uptime SLA for the Delinea Platform is supported by a strategy integrating resilient architectural design, seamless upgrade processes, proactive system management, and a decade of operational expertise.
By implementing active-active configurations, geo-replication, zero-downtime upgrade pipelines, and continuous monitoring with a dedicated SRE team, Delinea is confident in our ability to satisfy our uptime commitments.
