Axelspire

Enterprise PKI Operating Model: How to Run Certificate Operations at Scale

An enterprise PKI operating model defines who is responsible for certificate operations, what they do, when they do it, and how the work is measured. It is distinct from — and prior to — the technology platform that supports those operations. Most organisations build the platform first and assume the operating model will emerge from use. It does not.

This page describes the operational structure required to run certificate management at enterprise scale, the eleven domains that comprise it, and the maturity progression that determines whether a CLM investment delivers its expected value.

What an operating model is, and what it isn't

A certificate management platform — Keyfactor, Venafi, AppViewX, EJBCA, Vault PKI, AWS Private CA — automates issuance. That is one thing. An operating model is the answer to a different question: who is accountable for certificate operations across the organisation, what processes are followed, how decisions get made, what happens when something fails, and how the work scales as the estate grows.

Five components define an operating model:

Governance. Where authority sits, how policy is set, how exceptions are approved, who owns the budget.

Technology. The platforms, integrations, automation patterns, and data flows that execute the work.

People and roles. The team structure, RACI, joiner/leaver processes, escalation paths.

Process. The workflows for issuance, renewal, revocation, incident response, change, onboarding.

Measurement. The KPIs and SLOs that determine whether the model is working.

A platform addresses one of these, technology, and only partially. A complete operating model addresses all five. The gap between platform deployment and a complete operating model is where most enterprises lose the value they paid the vendor for.

The symptom is recognisable: a CLM platform was deployed eighteen months ago, the team can demonstrate it issuing certificates, and certificate-driven incidents are still happening. The platform works. The operations don't.

The operational decomposition

Certificate operations decompose into eleven domains. Each is a discrete operational function with its own workflow, its own failure modes, its own measurement. An operating model defines all of them; a tool covers some of them.

Figure 1. The eleven operational domains of an enterprise PKI operating model. Each is a discrete function with its own workflow, failure modes, and measurement criteria. A platform addresses one of the five layers — technology — and only partially; an operating model addresses all eleven domains across all five layers.

1. Certificate governance

Where PKI policy is set and how it cascades into operations. It covers the steering function, executive-level ownership, exception handling, and the link between cryptographic policy and engineering practice. Without governance, every team makes local decisions that don't compose at the enterprise level.

Full guide: Certificate Governance and the Steering Function

2. Certificate discovery

The gap between certificates issued and certificates deployed is structurally large in any enterprise: typically 20–40% in our experience, sometimes higher. Discovery is the operational function that finds the gap. It covers network scanning, issuance reconciliation, asset-inventory integration, and the distinction between active discovery and passive monitoring.
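
As a rough illustration of issuance reconciliation, the Python sketch below compares fingerprints recorded at issuance with fingerprints observed by a scan. The names `reconcile` and `gap_ratio`, the use of fingerprints as the join key, and the ratio definition are all assumptions made for illustration; this page does not specify how its 20–40% figure is calculated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReconciliationResult:
    issued_not_seen: frozenset  # in issuance records, never observed deployed
    seen_not_issued: frozenset  # observed deployed, absent from issuance records
    matched: frozenset          # present on both sides


def reconcile(issued, observed):
    """Compare two collections of certificate fingerprints (hypothetical join key)."""
    issued, observed = frozenset(issued), frozenset(observed)
    return ReconciliationResult(
        issued_not_seen=issued - observed,
        seen_not_issued=observed - issued,
        matched=issued & observed,
    )


def gap_ratio(result):
    """One possible gap measure: share of known certificates seen on one side only."""
    unmatched = len(result.issued_not_seen) + len(result.seen_not_issued)
    total = unmatched + len(result.matched)
    return unmatched / total if total else 0.0
```

Both unmatched sets matter operationally: certificates that were issued but never observed suggest scan coverage gaps or unused issuance, while certificates that were observed but never recorded point to issuance sources the inventory does not know about.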

Full guide: Certificate Discovery in Practice

3. Certificate issuance workflows

There are two primary issuance patterns: combined (developer-driven, automated end-to-end) and admin-mediated (request, approve, fulfil). Mature organisations run both, because the wrong pattern for a given service profile creates either bottlenecks or compliance gaps. Choosing the pattern per service is part of the operating model.

Full guide: Certificate Issuance Workflows

4. Certificate renewal operations

Renewal is operationally distinct from issuance. Issuance creates a new asset; renewal replaces a live one without service disruption. The renewal buffer — how far before expiry the renewal must complete — is the parameter that determines whether you have a renewal process or a recurring incident.
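
The renewal buffer can be expressed as a simple date calculation. The sketch below is a minimal Python illustration, not a description of any platform's behaviour; the function names, the status labels, and the 30-day buffer in the usage note are assumptions.

```python
from datetime import datetime, timedelta, timezone


def renewal_deadline(not_after: datetime, buffer: timedelta) -> datetime:
    """Latest moment by which the replacement certificate must be live."""
    return not_after - buffer


def renewal_status(not_after: datetime, buffer: timedelta, now: datetime) -> str:
    """Classify a certificate relative to its renewal buffer (labels are hypothetical)."""
    if now >= not_after:
        return "expired"   # the buffer failed; this is now an incident
    if now >= renewal_deadline(not_after, buffer):
        return "overdue"   # inside the buffer; renewal should already be complete
    return "ok"
```

For a certificate expiring on 30 June 2025 (UTC) with an assumed 30-day buffer, `renewal_deadline` returns 31 May 2025, and any check after that date but before expiry reports `overdue`.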

Full guide: Certificate Renewal Operations

5. Certificate revocation

Revocation is the part of the operating model that gets ignored until it can't be. Its blast radius is determined by how many services share the certificate, how revocation is checked downstream (CRL, OCSP, OCSP stapling, none), and whether the operations team can replace before they revoke. The mature pattern is replace-then-revoke; the immature pattern is revoke-then-scramble.
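
The ordering behind replace-then-revoke can be sketched in a few lines of Python. The callbacks `deploy_replacement` and `revoke` are hypothetical stand-ins for whatever deployment and CA interfaces an organisation actually uses; the sketch only illustrates the sequencing.

```python
from typing import Callable, Iterable


def replace_then_revoke(
    cert_id: str,
    services: Iterable[str],
    deploy_replacement: Callable[[str], bool],
    revoke: Callable[[str], None],
) -> dict:
    """Revoke cert_id only after every sharing service runs on a replacement.

    deploy_replacement(service) is a hypothetical callback that returns True
    once the service is confirmed to be serving the new certificate.
    """
    pending = [svc for svc in services if not deploy_replacement(svc)]
    if pending:
        # At least one service still depends on the old certificate,
        # so revoking now would cause an outage for it.
        return {"revoked": False, "pending": pending}
    revoke(cert_id)
    return {"revoked": True, "pending": []}
```

The length of the `services` list is the blast radius described above: the more services share a certificate, the more replacements must succeed before revocation is safe.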

Full guide: Certificate Revocation at Operational Scale

6. Platform onboarding

The pattern for adding new technology platforms — AWS, Azure, GCP, Active Directory Certificate Services, Kubernetes, application-specific certificate stores — to a central CLM. Each platform has its own integration surface and its own failure modes. The operating model defines a consistent onboarding process so that the eleventh platform is no harder than the third.

Full guide: Platform Onboarding for Certificate Automation

7. Trust-store management

The set of root and intermediate CA certificates trusted by every service in the estate. Updates to trust stores are infrequent, high-impact, and almost always owned by no one in particular. The operating model treats each trust-store update as a first-class change event with its own workflow.

Full guide: Trust-Store Management

8. Operational vs security logging for PKI

Two distinct audiences with two distinct data flows. Operational logging — for the team running the service — focuses on issuance latency, failure rates, queue depth. Security logging — for SOC and compliance — focuses on policy violations, anomalous issuance, mis-issuance detection. Conflating these into one dashboard produces noise for both audiences and signal for neither.
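
One way to keep the two data flows separate is to route each event by type before it reaches any dashboard. The Python sketch below uses the categories named above; the event-type strings, the stream names, and the event dictionary shape are hypothetical.

```python
# Event types drawn from the categories described above; the exact strings are assumptions.
OPERATIONAL = {"issuance_latency", "failure_rate", "queue_depth"}
SECURITY = {"policy_violation", "anomalous_issuance", "mis_issuance"}


def route(event_type: str) -> str:
    """Return the stream an event type belongs to, or fail loudly if unclassified."""
    if event_type in OPERATIONAL:
        return "operations"
    if event_type in SECURITY:
        return "security"
    raise ValueError(f"unclassified PKI event type: {event_type}")


def split_events(events):
    """Partition events (dicts with a 'type' key) into the two audience streams."""
    streams = {"operations": [], "security": []}
    for event in events:
        streams[route(event["type"])].append(event)
    return streams
```

Raising on unclassified types is a deliberate choice in this sketch: it forces each new event type to be assigned to one audience rather than defaulting into a shared feed.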

Full guide: Operational vs Security Logging for PKI

9. Certificate incident management

The certificate-expiry incident is operationally distinct from a generic service incident. It has a different cause profile, a different remediation playbook, and a different RACI. The mature operating model recognises certificate incidents as a category and handles them with category-specific runbooks.

Full guide: Certificate Incident Management

10. Change management for PKI operations

Certificate changes — new templates, validity period adjustments, CA rotations, root updates, profile modifications — sit awkwardly in standard change management because they touch hundreds or thousands of services simultaneously. The operating model defines which changes are standard (pre-approved, automated), which are normal (reviewed, scheduled), and which are emergency (approved at incident speed).

Full guide: Change Management for PKI Operations

11. PKI support model and RACI

The team structure, on-call rotation, escalation path, and ownership matrix that make the rest of the operating model executable. The smaller the team, the more critical the RACI, because a five-person team with unclear ownership produces the same outcome as a fifty-person team with no ownership at all.

Full guide: PKI Support Model and RACI

The operational maturity model

Most organisations are at level 2 and believe they are at level 4. Recognising the gap is the first step toward closing it.

Figure 2. The five-level PKI operational maturity progression. Tools take an organisation from level 1 to level 2; operating models take an organisation from level 2 to level 4. The progression from level 2 to level 4 is the value-realisation gap for any CLM investment — and the journey no platform vendor can sell.

Level 1 — Ad-hoc. Certificates are managed per-service by the teams that need them. There is no central inventory. Issuance happens on request, often by ticket. Renewal is calendar-driven and frequently missed. Outages are routine and treated as individual incidents rather than systemic ones. Most organisations spend more time in level 1 than they admit.

Level 2 — Tooled. A CLM platform has been deployed. Issuance is partially automated for some service classes. Discovery exists but is incomplete. Renewal is mostly automated for the services that have been onboarded; the long tail is still manual. Incidents have decreased from level 1 but have not disappeared. The team running the platform is small and overloaded. This is where most enterprises sit.

Vendors describe customers in level 2 as success stories. The customers describe themselves as still firefighting. Both descriptions are accurate; they are measuring different things. The vendor measures issuance automation; the customer measures incidents that haven't stopped happening.

Level 3 — Operationalised. The platform is integrated with the broader operating model. Issuance, renewal, revocation, and onboarding follow defined processes. Discovery covers the full estate with reconciled accuracy. The RACI is clear and exercised. Incidents are categorised, runbooks exist, MTTR is measured and improving. The team is appropriately sized and has defined coverage. The platform is a tool used by the operations function, not a substitute for it.

Level 4 — Integrated. Certificate operations are integrated with adjacent functions: change management, incident management, asset management, identity management, vulnerability management. Certificate data flows into and out of the broader operational stack. The CISO has visibility. The CFO has cost transparency. The engineering teams have self-service. Compliance has audit trails. The operating model is not a separate thing — it is part of how the organisation runs.

Few enterprises reach level 4. The ones that do reached it deliberately, over multiple years, with explicit programme investment beyond the initial CLM purchase. Level 4 is not the next release of the platform; it is what the operating model produces once it has matured for long enough that integration with adjacent functions becomes natural.

Level 5 — Intelligent. The certificate estate produces operational intelligence. Discovery feeds asset management. Renewal patterns inform capacity planning. Revocation events feed threat intelligence. Trust-store changes are correlated with incident impact across the broader infrastructure. The certificate layer becomes a source of insight about the rest of the business, not just a cost centre. Few organisations are here. Those that are tend not to talk about it.

The progression from level 2 to level 4 is the value-realisation gap for any CLM investment. Tools take you from level 1 to level 2. Operating models take you from level 2 to level 4. Most enterprises pay for the second journey and never make it, because the journey requires operational investment that no platform vendor can sell you.

Where this framework comes from

The operating model described on this page reflects how we have built and run certificate operations across regulated UK enterprises, including a deployment that grew from six services to over one hundred without a certificate-driven incident, and now operates at over 127,000 certificates under automation. The same patterns have been applied to financial services, telecommunications, and broadcast media estates, and to mixed-CA architectures spanning private and public certificate authorities.

What makes the model durable across organisations is not the specific tool choices, the specific volume thresholds, or the specific team sizes. It is the operational decomposition itself — the recognition that certificate management has eleven distinct domains, each with its own structure, and that an operating model has to address all of them.

Where to start

Before deploying anything, evaluating any vendor, or scoping any programme, three things will produce more clarity than any platform decision:

Inventory your current sources of issuance and discovery. Count how many CAs, certificate stores, automation tools, and discovery sources exist today across the organisation. Most enterprises discover the answer is two to three times what they expected. The gap between expectation and reality is the size of the operations problem.

Draft a one-page RACI. For each of the eleven domains above, name the role accountable, the role responsible, the role consulted, the role informed. Do this with the actual teams, not with the org chart. Where the boxes are empty or contested is where the operations problem lives.

Identify the single highest-volume issuance bottleneck. The service or platform where certificate issuance is slowest, most manual, or most error-prone. That is your first operational target. Not the hardest, not the most strategic — the highest-volume. Solving it produces measurable value within a quarter and provides the operational pattern for solving the next one.

These three exercises produce more useful information than any vendor evaluation, and they produce it in days rather than months.
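
The one-page RACI exercise lends itself to a mechanical check for empty boxes. The Python sketch below lists the eleven domains from this page and reports which roles are unfilled; the data shape and the function name `raci_gaps` are assumptions, and contested ownership still needs a conversation rather than a script.

```python
DOMAINS = [
    "Certificate governance",
    "Certificate discovery",
    "Certificate issuance workflows",
    "Certificate renewal operations",
    "Certificate revocation",
    "Platform onboarding",
    "Trust-store management",
    "Operational vs security logging for PKI",
    "Certificate incident management",
    "Change management for PKI operations",
    "PKI support model and RACI",
]
ROLES = ("accountable", "responsible", "consulted", "informed")


def raci_gaps(raci):
    """Map each domain to the RACI roles that are missing or empty.

    raci is a dict of domain -> {role: named role or team}; absent domains
    count as having all four roles missing.
    """
    gaps = {}
    for domain in DOMAINS:
        entry = raci.get(domain, {})
        missing = [role for role in ROLES if not entry.get(role)]
        if missing:
            gaps[domain] = missing
    return gaps
```

Calling `raci_gaps({})` returns all eleven domains with all four roles missing, which is the starting position for a team that has never written its RACI down.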

The eleven spoke pages linked above each describe one operational domain in detail, with the parameters that determine the right answer for your specific situation, and the interactive tools that turn those parameters into a recommendation. Start with the domain that is causing the most pain right now.

Further reading within this cluster