Axelspire

Certificate Discovery in Practice

In every enterprise certificate estate we have audited, the number of certificates discovered in production exceeds the number recorded in the central CLM by between 20% and 60%. The gap is structural, not exceptional. Closing it — and keeping it closed — is the operational function called discovery.

Part of: Enterprise PKI Operating Model — the pillar page for the operations library.

Discovery is treated as a feature in most CLM marketing material. It is not a feature. It is a discrete operational domain with its own data flows, its own failure modes, its own measurement criteria, and its own dedicated tooling — usually multiple tools, because no single source has complete visibility.

Most teams treat discovery as a project. It is not. It is a continuous operational function that most organisations under-fund until the first major incident. After the incident, funding appears for one budget cycle, the immediate backlog gets cleared, and within twelve months the gap has reopened — because nobody changed the underlying staffing model. The pattern repeats every time leadership changes.

Why the gap exists

Certificates enter production through paths the CLM doesn't see. The patterns are predictable:

Direct issuance from cloud-native CAs. AWS Certificate Manager, Google-managed certificates in GCP, Azure Key Vault certificates. Each cloud provider issues certificates that often bypass the central CLM entirely. The application teams using these certificates are not deliberately routing around the CLM — they are using the path of least resistance their cloud platform offers.

Self-signed certificates in internal traffic. Service-to-service mTLS, internal admin interfaces, monitoring agents, container ingress. Self-signed certificates are functionally invisible to most CLMs because they were never issued by an authority the CLM is integrated with.

Legacy certificates from departed teams. Acquisitions, decommissioned products, deprecated services. Certificates were issued at the time, the team that owned them moved on, the certificate is still deployed and still working until it expires unexpectedly.

Shadow IT and unauthorised CAs. Certificates issued from internal CAs that were stood up for a project, never decommissioned, and continue issuing certificates that no-one with central PKI responsibility knows about.

Certificates issued and deployed correctly but not reconciled. The CLM issued them, but the integration with the asset management system or the platform inventory failed silently. The certificate exists in the CLM database and exists on the server, but the link between them was lost.

Each of these paths produces a different type of gap. Closing the gap requires multiple discovery sources, because no single source covers all paths.

The three discovery modalities

Figure 1. The three certificate discovery modalities and what each one alone misses. Network scanning misses certificates that are issued but not currently being served. Issuance reconciliation misses certificates from unapproved or shadow CAs. Asset discovery misses certificates that are not referenced in configuration. The reconciled view in the centre is what the operating model has to construct.

Network discovery (active scanning). The discovery system scans network ranges, opens TLS connections to each responding host, and records the certificate presented. This finds certificates that are actively in use on network endpoints — including those issued outside the CLM. It does not find certificates that are deployed but not currently serving on a scanned port. It does not find certificates inside containers, in client-side stores, behind load balancers (where the LB presents its own certificate, not the backend services'), or behind firewalls the scanner cannot reach. It is the foundational discovery technique and the most operationally noisy.

Tools: Bitsight, Tenable, Qualys, custom Nmap-based scanners, Rapid7, vendor-specific platforms.
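The core scan step can be sketched with Python's standard library alone. This is a minimal illustration, not how any of the listed products work; the function names `fetch_presented_cert` and `sha256_fingerprint` are hypothetical, and the sketch assumes the SHA-256 of the DER encoding is an acceptable inventory key.

```python
import hashlib
import socket
import ssl
from typing import Optional


def sha256_fingerprint(der_cert: bytes) -> str:
    """Hex SHA-256 of the DER-encoded certificate, used here as an inventory key."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_presented_cert(host: str, port: int = 443, timeout: float = 3.0) -> Optional[bytes]:
    """Return the DER bytes of the certificate an endpoint presents, or None if unreachable."""
    # Verification is deliberately disabled: discovery has to record self-signed,
    # expired, and unknown-CA certificates rather than reject them.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return tls.getpeercert(binary_form=True)
    except (OSError, ssl.SSLError):
        # Closed port, filtered host, or a non-TLS service: nothing to record.
        return None
```

A scanner built this way sees only what the endpoint presents at connection time, which is exactly the limitation described above: a load balancer's certificate is recorded, the backend's is not.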

Issuance reconciliation (CA-side). The discovery system queries the issuance records of every CA the organisation uses — public (Sectigo, DigiCert, Let's Encrypt), private (internal Microsoft AD CS, EJBCA, Vault PKI, AWS Private CA), and managed (cloud-provider certificate services). This finds every certificate the organisation has ever asked for, including those that were issued and never deployed.

The combination of issuance reconciliation and network discovery identifies the two halves of the gap: certificates issued but not deployed (waste, possible compromise), and certificates deployed but not in the issuance record (shadow PKI, security risk).
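The two halves of the gap are a pair of set differences. A minimal sketch, assuming both sources have already been reduced to sets of certificate fingerprints (the function name `split_gap` and the key names are illustrative):

```python
from typing import Dict, Set


def split_gap(issued: Set[str], observed: Set[str]) -> Dict[str, Set[str]]:
    """Compare CA issuance records with network observations, keyed by fingerprint."""
    return {
        # In the CA records but never seen on the network: waste or possible compromise.
        "issued_not_deployed": issued - observed,
        # Seen on the network but absent from every CA record: shadow PKI.
        "deployed_not_issued": observed - issued,
        # Present in both sources.
        "reconciled": issued & observed,
    }
```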

Asset and configuration discovery. The discovery system queries asset management (CMDB), platform configurations (AWS Config, Azure Resource Graph, Kubernetes API), and load balancer / web server configurations directly. This finds certificates referenced in configuration even if they are not currently being served. It is the only modality that catches certificates configured for failover scenarios that haven't activated.
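As one example of configuration-side discovery, Kubernetes stores TLS material in Secrets of type `kubernetes.io/tls`, with the PEM certificate base64-encoded under `data["tls.crt"]`. The sketch below assumes the Secret list has already been fetched as JSON (for instance the output of `kubectl get secrets -A -o json`); the function name `certs_in_tls_secrets` is hypothetical, and only fingerprinting is shown, not X.509 parsing.

```python
import base64
import hashlib
import re
import ssl
from typing import List, Tuple

# tls.crt may hold a chain, so every PEM block in the bundle is extracted.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def certs_in_tls_secrets(secret_list: dict) -> List[Tuple[str, str, str]]:
    """Return (namespace, secret name, SHA-256 fingerprint) for each certificate found."""
    found = []
    for item in secret_list.get("items", []):
        if item.get("type") != "kubernetes.io/tls":
            continue  # Opaque and other Secret types are out of scope for this sketch.
        pem_bundle = base64.b64decode(item["data"]["tls.crt"]).decode("ascii")
        for pem in PEM_BLOCK.findall(pem_bundle):
            der = ssl.PEM_cert_to_DER_cert(pem)
            meta = item["metadata"]
            found.append((meta["namespace"], meta["name"], hashlib.sha256(der).hexdigest()))
    return found
```

Because this reads configuration rather than traffic, it reports certificates whether or not any pod is currently serving them.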

Mature operations use all three modalities and reconcile the outputs. Most organisations use one — usually network discovery — and treat the resulting view as complete. It isn't.

Coverage estimate for: Mid-size financial services firm
Estate covered: 71% (confidence band: 56% to 86%)

Per-modality contribution

Network discovery: 34%
Issuance reconciliation: 69%
Asset & configuration (not enabled): 59%
Cloud-provider native (not enabled): 49%

Recommendations

High: Add asset discovery via the orchestration platform

With 30% of your estate containerised, network scanning misses much of the certificate inventory because containers come and go faster than scan cycles. Query the Kubernetes API (or ECS / AKS / EKS APIs) directly for certificates referenced in configuration. This is the primary modality for containerised estates, not network scanning.

Medium: Add cloud-provider native discovery

With 30% of your estate in cloud-native services, AWS Config, Azure Resource Graph, and GCP Cloud Asset Inventory each surface certificates that other modalities can miss — particularly certificates managed by the cloud provider itself (ACM, Azure Key Vault, GCP Certificate Manager).

Medium: Invest in the reconciliation function, not just modalities

You have multiple discovery modalities running. The next operational gain is from reconciliation — taking the outputs of each modality and producing a single authoritative view, with anomalies routed to the operations team on a defined cadence. Discovery without reconciliation is data without insight.

Reconciliation: the operational core

Discovery without reconciliation is data without insight. The reconciliation function takes the outputs of all discovery modalities and produces a single authoritative view of the certificate estate. Every certificate in that view has answers to four questions:

  • Which CA issued it (and is that CA in our approved list)?
  • Where is it currently deployed (network, host, service)?
  • Who owns the service it protects (RACI to a specific team)?
  • When was it last validated as in-use (and is that recent enough)?

Certificates that cannot answer all four questions are anomalies. The operations team works through anomalies on a defined cadence — typically weekly for high-volume estates, monthly for smaller ones. Anomalies fall into a small number of categories:

Issued but not deployed. Either the deployment failed silently and there is now an unprotected service, or the certificate was issued for a service that no longer exists and should be revoked.

Deployed but not issued (by an approved CA). Either an unauthorised CA is in use, or a certificate was issued by a CA that was approved historically but is no longer on the list, or a self-signed certificate exists that should have been replaced.

Deployed and issued, but no owner. The certificate is real, current, and in production, but no team is accountable for it. This typically reflects an organisational gap — a team that was decommissioned without proper handover.

Last seen long ago. The certificate was discovered historically but has not been observed recently. Either the service has been retired without revocation, or the discovery scope no longer covers it.

Each anomaly category has a defined remediation path. The operations team's effectiveness on discovery is measured by how quickly anomalies move through their lifecycle — discovered, classified, remediated, closed.
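The classification step can be sketched as a small decision function. This is an illustration under stated assumptions: the `CertRecord` fields, the category strings, and the 30-day staleness threshold are all hypothetical, and a record that has never been observed is assumed to come from a CA issuance record.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CertRecord:
    fingerprint: str
    issuer_approved: bool           # issued by a CA on the approved list
    owner: Optional[str]            # accountable team, if known
    last_seen: Optional[datetime]   # most recent observation in deployment, if any


def classify(record: CertRecord, now: datetime,
             max_age: timedelta = timedelta(days=30)) -> Optional[str]:
    """Return the anomaly category for a record, or None if all four questions are answered."""
    if record.last_seen is None:
        return "issued_not_deployed"
    if now - record.last_seen > max_age:
        return "last_seen_long_ago"
    if not record.issuer_approved:
        return "deployed_not_issued_by_approved_ca"
    if record.owner is None:
        return "no_owner"
    return None
```

The order of the checks is a design choice: a record gets exactly one category, so each anomaly has one remediation path and one place in the weekly or monthly queue.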

Calculation for: Under-funded team
Utilisation: 600% (target 70%). Severely over-capacity.
Anomalies / week: 450
Hours required / week: 150.0
Team capacity / week: 25 h

Recommended auto-close rules

1. Recurring-renewal pattern (60% volume)
auto_close WHERE matches_previous_cert(same_FQDN, same_SAN_set, issued_by(approved_CA), issued_within_last_60_days)
Projected utilisation after this rule: 240%
2. Self-issued via approved automation (25% volume)
auto_close WHERE issuer IN approved_internal_CAs AND requested_via(approved_automation_pipeline)
Projected utilisation after this rule: 180%
3. Approved-CA recent issuance (20% volume)
auto_close WHERE issuer IN approved_CAs AND matches_issuance_record(within_7_days)
Projected utilisation after this rule: 144%
4. CT-logged from approved public CA (15% volume)
auto_close WHERE matches_CT_log_entry AND issuer IN approved_public_CAs AND validity_period_compliant
Projected utilisation after this rule: 122%
5. Test/dev environment patterns (10% volume)
auto_close WHERE FQDN matches(test_dev_pattern_regex) AND environment_tag IN ["test","dev","staging"]
Projected utilisation after this rule: 110%
6. Known-deprecated services list (5% volume)
auto_close WHERE FQDN IN deprecated_services_list AND status = "expected_to_expire"
Projected utilisation after this rule: 105%
Rules alone are insufficient. Even with all recommended auto-close rules applied, projected utilisation remains at 105%. The team needs approximately 0.5 additional FTE to bring utilisation to a sustainable level — or further classification work to reduce the residual anomaly rate.
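The projected figures can be reproduced with a few lines of arithmetic. The sketch below assumes each rule's volume share applies to the anomalies still open after the preceding rules, which is the reading consistent with the published percentages; the function name `projected_utilisation` is illustrative.

```python
from typing import List


def projected_utilisation(anomalies_per_week: float, hours_per_anomaly: float,
                          capacity_hours: float, rule_shares: List[float]) -> List[float]:
    """Utilisation (hours required / capacity) after each auto-close rule, in order."""
    remaining = float(anomalies_per_week)
    steps = []
    for share in rule_shares:
        remaining *= 1.0 - share  # each rule closes a share of what is still open
        steps.append(remaining * hours_per_anomaly / capacity_hours)
    return steps


# Figures from the calculation above: 450 anomalies, 150 hours, 25 hours of capacity.
steps = projected_utilisation(450, 150 / 450, 25, [0.60, 0.25, 0.20, 0.15, 0.10, 0.05])
print([f"{s:.0%}" for s in steps])
```

After all six rules roughly 17% of the original anomaly volume remains, about 26 hours of work against 25 hours of capacity, which is why the rules alone do not reach the 70% target.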

Parameterised discovery — what determines the right approach

Four parameters determine the right discovery configuration for your environment:

Network topology and scan reachability. Highly segmented networks (typical of regulated environments) need distributed scanners with network-layer access, which adds operational cost. Flat networks (typical of cloud-native or smaller organisations) can use centralised scanning. Estates that span on-premise and multiple clouds need scanners in each environment, because cross-environment scanning is rarely permitted by network policy.

Estate composition. Containerised estates (Kubernetes, ECS, AKS, GKE) need API-driven discovery via the orchestration platform; network scanning misses much of the certificate inventory because containers come and go faster than scan cycles. Traditional VM-based estates respond well to network scanning. Mixed estates need both.

Front-end concentration. The proportion of services behind load balancers, ingress controllers, or service mesh changes the discovery calculus significantly. Front-end concentration above 30% means that network scanning sees only the front-end certificates (the LB or ingress termination point) and misses the certificates in use on the backend services and within the mesh. Issuance reconciliation becomes the dominant modality for these estates because it is the only one that captures every certificate the CAs have produced regardless of where they ended up.

Discovery cadence. How frequently should the estate be discovered? Daily is the default for the issuance reconciliation modality (cheap, low-impact, high-value). Weekly is typical for network scanning (more invasive, more likely to trigger IDS alerts, more bandwidth). Asset discovery cadence matches the change rate of the underlying configuration. Cadence is a parameter; the right value depends on the velocity of change in your estate.
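These parameters can be captured as a small configuration function. The sketch below is only one way to encode the guidance above: `EstateProfile`, `discovery_plan`, and the key names are hypothetical, and the only numeric threshold taken from the text is the 30% front-end concentration figure.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EstateProfile:
    segmented_network: bool      # True for highly segmented (e.g. regulated) networks
    containerised_share: float   # fraction of the estate on Kubernetes, ECS, AKS, GKE
    front_end_share: float       # fraction of services behind LB, ingress, or mesh


FRONT_END_THRESHOLD = 0.30  # the 30% front-end concentration figure from the text


def discovery_plan(profile: EstateProfile) -> Dict[str, object]:
    """Map an estate profile to modalities, scanner placement, and default cadences."""
    plan: Dict[str, object] = {
        "issuance_reconciliation": {"cadence": "daily"},
        "network_scanning": {
            "cadence": "weekly",
            "scanners": "distributed" if profile.segmented_network else "centralised",
        },
    }
    if profile.containerised_share > 0:
        # Containers come and go faster than scan cycles, so query the platform API.
        plan["orchestration_api"] = {"cadence": "match configuration change rate"}
    dominant: Optional[str] = None
    if profile.front_end_share > FRONT_END_THRESHOLD:
        dominant = "issuance_reconciliation"
    plan["dominant_modality"] = dominant
    return plan
```

The cadences are defaults, not rules; an estate with a high rate of change may justify scanning more often than weekly.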

Discovery tooling — when each is appropriate

The tooling landscape is fragmented because no single product covers all three modalities well. The honest mapping:

For network discovery at enterprise scale, dedicated certificate discovery products (Sectigo SCM Discovery, Venafi TLS Protect's discovery component, Keyfactor's discovery features, AppViewX) work but are typically over-priced for the function. The alternatives (Censys, a hosted internet-scan service, for the external view; custom scanning built on open-source Nmap for internal ranges) are operationally heavier but more controllable.

For issuance reconciliation, the right answer is API-direct integration with each CA. Public CAs expose APIs (often poorly documented). Private CAs vary widely. Cloud-managed certificate services have native APIs that are usable but have rate limits and pagination quirks worth understanding before scaling.
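Pagination and rate limits are easiest to handle once, in a wrapper that every CA integration shares. The sketch below is generic and does not model any specific CA's API: `fetch_page` is a hypothetical callable, supplied per CA, that takes a continuation token and returns a page of records plus the next token (or None on the last page).

```python
import time
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[dict], Optional[str]]


def iter_issued(fetch_page: Callable[[Optional[str]], Page],
                min_interval: float = 0.0,
                sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield every issuance record from a token-paginated CA API."""
    token: Optional[str] = None
    while True:
        records, token = fetch_page(token)
        yield from records
        if token is None:
            return  # last page reached
        # Crude rate limiting: pause between page requests.
        sleep(min_interval)
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays; a production version would also need retry and backoff on throttling responses.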

For asset discovery, the right answer is the platform-native API in each environment, integrated with the central reconciliation pipeline. AWS Config, Azure Resource Graph, GCP Cloud Asset Inventory, the Kubernetes API. These are well-documented and well-supported but require integration work that platforms don't provide.

The mature pattern is a discovery aggregator — sometimes built in-house, sometimes purchased — that pulls from all three modalities, runs the reconciliation logic, and feeds anomalies into the operations workflow. This pattern doesn't appear in any single vendor's marketing. It is what the operations function has to build, regardless of the platform underneath.
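The aggregation step at the centre of that pattern is a merge keyed on certificate identity. A minimal sketch, assuming each modality emits records as dictionaries with a `fingerprint` and an optional `location`; the record shape and the function name `aggregate` are illustrative, not taken from any product.

```python
from typing import Dict, List


def aggregate(sources: Dict[str, List[dict]]) -> Dict[str, dict]:
    """Merge per-modality records into one view keyed by certificate fingerprint."""
    view: Dict[str, dict] = {}
    for modality, records in sources.items():
        for rec in records:
            entry = view.setdefault(
                rec["fingerprint"], {"seen_by": set(), "locations": set()}
            )
            entry["seen_by"].add(modality)
            if rec.get("location"):
                entry["locations"].add(rec["location"])
    return view
```

The `seen_by` set is what the reconciliation logic works from: a certificate seen only by network scanning is a shadow-PKI candidate, and one seen only by issuance reconciliation was issued but never observed in deployment.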

Where discovery breaks

Confusing scan results with deployment reality. A scanner records the certificates it observed at the time of the scan. A certificate that wasn't being served at scan time stays invisible until the next scan, and one that is never served on a scanned port stays invisible indefinitely. Treating scan results as complete inventory produces false confidence. The fix is to combine active scanning with passive observation and expected-state data from the CLM, and to report coverage explicitly rather than implying it.

Reconciliation as a one-time exercise. Discovery is run, anomalies are identified, the team works through them, the project closes. Six months later, the gap has reopened. The fix is to treat discovery as operational rather than project-based — it has to run continuously and the anomaly backlog has to be worked continuously.

No remediation budget. Discovery finds problems. Remediating them takes engineering time. If the operations team has discovery tooling but no budget for the remediation work, anomalies accumulate. This is a governance failure dressed as a discovery failure. The fix is to fund remediation capacity alongside the discovery tooling, not as an afterthought.

Alert fatigue from poor classification. Discovery systems that produce thousands of alerts per scan get ignored. The fix is reconciliation logic that classifies aggressively — known-good patterns (renewals from approved CAs, expected new issuance for new services) auto-closed, leaving only genuine anomalies for human review.

Maturity progression for discovery

The five-level PKI operational maturity model introduced in the pillar maps onto the discovery domain as follows.

Level 1 — Ad-hoc. No central discovery function. Certificates are surfaced when they fail. The estate is whatever is currently working plus whatever has expired.

Level 2 — Tooled. A single discovery modality exists, usually network scanning from one product. Coverage is partial; gaps are not measured. Discovery output is occasionally compared to the CLM but reconciliation is manual and infrequent.

Level 3 — Operationalised. Multiple discovery modalities are running on defined cadences. Reconciliation produces a unified view. Anomalies are classified and worked through on a defined cadence. The discovery function is recognisable as a function.

Level 4 — Integrated. Discovery feeds asset management, change management, and incident response. Anomaly remediation is part of the engineering backlog. Discovery coverage is measured and reported. New services are onboarded with discovery as part of the integration pattern.

Level 5 — Intelligent. Discovery is predictive — patterns in the discovery output identify drift before it becomes a problem. Discovery feeds threat intelligence, vulnerability management, and capacity planning. The estate is known with high confidence, and the confidence is itself measured.

Most organisations are between level 1 and level 2 on discovery, often with a level-2 tool deployed but level-1 operations around it. The progression to level 3 is achievable within six months once multiple modalities are running and reconciliation is treated as a continuous operational task.
