Kubernetes TLS Strategy: cert-manager Is Not Your Strategy

cert-manager is the fourth most used CNCF project, behind only Kubernetes itself, etcd, and CoreDNS, according to the CNCF's 2025 annual survey. It is, by any measure, the default tool for managing TLS certificates in Kubernetes. CNCF survey data puts the share of organisations using or evaluating Kubernetes at 96%, and cert-manager is how most of them handle certificates.

As clusters and Issuers multiply, they fragment quickly. cert-manager automates tasks; strategy defines trust boundaries and ownership.

And therein lies the problem. cert-manager is an excellent tool. It automates certificate issuance, renewal, and secret management within Kubernetes clusters. It supports ACME, Vault, Venafi, and a growing list of CA backends. For a single team running a single cluster, it works beautifully.

For an enterprise running dozens of clusters, managed by multiple teams, with certificates issued from different CAs for different purposes, cert-manager deployments begin to fragment. Each team configures its own Issuers. Nobody has a complete picture of what has been issued across clusters. Certificate policies exist in team documentation, not in enforced configuration. The result is decentralised certificate management that works technically but fails strategically.

The platform TLS strategy sits above cert-manager. It answers the questions that cert-manager doesn't: what should issue certificates, for whom, with what constraints, and how do we maintain visibility across the entire estate?

TLS in Kubernetes: Three Layers, Three Strategies

Kubernetes environments have three distinct TLS surfaces, and each requires a different architectural approach.

North-south TLS (ingress). Traffic entering the cluster from external clients — browsers, mobile apps, API consumers. This is the most visible TLS surface and the one with the strictest requirements: certificates must be publicly trusted, comply with CA/Browser Forum Baseline Requirements, and be served with correct chains. Ingress TLS is where the 47-day certificate validity timeline hits hardest, because these are the certificates that public CAs issue and browsers validate.
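To put the validity timeline in numbers, here is a rough calculation of how issuance volume grows as lifetimes shrink. It assumes each certificate is renewed two-thirds of the way through its lifetime and uses an illustrative estate of 500 ingress certificates; both figures are assumptions, not measurements. The 398-day value is the earlier maximum for publicly trusted certificates, used as the baseline.

```python
from math import ceil


def renewals_per_year(lifetime_days: float, renew_fraction: float = 2 / 3) -> float:
    """Renewals per certificate per year, assuming each renewal happens
    renew_fraction of the way through the certificate's lifetime."""
    interval_days = lifetime_days * renew_fraction
    return 365 / interval_days


# Illustrative estate size (an assumption, not a figure from the article).
CERTS = 500

for lifetime in (398, 47):
    per_cert = renewals_per_year(lifetime)
    total = ceil(per_cert * CERTS)
    print(f"{lifetime}-day certificates: {per_cert:.2f} renewals per certificate "
          f"per year, about {total} issuance events per year for {CERTS} certificates")
```

Under these assumptions the renewal rate per certificate rises by roughly a factor of eight, which is why manual or semi-manual ingress renewal stops being viable.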

The architectural decision at the ingress layer is where TLS terminates. At the Ingress Controller or Gateway API resource? At an external load balancer? At a CDN? Each choice has implications for certificate management, private key exposure, and integration with your automation. If TLS terminates at a cloud load balancer outside the cluster, cert-manager may not manage those certificates at all — your cloud provider's certificate management (AWS Certificate Manager, Google-managed certificates) handles them instead.

For clusters that terminate TLS at the ingress controller, cert-manager with an ACME ClusterIssuer pointed at a public CA (Let's Encrypt, DigiCert, Sectigo) is the standard pattern. The strategic decisions are: which public CA, what's the failover CA (see Multi-CA Strategy), and how do you handle certificate issuance for domains managed by different teams within the same cluster.
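As a minimal sketch of that standard pattern, the following builds a ClusterIssuer manifest for a public ACME CA and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The issuer name, contact email, and ingress class are hypothetical placeholders; the server URL is Let's Encrypt's production ACME directory.

```python
import json


def acme_cluster_issuer(name: str, server: str, email: str, ingress_class: str) -> dict:
    """Build a cert-manager ClusterIssuer that solves HTTP-01 challenges
    through the given ingress class."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "ClusterIssuer",
        "metadata": {"name": name},
        "spec": {
            "acme": {
                "server": server,
                "email": email,
                # Secret in which cert-manager stores the ACME account key.
                "privateKeySecretRef": {"name": f"{name}-account-key"},
                "solvers": [
                    {"http01": {"ingress": {"ingressClassName": ingress_class}}}
                ],
            }
        },
    }


# Hypothetical values: adjust name, email, and ingress class to your platform.
issuer = acme_cluster_issuer(
    name="public-acme",
    server="https://acme-v02.api.letsencrypt.org/directory",
    email="platform-team@example.com",
    ingress_class="nginx",
)
print(json.dumps(issuer, indent=2))
```

A failover CA would be a second ClusterIssuer built the same way with a different `server`, so that switching CAs is a change to an `issuerRef` rather than a redesign.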

East-west TLS (service mesh mTLS). Traffic between services within the cluster. Service meshes such as Istio, Linkerd, and Cilium can handle east-west mTLS automatically, issuing short-lived certificates from their own internal CA to the workloads they manage. This is arguably the most certificate-dense layer: a busy service mesh can issue thousands of certificates per hour, each with lifetimes measured in hours.

The strategic question here is whether mesh-managed TLS replaces or complements your PKI strategy. Service meshes typically operate their own root CA, independent of your organisation's private PKI. This means: the trust anchor is mesh-specific (not your enterprise root CA), the certificates don't appear in your certificate inventory or observability tooling, revocation is handled by the mesh (or not at all — short lifetimes make it unnecessary), and the security posture of your east-west encryption is entirely dependent on the mesh's CA implementation.

For many organisations, this is acceptable. The mesh handles mTLS transparently, the certificates are ephemeral, and the operational overhead is minimal. For organisations with specific compliance requirements around CA governance, key management, or certificate audit trails, mesh-managed TLS may need to integrate with your enterprise private PKI. Istio, for example, supports pluggable CA backends, allowing you to replace its built-in CA with cert-manager, Vault, or a custom CA that chains to your enterprise root.

Application-level TLS. Some applications manage their own TLS, separate from ingress and mesh. Database connections (PostgreSQL, MySQL with TLS), message brokers (Kafka with mTLS), and specific application protocols may require dedicated certificates with specific subject names, extensions, or key types. These certificates often live outside cert-manager's default scope and require explicit integration.

cert-manager Architecture Decisions

Once you've mapped your three TLS layers, the cert-manager configuration decisions become clearer.

Issuer hierarchy. cert-manager supports Issuers (namespace-scoped) and ClusterIssuers (cluster-wide). The strategic choice between them determines how much control individual teams have over certificate issuance. ClusterIssuers enforce consistency — every namespace uses the same CA configuration. Namespace-scoped Issuers allow teams to configure their own CAs, which provides flexibility but can lead to fragmentation.

For most enterprises, the recommended pattern is: ClusterIssuers for standard use cases (public ACME for ingress, internal CA for service certificates), with namespace-scoped Issuers only where teams have a legitimate need for a different CA configuration. This keeps the default path consistent while accommodating exceptions.
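One way to picture the "default path plus exceptions" pattern is a small resolver that picks the `issuerRef` for a Certificate. This is only a sketch: the issuer names, the `payments` namespace, and the purpose labels are all hypothetical, and in practice the mapping would live in platform templates or policy rather than a script.

```python
import json

# Hypothetical platform defaults: one ClusterIssuer per standard use case.
DEFAULT_ISSUERS = {
    "ingress": {"kind": "ClusterIssuer", "name": "public-acme"},
    "internal": {"kind": "ClusterIssuer", "name": "internal-ca"},
}

# Hypothetical, explicitly approved exceptions: (namespace, purpose) -> Issuer.
EXCEPTIONS = {
    ("payments", "internal"): {"kind": "Issuer", "name": "payments-ca"},
}


def issuer_ref(namespace: str, purpose: str) -> dict:
    """Return the approved issuerRef; unknown purposes raise KeyError."""
    ref = EXCEPTIONS.get((namespace, purpose), DEFAULT_ISSUERS[purpose])
    return {"group": "cert-manager.io", **ref}


def certificate(namespace: str, name: str, dns_names: list, purpose: str) -> dict:
    """Build a cert-manager Certificate that follows the platform default path."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "secretName": f"{name}-tls",
            "dnsNames": list(dns_names),
            "issuerRef": issuer_ref(namespace, purpose),
        },
    }


print(json.dumps(certificate("shop", "shop", ["shop.example.com"], "ingress"), indent=2))
print(json.dumps(issuer_ref("payments", "internal")))
```

Keeping exceptions in one reviewed table makes each deviation from the default a visible decision instead of something discovered later.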

CA backend selection. cert-manager supports multiple issuer types: ACME (for public CAs and internal ACME-compatible CAs), Vault/OpenBao (for organisations using the Vault PKI secrets engine or its OpenBao equivalent), Venafi (for organisations with Venafi certificate management), self-signed and CA issuers (for development/testing or simple internal use), and external issuers, which are out-of-tree controllers that fulfil cert-manager's certificate requests.

The backend choice should align with your broader certificate strategy. If you've invested in private PKI using Vault, use the Vault issuer in cert-manager. If you're standardised on ACME for both public and internal certificates, use ACME issuers with different endpoints for different CAs. The anti-pattern is allowing each cluster or team to choose independently — that's how you end up with five different CA backends across twenty clusters, none of them consistently configured.

Certificate policies. cert-manager issues whatever certificates your Issuer configuration and Certificate resources request. It does not enforce organisational policies on key size, lifetime, allowed domains, or naming conventions — unless you add policy enforcement.

Kubernetes-native policy engines (OPA/Gatekeeper, Kyverno) can validate Certificate resources before they're applied, rejecting requests that violate your organisation's certificate policy. This is the mechanism that turns cert-manager from a tool into a governed platform capability. Policies worth enforcing: minimum key size, maximum certificate lifetime, allowed DNS name patterns (prevent issuance for domains outside your ownership), required Issuer reference (prevent teams from pointing at unapproved CAs), and annotation requirements for ownership and purpose tracking.
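The checks listed above can be sketched in plain Python to show the logic a Gatekeeper or Kyverno policy would encode. This is not either engine's policy language, and every threshold, domain pattern, issuer name, and annotation key below is a hypothetical example. The defaults applied when a field is missing (RSA 2048, a 2160h duration, kind `Issuer`) mirror cert-manager's documented defaults as I understand them; verify them against the version you run.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical organisational policy.
POLICY = {
    "min_rsa_bits": 2048,
    "max_duration_hours": 2160,  # 90 days
    "allowed_dns_patterns": ["*.example.com", "*.svc.cluster.local"],
    "approved_issuers": {("ClusterIssuer", "public-acme"), ("ClusterIssuer", "internal-ca")},
    "required_annotations": ["example.com/owner", "example.com/purpose"],
}


def duration_hours(value: str) -> float:
    """Parse a Go-style duration such as '2160h' or '1h30m' (h, m, s only)."""
    units = {"h": 1.0, "m": 1 / 60, "s": 1 / 3600}
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", value)
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return sum(float(n) * units[u] for n, u in parts)


def violations(cert: dict) -> list[str]:
    """Return the policy violations for a Certificate manifest (empty if compliant)."""
    spec = cert.get("spec", {})
    found = []
    key = spec.get("privateKey", {})
    if key.get("algorithm", "RSA") == "RSA" and key.get("size", 2048) < POLICY["min_rsa_bits"]:
        found.append("RSA key below minimum size")
    if duration_hours(spec.get("duration", "2160h")) > POLICY["max_duration_hours"]:
        found.append("duration exceeds maximum")
    for dns_name in spec.get("dnsNames", []):
        # Note: '*' here also matches dots, so this is looser than a TLS wildcard.
        if not any(fnmatchcase(dns_name.lower(), p) for p in POLICY["allowed_dns_patterns"]):
            found.append(f"DNS name not allowed: {dns_name}")
    ref = spec.get("issuerRef", {})
    if (ref.get("kind", "Issuer"), ref.get("name")) not in POLICY["approved_issuers"]:
        found.append("issuerRef is not an approved issuer")
    annotations = cert.get("metadata", {}).get("annotations", {})
    for required in POLICY["required_annotations"]:
        if required not in annotations:
            found.append(f"missing annotation: {required}")
    return found


request = {
    "metadata": {"name": "shop", "annotations": {"example.com/owner": "team-shop"}},
    "spec": {
        "duration": "8760h",
        "dnsNames": ["shop.example.com", "shop.example.org"],
        "issuerRef": {"kind": "Issuer", "name": "team-ca"},
    },
}
for problem in violations(request):
    print(problem)
```

In a real cluster the same rules would run at admission time, so a non-compliant Certificate is rejected before cert-manager ever sees it.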

Multi-Cluster Certificate Consistency

The hardest problem in Kubernetes TLS strategy isn't making certificates work in one cluster — it's making them work consistently across all of them.

When every cluster has its own cert-manager installation, Issuer configuration can drift. A new cluster might be deployed with a different ACME endpoint, a different CA, or different certificate defaults. Nobody notices until a certificate-related incident reveals the inconsistency.

The platform engineering approach to this is treating cert-manager configuration as infrastructure-as-code. Issuer and ClusterIssuer resources are defined centrally, version-controlled, and deployed to clusters via GitOps (ArgoCD, Flux). Changes to certificate policy propagate across all clusters through the same mechanism that deploys application workloads.
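To make the drift problem concrete, the sketch below fingerprints the `spec` of the centrally defined ClusterIssuer and compares it with what each cluster is actually running. The cluster names and the staging endpoint are illustrative, and the comparison is deliberately naive: a GitOps tool such as Argo CD or Flux also handles server-side defaulting, ownership, and reconciliation, which this does not attempt.

```python
import hashlib
import json


def spec_fingerprint(manifest: dict) -> str:
    """Stable short hash of a manifest's spec, independent of key order."""
    canonical = json.dumps(manifest["spec"], sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]


def find_drift(desired: dict, live_by_cluster: dict) -> list:
    """Return the clusters whose live spec is missing or differs from the desired one."""
    want = spec_fingerprint(desired)
    return sorted(
        cluster
        for cluster, live in live_by_cluster.items()
        if live is None or spec_fingerprint(live) != want
    )


# Desired state, as it would be stored in Git (hypothetical values).
DESIRED = {"spec": {"acme": {"server": "https://acme.example.com/directory",
                             "email": "platform-team@example.com"}}}

# Live state per cluster, as it might be collected from each API server.
LIVE = {
    "prod-eu": {"spec": {"acme": {"email": "platform-team@example.com",
                                  "server": "https://acme.example.com/directory"}}},
    "prod-us": {"spec": {"acme": {"server": "https://acme-staging.example.com/directory",
                                  "email": "platform-team@example.com"}}},
    "edge-01": None,  # ClusterIssuer was never deployed here
}

print("clusters with drift:", find_drift(DESIRED, LIVE))
```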

For organisations with strict multi-cluster governance, tools like trust-manager and the cert-manager CSI driver can help standardise trust anchor distribution and certificate mounting respectively. trust-manager, a sub-project of cert-manager, distributes CA bundles consistently across the namespaces of a cluster; deploying the same bundle definitions to every cluster extends that consistency across the estate.

Observability Across Clusters

Decentralised cert-manager deployments create a visibility gap that your certificate observability strategy needs to address.

At minimum, platform teams should aggregate certificate inventory across all clusters. cert-manager exposes Prometheus metrics for certificate status, expiry, and issuance success/failure. Aggregating these metrics in a central monitoring platform (Grafana, Datadog, or your organisation's observability stack) provides the unified view that individual cluster monitoring can't.

The metrics that matter for platform TLS: certificate expiry time across all clusters (not just per-cluster), issuance success rate by Issuer and cluster, certificate renewal latency (time between renewal trigger and new certificate availability), and orphaned certificates — certificates that exist in secrets but aren't referenced by any Ingress, Gateway, or application resource.
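Two of these metrics, estate-wide expiry and orphaned certificates, reduce to simple set operations once the data is in one place. The sketch below assumes inventory records have already been collected per cluster, for example from cert-manager's `certmanager_certificate_expiration_timestamp_seconds` metric with a cluster label added by the aggregation layer. The record shape, cluster names, and the set of referenced secrets are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class CertRecord:
    cluster: str
    namespace: str
    name: str
    secret: str
    not_after: float  # expiry as a Unix timestamp


def expiring_within(records, now: float, days: int) -> list:
    """Certificates across all clusters that expire within `days`, soonest first."""
    cutoff = now + days * SECONDS_PER_DAY
    return sorted((r for r in records if r.not_after <= cutoff), key=lambda r: r.not_after)


def orphaned(records, referenced: set) -> list:
    """Certificates whose secret is not referenced by any Ingress, Gateway, or workload."""
    return [r for r in records if (r.cluster, r.namespace, r.secret) not in referenced]


NOW = 1_760_000_000.0  # fixed timestamp so the example is reproducible
RECORDS = [
    CertRecord("prod-eu", "shop", "shop", "shop-tls", NOW + 5 * SECONDS_PER_DAY),
    CertRecord("prod-us", "shop", "shop", "shop-tls", NOW + 40 * SECONDS_PER_DAY),
    CertRecord("prod-us", "legacy", "old-api", "old-api-tls", NOW + 20 * SECONDS_PER_DAY),
]
# (cluster, namespace, secret) tuples found in Ingress/Gateway/pod specs.
REFERENCED = {("prod-eu", "shop", "shop-tls"), ("prod-us", "shop", "shop-tls")}

for r in expiring_within(RECORDS, NOW, days=30):
    days_left = (r.not_after - NOW) / SECONDS_PER_DAY
    print(f"{r.cluster}/{r.namespace}/{r.name} expires in {days_left:.0f} days")
for r in orphaned(RECORDS, REFERENCED):
    print(f"orphaned: {r.cluster}/{r.namespace}/{r.name}")
```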

Alerting should distinguish between platform-level issues (an Issuer is failing across multiple clusters — likely a CA or network problem) and application-level issues (a single certificate is failing to renew — likely a misconfiguration in one deployment).
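That distinction can be expressed as a simple grouping rule: if the same Issuer is failing in several clusters, raise one platform alert; otherwise alert on the individual certificate. The sketch below is one hypothetical way to do it. The failure record fields and the two-cluster threshold are assumptions, and a production version would more likely be written as alerting rules in the monitoring platform.

```python
from collections import defaultdict


def classify_failures(failures: list, min_clusters: int = 2) -> dict:
    """Split issuance failures into platform-level and application-level alerts.

    An issuer failing in at least `min_clusters` distinct clusters is treated
    as a platform problem; every other failure is reported per certificate.
    """
    clusters_by_issuer = defaultdict(set)
    for failure in failures:
        clusters_by_issuer[failure["issuer"]].add(failure["cluster"])

    platform = {
        issuer for issuer, clusters in clusters_by_issuer.items()
        if len(clusters) >= min_clusters
    }
    application = sorted(
        f'{f["cluster"]}/{f["namespace"]}/{f["certificate"]}'
        for f in failures
        if f["issuer"] not in platform
    )
    return {"platform": sorted(platform), "application": application}


# Hypothetical failure events gathered from all clusters.
FAILURES = [
    {"cluster": "prod-eu", "namespace": "shop", "certificate": "shop", "issuer": "public-acme"},
    {"cluster": "prod-us", "namespace": "shop", "certificate": "shop", "issuer": "public-acme"},
    {"cluster": "prod-us", "namespace": "billing", "certificate": "api", "issuer": "internal-ca"},
]

print(classify_failures(FAILURES))
```

Grouping first means a CA outage produces one page to the platform team rather than one per affected certificate.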

When Service Mesh mTLS Replaces Your PKI Strategy

For organisations adopting Istio, Linkerd, or Cilium for service mesh, there's a temptation to treat the mesh's built-in mTLS as "done" — east-west encryption is handled, certificates are managed automatically, and there's nothing more to think about.

For most workloads, that's fine. Mesh mTLS is transparent, automatic, and well-tested. But there are scenarios where the mesh's TLS management needs to integrate with your broader certificate strategy rather than operate independently.

Compliance and audit. If your regulatory environment requires audit trails for all certificate issuance, the mesh's internal CA may not produce the logging and audit artefacts your compliance team needs. Integrating the mesh with your enterprise CA (via cert-manager as an intermediary, or via the mesh's pluggable CA interface) brings mesh certificates under your existing governance.

Cross-mesh and cross-cluster trust. When services communicate across cluster boundaries or mesh boundaries, the trust model becomes more complex. Two Istio meshes with independent root CAs can't authenticate each other's services without cross-root trust configuration. An enterprise root CA that issues intermediate certificates to each mesh simplifies this.
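The difference between the two trust models can be shown with a toy chain walk. This models only who issued whom, with no cryptography or real certificate parsing, and all CA and service names are hypothetical. It illustrates why a shared enterprise root lets both meshes validate each other while independent roots do not.

```python
def chains_to(subject: str, trusted_roots: set, issued_by: dict, max_depth: int = 10) -> bool:
    """Walk issuer links upward from `subject` and report whether a trusted root is reached."""
    current = subject
    for _ in range(max_depth):
        if current in trusted_roots:
            return True
        issuer = issued_by.get(current)
        if issuer is None or issuer == current:  # unknown issuer or untrusted self-signed root
            return False
        current = issuer
    return False


# Model 1: each mesh runs its own independent, self-signed root CA.
INDEPENDENT = {
    "mesh-a-root": "mesh-a-root",
    "mesh-b-root": "mesh-b-root",
    "svc-a": "mesh-a-root",
    "svc-b": "mesh-b-root",
}

# Model 2: an enterprise root issues an intermediate CA to each mesh.
SHARED = {
    "enterprise-root": "enterprise-root",
    "mesh-a-ca": "enterprise-root",
    "mesh-b-ca": "enterprise-root",
    "svc-a": "mesh-a-ca",
    "svc-b": "mesh-b-ca",
}

# A service in mesh A validating a peer from mesh B:
print("independent roots:", chains_to("svc-b", {"mesh-a-root"}, INDEPENDENT))
print("shared enterprise root:", chains_to("svc-b", {"enterprise-root"}, SHARED))
```

With independent roots, the only fix is to add each mesh's root to the other's trust bundle and keep those bundles in sync; with a shared root, both sides already trust the same anchor.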

Non-mesh workloads. Not everything runs in the mesh. VMs, legacy applications, external services, and database connections may need certificates from the same trust hierarchy as mesh services. An enterprise CA that serves both mesh and non-mesh workloads provides a consistent trust foundation.

Zero-trust architecture. If your organisation is implementing zero trust beyond the mesh — extending mTLS to API gateways, external partners, or client devices — the mesh's internal CA is insufficient. You need an enterprise mTLS strategy that encompasses the mesh as one component of a larger trust architecture.

The Platform Engineering Framing

Kubernetes TLS is a platform engineering problem, not a security problem. Security defines the requirements — what must be encrypted, authenticated, and auditable. Platform engineering delivers the capability — the automation, policies, observability, and tooling that make those requirements achievable at scale.

The best Kubernetes TLS strategies treat certificate management as a platform service. Teams consume certificates through standard interfaces (cert-manager Certificate resources, mesh sidecar injection, API gateway configuration). The platform team operates the underlying infrastructure — CA backends, policy enforcement, observability, and cluster-wide consistency.

This separation is what makes certificate management sustainable in Kubernetes environments. Without it, every team reinvents their own certificate management, and the organisation's certificate strategy is whatever those teams happen to do. With it, certificates become infrastructure that's as reliable and consistent as compute, storage, and networking.

Platform governance above the cluster: consistent roots, issuance policy, and visibility across teams and environments.

External References

cert-manager Documentation — cert-manager.io
CNCF Annual Survey 2025: cert-manager as Top 5 CNCF Project — cncf.io
trust-manager: CNCF Trust Anchor Distribution for Kubernetes — cert-manager.io/docs/trust/trust-manager
Istio Security: Pluggable CA — istio.io
Kubernetes Gateway API — gateway-api.sigs.k8s.io
Kyverno Policy Engine — kyverno.io
OPA Gatekeeper — open-policy-agent.github.io/gatekeeper
CNCF Survey: 96% of Organisations Using/Evaluating Kubernetes — cncf.io