Certificate Observability for Manufacturing & OT
The core certificate observability strategy covers the SLO framework, risk scoring, dashboards, and discovery. Manufacturing and OT environments break the fundamental assumption underlying most observability tooling: that you can scan your network from a central point and discover all certificates.
Monitoring Certificates in Air-Gapped and Segmented OT Networks
Standard certificate discovery relies on network scanning: probing TLS endpoints and collecting certificate data. In OT environments, scanning from the IT zone is typically blocked by firewall rules, prohibited by OT security policy, or physically impossible because the network is air-gapped.
Approaches that work in segmented OT environments:
Scan-from-within. Deploy a certificate scanner inside the OT zone. The scanner runs within the OT network, discovers certificates on OT devices, and exports the inventory to the IT observability platform via a controlled data transfer mechanism (data diode, one-way file transfer, or approved integration point). The scanner must be approved through the OT change management process and must not generate traffic patterns that could disrupt industrial protocols.
Agent-based discovery. For devices that support it (OT servers, historians, engineering workstations), deploy lightweight agents that report certificate inventory to a central collector. Agents must be validated for OT compatibility — they cannot consume resources that affect real-time process control, and they must be resilient to network disconnection (cache-and-forward model).
Passive network observation. Deploy a network tap or mirror port in the OT zone and passively capture TLS handshakes. Extract certificate data from the handshake without generating any traffic on the network. This is the least intrusive approach and the only viable option for some high-security OT environments, but it has two limits. It only discovers certificates on connections that are established during the observation period, and it only works where the certificate is sent in the clear: TLS 1.2 and earlier expose the server certificate in the handshake, while TLS 1.3 encrypts it.
Configuration management integration. For devices managed through industrial configuration platforms (Rockwell FactoryTalk, Siemens TIA Portal, etc.), certificate inventory data may be available through the configuration management system without requiring direct network scanning.
In practice, OT certificate observability typically combines multiple approaches: agent-based for servers and workstations, passive observation for industrial controllers, and manual inventory for devices that can't be reached by any automated mechanism.
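Combining approaches means the same certificate can be reported more than once, for example by an agent and by a passive tap. The following is a minimal sketch of merging per-source inventories into one deduplicated list, assuming each source can supply a device identifier, a SHA-256 fingerprint, and an expiry time. The `CertRecord` type, its field names, and `merge_inventories` are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CertRecord:
    """One observed certificate. All field names are illustrative assumptions."""
    device: str                 # hypothetical device identifier
    fingerprint_sha256: str     # hex SHA-256 of the DER-encoded certificate
    not_after: datetime         # certificate expiry (UTC)
    sources: set = field(default_factory=set)  # discovery methods that saw it


def merge_inventories(*inventories):
    """Merge records from several discovery sources.

    Records are keyed by (device, fingerprint), so a certificate seen by
    both an agent and a passive tap collapses into one entry that lists
    both sources. The result is sorted by expiry, soonest first.
    """
    merged = {}
    for records in inventories:
        for rec in records:
            key = (rec.device, rec.fingerprint_sha256.lower())
            if key in merged:
                merged[key].sources |= set(rec.sources)
            else:
                merged[key] = CertRecord(
                    device=rec.device,
                    fingerprint_sha256=rec.fingerprint_sha256.lower(),
                    not_after=rec.not_after,
                    sources=set(rec.sources),
                )
    return sorted(merged.values(), key=lambda r: r.not_after)


# Example data (invented for illustration only).
agent_inventory = [
    CertRecord("historian-01", "AB12", datetime(2026, 1, 15, tzinfo=timezone.utc), {"agent"}),
]
passive_inventory = [
    CertRecord("historian-01", "ab12", datetime(2026, 1, 15, tzinfo=timezone.utc), {"passive"}),
    CertRecord("plc-07", "cd34", datetime(2025, 9, 1, tzinfo=timezone.utc), {"passive"}),
]
combined = merge_inventories(agent_inventory, passive_inventory)
```

Keeping the set of sources on each record also shows which devices are visible through only one mechanism, which helps when deciding where manual inventory is still needed.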
Certificate Expiry as Production Outage: OT-Specific SLO Frameworks
In the core SLO framework, certificate expiry is a reliability metric. In manufacturing, it's a production metric. The SLOs must reflect this.
Lead time SLO for safety-critical devices. Certificates on devices whose failure would stop production should trigger alerts at least 90 days before expiry. This lead time leaves room for change management approval, maintenance window scheduling, and fallback planning.
Availability SLO for OT CA infrastructure. If OT devices depend on a CA or CRL distribution point within the OT zone, that infrastructure needs the same availability SLO as the production systems it supports. A CA outage that prevents certificate renewal for an industrial controller is a production risk.
Renewal success rate per device category. Track separately for IT-managed OT devices (servers, workstations) versus industrial controllers versus legacy devices. The renewal patterns and failure modes are different for each category, and a blended metric hides category-specific problems.
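Two of these SLOs reduce to simple arithmetic, sketched below under stated assumptions. Only the 90-day threshold comes from the text above; the 30-day default, the category names, and the function names are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from datetime import date

# Alert thresholds in days before expiry. The 90-day value is from the
# SLO above; the 30-day default is an assumed placeholder.
ALERT_THRESHOLD_DAYS = {"production_stopping": 90, "default": 30}


def needs_alert(not_after, criticality, today):
    """Return True once a certificate is inside its alert window."""
    threshold = ALERT_THRESHOLD_DAYS.get(criticality, ALERT_THRESHOLD_DAYS["default"])
    return (not_after - today).days <= threshold


def renewal_success_by_category(attempts):
    """Compute the renewal success rate separately for each device category.

    `attempts` is an iterable of (category, succeeded) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # category -> [successes, attempts]
    for category, succeeded in attempts:
        totals[category][1] += 1
        if succeeded:
            totals[category][0] += 1
    return {cat: ok / n for cat, (ok, n) in totals.items()}


# Invented sample data: five renewal attempts across three categories.
attempts = [
    ("it_managed", True),
    ("it_managed", True),
    ("controller", True),
    ("controller", False),
    ("legacy", False),
]
rates = renewal_success_by_category(attempts)
blended = sum(1 for _, ok in attempts if ok) / len(attempts)
```

In this sample the blended rate is 60%, which gives no hint that IT-managed devices renew at 100% while legacy devices renew at 0%. That is the masking effect a per-category metric avoids.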
Integrating Certificate Alerting with SCADA and Industrial Dashboards
OT operations teams monitor plant status through SCADA/HMI displays, not Grafana dashboards. Certificate alerts that only appear in the IT observability stack won't reach the people who need to act on them in OT.
For production-impacting certificate events, alerts should propagate to the SCADA/HMI alarm system alongside other operational alerts. A certificate expiring in 7 days on a process controller is an operational alarm, not an IT ticket. The alert should include: the affected device and its production function, the certificate expiry date, the required action (renewal, replacement, or risk acceptance), and the contact for the PKI/IT team responsible for execution.
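The alert contents listed above can be captured in a small structured payload. This is a sketch only: the `CertAlarm` type, its field names, and the JSON shape are assumptions, and how the payload is delivered to a given SCADA/HMI alarm system is site-specific and not shown.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date

# The three actions named in the text; the string spellings are assumed.
VALID_ACTIONS = {"renewal", "replacement", "risk_acceptance"}


@dataclass
class CertAlarm:
    """Fields an OT operator needs to act on a certificate alarm."""
    device: str
    production_function: str
    expiry_date: date
    required_action: str
    contact: str


def build_alarm_payload(alarm, today):
    """Serialize an alarm to JSON, adding days remaining and a short message."""
    if alarm.required_action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {alarm.required_action!r}")
    days_remaining = (alarm.expiry_date - today).days
    payload = asdict(alarm)
    payload["expiry_date"] = alarm.expiry_date.isoformat()
    payload["days_remaining"] = days_remaining
    payload["message"] = (
        f"Certificate on {alarm.device} ({alarm.production_function}) "
        f"expires in {days_remaining} days; action: {alarm.required_action}; "
        f"contact: {alarm.contact}"
    )
    return json.dumps(payload, sort_keys=True)


# Invented example values for illustration.
alarm = CertAlarm(
    device="plc-07",
    production_function="Line 3 filler control",
    expiry_date=date(2025, 3, 17),
    required_action="renewal",
    contact="pki-oncall@example.com",
)
payload_json = build_alarm_payload(alarm, today=date(2025, 3, 10))
```

Putting the production function and the required action in the message text means the alarm is readable on an HMI alarm list without a lookup in the IT ticketing system.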
This integration requires cooperation between the IT security team, which understands certificates, and the OT operations team, which understands production impact. Neither team alone can build effective OT certificate observability.