ACME DNS-01 Challenge: Complete Setup & Troubleshooting Guide
TL;DR: DNS-01 is the ACME challenge method for validating domain ownership through DNS TXT records, enabling wildcard certificates, private network deployments, and certificate issuance without HTTP server access. Implementation requires DNS provider API integration, careful DNS propagation handling, and secure credential management for production automation.
Getting Wildcard Certificates with DNS-01
DNS-01 is the ONLY way to get wildcard certificates from Let's Encrypt. HTTP-01 and TLS-ALPN-01 cannot issue wildcards.
Single command for wildcard certificate:
certbot certonly --dns-cloudflare \
--dns-cloudflare-credentials ~/.secrets/cloudflare.ini \
-d example.com \
-d '*.example.com'
This certificate covers:
- example.com (apex domain)
- *.example.com (every single-level subdomain: www, api, blog, mail, etc.)
- Any number of first-level subdomains with one certificate (deeper names such as a.b.example.com need their own wildcard)
- Renews automatically without HTTP server access
Why wildcards need DNS-01: Let's Encrypt must validate that you control the DNS zone before issuing *.example.com. HTTP-01 would require placing validation files on an unbounded set of subdomains, which is impossible. DNS-01 proves zone control with a single TXT record.
TXT Record Format (_acme-challenge)
Every DNS-01 validation requires a TXT record at _acme-challenge.<domain>:
| Field | Value | Example |
|---|---|---|
| Record Name | _acme-challenge.example.com | _acme-challenge.api.company.com |
| Record Type | TXT | TXT |
| Record Value | Token from Certbot | "9G8F7K3LmN2pQ1rS5tU8vW0x4yZ6..." |
| TTL | 60 seconds (recommended) | 60 |
For wildcard certificates, add the TXT record under the base domain: *.example.com is validated at _acme-challenge.example.com.
For specific subdomain certificates, add the TXT record under that subdomain: api.company.com is validated at _acme-challenge.api.company.com.
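The record value is not arbitrary. RFC 8555 Section 8.4 defines it as the base64url-encoded SHA-256 digest of the challenge's key authorization (the token, a dot, and the account key thumbprint). Below is a minimal sketch of that derivation, assuming openssl and base64 are installed. The helper name txt_value and the sample input are made up for illustration; in practice Certbot computes the value for you.

```shell
#!/usr/bin/env bash
# Sketch: derive a DNS-01 TXT value from a key authorization string.
# txt_value is a hypothetical helper, not part of Certbot.
txt_value() {
  # SHA-256 digest (raw bytes) -> base64 -> base64url without padding
  printf '%s' "$1" \
    | openssl dgst -sha256 -binary \
    | base64 \
    | tr '+/' '-_' \
    | tr -d '=\n'
}

# Placeholder key authorization in the form "<token>.<account-key-thumbprint>"
txt_value 'sample-token.sample-thumbprint'
echo
```

The result is always 43 characters long, because a 32-byte digest encodes to 43 base64url characters once the padding is removed. That gives a quick sanity check on a value copied by hand.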
Manual TXT record creation (if not using DNS plugin):
# 1. Start certificate request
certbot certonly --manual --preferred-challenges dns -d example.com -d '*.example.com'
# 2. Certbot shows the TXT record to create:
# "Please deploy a DNS TXT record under the name
# _acme-challenge.example.com with the following value:
# 9G8F7K3LmN2pQ1rS5tU8vW0x4yZ6..."
# 3. Add the TXT record in your DNS provider's control panel
# 4. Verify record is live:
dig TXT _acme-challenge.example.com +short
# 5. Press Enter in Certbot to continue validation
Supported DNS Providers
Certbot ships official plugins for about a dozen DNS providers; third-party plugins and other ACME clients cover many more. Rows marked third-party are maintained outside the Certbot project, so check each plugin's own documentation for the current install method. Most popular:
| Provider | Certbot Plugin | Installation | Credential Setup |
|---|---|---|---|
| Cloudflare | certbot-dns-cloudflare | snap install certbot-dns-cloudflare | API token (scoped to zone) |
| AWS Route53 | certbot-dns-route53 | snap install certbot-dns-route53 | IAM role or AWS credentials |
| Google Cloud DNS | certbot-dns-google | snap install certbot-dns-google | Service account JSON |
| Azure DNS | certbot-dns-azure (third-party) | snap install certbot-dns-azure | Managed identity or service principal |
| DigitalOcean | certbot-dns-digitalocean | snap install certbot-dns-digitalocean | API token |
| OVH | certbot-dns-ovh | snap install certbot-dns-ovh | Application credentials |
| GoDaddy | certbot-dns-godaddy (third-party) | snap install certbot-dns-godaddy | API key + secret |
| Namecheap | acme.sh (use instead) | Not officially supported by Certbot | API key |
Full list: Certbot DNS Plugins Documentation
Example: Cloudflare setup:
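# 1. Install plugin (the Certbot snap must be told to trust plugins first)
sudo snap set certbot trust-plugin-with-root=ok
sudo snap install certbot-dns-cloudflare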
# 2. Create credentials file
mkdir -p ~/.secrets
chmod 700 ~/.secrets
cat > ~/.secrets/cloudflare.ini << EOF
dns_cloudflare_api_token = YOUR_CLOUDFLARE_API_TOKEN
EOF
chmod 600 ~/.secrets/cloudflare.ini
# 3. Issue certificate
certbot certonly \
--dns-cloudflare \
--dns-cloudflare-credentials ~/.secrets/cloudflare.ini \
-d example.com \
-d '*.example.com'
Example: Route53 setup (using IAM role):
# 1. Install plugin
snap install certbot-dns-route53
# 2. Configure IAM role with Route53 permissions (no credential file needed)
# 3. Issue certificate
certbot certonly \
--dns-route53 \
-d example.com \
-d '*.example.com'
Overview: Why DNS-01 Enables Advanced ACME Use Cases
DNS-01 challenge validation unlocks ACME capabilities that HTTP-01 cannot provide. While HTTP-01 requires a publicly accessible web server on port 80, DNS-01 proves domain ownership through DNS infrastructure. That makes it the only ACME challenge method that supports wildcard certificates (*.example.com), and it works for hosts on private networks, behind corporate firewalls, and for services without web servers, as long as the ACME client itself can reach the CA and the DNS provider's API.
The DNS-01 advantage: Organizations operating in complex network environments—multi-cloud architectures, private networks, IoT deployments—need certificates for systems that cannot expose HTTP endpoints to the internet. DNS-01 moves the validation boundary from HTTP infrastructure to DNS infrastructure, which organizations already manage centrally.
Why This Belongs in ACME Client Operations
The ACME protocol specification defines DNS-01 (RFC 8555 Section 8.4); this guide covers operating it. Understanding the protocol doesn't prepare you for:
- DNS provider integration: Each DNS provider (Cloudflare, Route53, Azure DNS, Google Cloud DNS) has different API authentication, rate limits, and propagation characteristics
- Propagation delays: DNS changes aren't instantaneous; validation timing must account for DNS TTL, nameserver propagation, and ACME CA polling intervals
- Multi-domain wildcards: Issuing certificates for example.com AND *.example.com requires careful DNS record coordination
- Credential security: DNS API credentials grant zone modification powers; compromise enables domain hijacking, not just certificate issuance
- Split-horizon DNS: Internal and external DNS views complicate validation in enterprise environments
Real-world scenario: Your organization needs a wildcard certificate for *.internal.company.com to cover 50+ internal services. HTTP-01 won't work (internal domain, no public access). TLS-ALPN-01 won't work (requires TLS server on port 443 for each validation). DNS-01 is your only option—but your corporate DNS is managed by a separate team with strict change control procedures.
When to Use DNS-01
Use DNS-01 when you need:
- Wildcard certificates: *.example.com, *.api.example.com (only DNS-01 supports wildcards)
- Private network certificates: Internal services without internet access
- Non-HTTP services: Mail servers, VPN endpoints, IoT devices, APIs without web servers
- Multi-cloud deployments: Centralized certificate issuance regardless of infrastructure location
- Firewall-restricted environments: Systems behind NAT/firewalls that block HTTP-01 validation
Use HTTP-01 instead when:
- You already have public web servers running
- You don't need wildcard certificates
- You want faster validation (no DNS propagation delay)
- You want simpler automation (no DNS provider API integration)
Related Documentation
This page is part of the Operating ACME Clients section:
- Operating ACME Clients Overview - Section introduction and navigation
- X.509 Certificate Verification - Validating ACME-issued certificates
- Certbot Renewal Automation - Production renewal patterns
- DNS-01 Challenge Validation (this page) - DNS-based validation
- ACME Client Configuration (coming) - Multi-client configuration patterns
- Multi-Environment ACME (coming) - Development, staging, production setups
For DNS and infrastructure context:
- Multi-Cloud PKI - Certificate management across cloud providers
- Certificate Lifecycle Management - Complete lifecycle operations
For protocol understanding:
- ACME Protocol - Protocol specification including challenge types
- TLS Protocol - How certificates are used in TLS
Problem Statement
Challenge: Traditional HTTP-01 validation fails when:
- Servers operate behind firewalls/NAT without port 80/443 exposure to the internet
- Multiple servers share the same domain (load balancers, CDN origins)
- Wildcard certificates are required (*.example.com, *.api.example.com)
- Mail servers (SMTP, IMAP) or internal applications need certificates without running web servers
- Air-gapped or private network environments cannot receive HTTP-01 challenges
- Rate limiting makes per-server HTTP-01 validation impractical at scale
Solution: DNS-01 validation proves domain control through DNS infrastructure rather than HTTP endpoints. The ACME CA verifies you can create specific TXT records in your domain's DNS zone—demonstrating authoritative control over the domain.
Trade-offs: DNS-01 requires DNS provider API access (security consideration), introduces DNS propagation delays (60-300 seconds typical), and demands careful credential management. For most organizations, these trade-offs are worthwhile for wildcard and private network use cases.
Architecture
Validation Flow
┌─────────────┐  1. Request certificate   ┌─────────────────┐
│ ACME Client │──────────────────────────▶│     ACME CA     │
│  (Certbot)  │◀──────────────────────────│ (Let's Encrypt) │
└─────────────┘  2. DNS-01 challenge      └─────────────────┘
       │                                           │
       │ 3. Create TXT record                      │ 4. Verify TXT record
       │    (DNS API integration)                  │    (DNS lookup)
       ▼                                           ▼
┌───────────────────────────────────────────────────────────┐
│                 DNS Provider (Cloudflare)                 │
│        _acme-challenge.example.com  TXT  "<value>"        │
└───────────────────────────────────────────────────────────┘
5. On successful verification, the CA issues the certificate to the client
Flow Steps:
1. ACME client requests certificate, receives DNS-01 challenge
2. Client extracts TXT record value from challenge
3. Client uses DNS provider API to create _acme-challenge.example.com TXT "validation-token"
4. ACME CA queries DNS to verify TXT record exists
5. Upon successful verification, CA issues certificate
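Step 3 depends on knowing which record name the CA will query. As a small sketch (challenge_name is a hypothetical helper, not a Certbot command), the name is the requested domain with any leading "*." removed and "_acme-challenge." prepended:

```shell
#!/usr/bin/env bash
# Sketch: map a requested certificate name to its DNS-01 record name.
# challenge_name is a hypothetical helper used only for illustration.
challenge_name() {
  # Strip a leading "*." so wildcard names map to their base domain
  local domain="${1#\*.}"
  printf '_acme-challenge.%s\n' "$domain"
}

for d in example.com '*.example.com' api.company.com; do
  printf '%-18s -> %s\n' "$d" "$(challenge_name "$d")"
done
```

Note that example.com and *.example.com map to the same record name, which is why a combined request publishes two TXT values under one name.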
Components
ACME Client:
- certbot with DNS plugins (dns-cloudflare, dns-route53, dns-azure, etc.)
- acme.sh with 50+ DNS provider integrations
- lego (Go-based ACME client with extensive DNS support)
- dehydrated (Bash-based ACME client)
DNS Provider Requirements:
- API for automated TXT record creation/deletion
- Reasonable API rate limits (100+ requests/hour minimum)
- Fast DNS propagation (< 300 seconds ideal)
- Support for DNSSEC (optional but recommended)
Challenge Record Format: _acme-challenge.<domain> TXT "<base64url-encoded SHA-256 digest of the key authorization>"
Validation Window:
- DNS TTL: Typically 60-300 seconds for challenge records
- CA validation: The CA looks up the record after the client reports the challenge as ready. Let's Encrypt does not keep polling for a missing record, so the record must already be visible on the authoritative nameservers at that point
- Total validation time: 1-5 minutes typical (DNS propagation + CA verification)
Implementation
Manual DNS-01 Challenge
Single Domain (Interactive)
# Basic manual challenge
sudo certbot certonly \
--manual \
--preferred-challenges dns \
-d example.com
Process:
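1. Certbot displays: Please deploy a DNS TXT record under the name _acme-challenge.example.com with the following value, followed by the value to publish
2. Create that TXT record in your DNS provider's control panel
3. Verify the record is visible: dig TXT _acme-challenge.example.com @8.8.8.8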
4. Press Enter in Certbot after DNS propagation
5. Certificate issued to /etc/letsencrypt/live/example.com/
Wildcard Certificate (Multiple Domains)
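# Wildcard + apex domain
sudo certbot certonly \
--manual \
--preferred-challenges dns \
-d example.com \
-d '*.example.com'
Important: example.com and *.example.com are validated separately, so Certbot asks for two TXT records. Both use the name _acme-challenge.example.com with different values, and both values must be published at the same time. Names already covered by the wildcard (such as www.example.com) should not be listed separately; Let's Encrypt rejects them as redundant.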
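Because the wildcard and the apex domain share one record name, a quick check before pressing Enter is to confirm that every expected value is present in the DNS answer. The sketch below assumes `dig +short TXT` output (one quoted value per line) on standard input; has_all_values and the sample values are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if every expected TXT value appears in the answer.
# Reads `dig +short TXT <name>` output on stdin; expected values are arguments.
has_all_values() {
  local answers value
  answers="$(cat)"
  for value in "$@"; do
    # dig prints each TXT value wrapped in double quotes, one per line
    printf '%s\n' "$answers" | grep -Fxq -- "\"$value\"" || return 1
  done
  return 0
}

# Illustration with canned output instead of a live query
sample='"value-for-apex"
"value-for-wildcard"'
if printf '%s\n' "$sample" | has_all_values value-for-apex value-for-wildcard; then
  echo "both challenge values are published"
fi
```

Against a live zone, the same function could be fed from `dig +short TXT _acme-challenge.example.com`.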
Automated DNS-01 with Provider Plugins
Cloudflare Integration
Installation
# Install Cloudflare DNS plugin
sudo apt update
sudo apt install python3-certbot-dns-cloudflare
# Or via pip
pip install certbot-dns-cloudflare
Credential Configuration
# Create credentials file
sudo mkdir -p /etc/letsencrypt/cloudflare
sudo tee /etc/letsencrypt/cloudflare/credentials.ini << 'EOF'
# Cloudflare API token (recommended)
dns_cloudflare_api_token = your-cloudflare-api-token-here
# Or legacy API key (less secure)
# dns_cloudflare_email = YOUR_CLOUDFLARE_ACCOUNT_EMAIL
# dns_cloudflare_api_key = your-cloudflare-global-api-key
EOF
# Secure credentials
sudo chmod 600 /etc/letsencrypt/cloudflare/credentials.ini
sudo chown root:root /etc/letsencrypt/cloudflare/credentials.ini
Obtaining Cloudflare API Token (Scoped Permissions):
1. Cloudflare Dashboard → My Profile → API Tokens
2. Create Token → Edit Zone DNS template
3. Permissions: Zone:DNS:Edit for specific zones
4. Copy token (only shown once)
Certificate Issuance
# Single domain
sudo certbot certonly \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini \
-d example.com
# Wildcard certificate
sudo certbot certonly \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini \
--dns-cloudflare-propagation-seconds 30 \
-d example.com \
-d '*.example.com'
Propagation Tuning:
# Cloudflare DNS propagates quickly (15-30 seconds typical)
--dns-cloudflare-propagation-seconds 30
# For slower DNS providers, increase wait time
--dns-cloudflare-propagation-seconds 120
Route53 Integration (AWS)
Installation
Install the certbot-dns-route53 plugin, for example with snap install certbot-dns-route53.
IAM Policy for DNS-01
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"route53:ListHostedZones",
"route53:GetChange"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"route53:ChangeResourceRecordSets"
],
"Resource": "arn:aws:route53:::hostedzone/ZXXXXXXXXXXXXX"
}
]
}
Authentication Methods:
# Method 1: IAM instance profile (recommended for EC2)
sudo certbot certonly --dns-route53 -d example.com
# Method 2: AWS credentials file (sudo -E preserves the exported variable)
export AWS_CONFIG_FILE=/etc/letsencrypt/aws/config
sudo -E certbot certonly --dns-route53 -d example.com
# Method 3: Environment variables
export AWS_ACCESS_KEY_ID=AKIAXXXXXXXX
export AWS_SECRET_ACCESS_KEY=xxxxx
sudo -E certbot certonly --dns-route53 -d example.com
Multi-Account Route53
# Specify profile for cross-account DNS
AWS_PROFILE=dns-account certbot certonly \
--dns-route53 \
-d example.com
Azure DNS Integration
Installation
Install the certbot-dns-azure plugin, for example with snap install certbot-dns-azure.
Service Principal Setup
# Create service principal
az ad sp create-for-rbac \
--name certbot-dns-azure \
--role "DNS Zone Contributor" \
--scopes /subscriptions/SUBSCRIPTION_ID/resourceGroups/RG_NAME/providers/Microsoft.Network/dnszones/example.com
# Output provides:
# - appId (client_id)
# - password (client_secret)
# - tenant
Configuration
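# /etc/letsencrypt/azure/credentials.ini
# certbot-dns-azure is a third-party plugin; key names can differ between plugin versions, so check its README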
dns_azure_sp_client_id = xxxxx-xxxx-xxxx-xxxx-xxxxx
dns_azure_sp_client_secret = your-client-secret
dns_azure_tenant_id = xxxxx-xxxx-xxxx-xxxx-xxxxx
dns_azure_subscription_id = xxxxx-xxxx-xxxx-xxxx-xxxxx
dns_azure_resource_group = your-resource-group
Certificate Issuance
sudo certbot certonly \
--dns-azure \
--dns-azure-credentials /etc/letsencrypt/azure/credentials.ini \
-d example.com -d '*.example.com'
Google Cloud DNS Integration
Installation
Install the certbot-dns-google plugin, for example with snap install certbot-dns-google.
Service Account Setup
# Create service account
gcloud iam service-accounts create certbot-dns \
--display-name "Certbot DNS-01 Challenge"
# Grant DNS admin role
gcloud projects add-iam-policy-binding PROJECT_ID \
--member serviceAccount:certbot-dns@PROJECT_ID.iam.gserviceaccount.com \
--role roles/dns.admin
# Create key file
gcloud iam service-accounts keys create /etc/letsencrypt/gcp/credentials.json \
--iam-account certbot-dns@PROJECT_ID.iam.gserviceaccount.com
Certificate Issuance
sudo certbot certonly \
--dns-google \
--dns-google-credentials /etc/letsencrypt/gcp/credentials.json \
-d example.com -d '*.example.com'
Generic DNS Provider (acme.sh)
acme.sh supports 50+ DNS providers through a unified interface.
# Install acme.sh
curl https://get.acme.sh | sh -s email=YOUR_EMAIL_ADDRESS
source ~/.acme.sh/acme.sh.env
# Example: Namecheap
export NAMECHEAP_USERNAME="your-username"
export NAMECHEAP_API_KEY="your-api-key"
export NAMECHEAP_SOURCEIP="your-server-ip"
acme.sh --issue --dns dns_namecheap \
-d example.com -d '*.example.com'
# Example: DigitalOcean
export DO_API_KEY="your-digitalocean-api-token"
acme.sh --issue --dns dns_dgon \
-d example.com -d '*.example.com'
# List the DNS API hooks bundled with acme.sh
ls ~/.acme.sh/dnsapi/
Enterprise Automation Pattern
Centralized Certificate Management
Multi-Domain Automation Script
#!/bin/bash
set -euo pipefail
# Enterprise DNS-01 automation script
DOMAINS_FILE="/etc/ssl-automation/domains.conf"
DNS_PROVIDER="cloudflare"
CERT_DIR="/etc/letsencrypt/live"
LOG_FILE="/var/log/certbot/dns01-automation.log"
exec > >(tee -a "$LOG_FILE") 2>&1
echo "=== DNS-01 Certificate Automation Started: $(date) ==="
# Read domains from configuration file
# Format: domain_name|cert_name|additional_sans
while IFS='|' read -r domain cert_name sans; do
  echo "Processing: $domain"
  # Build domain arguments as an array so wildcard names are never glob-expanded
  domain_args=(-d "$domain")
  if [ -n "$sans" ]; then
    IFS=',' read -ra san_list <<< "$sans"
    for san in "${san_list[@]}"; do
      domain_args+=(-d "$san")
    done
  fi
  # Issue/renew certificate (stdin is redirected so certbot cannot consume the domain list)
  if certbot certonly \
    --non-interactive \
    --agree-tos \
    --email YOUR_EMAIL_ADDRESS \
    --dns-${DNS_PROVIDER} \
    --dns-${DNS_PROVIDER}-credentials /etc/ssl-automation/dns-credentials.ini \
    --dns-${DNS_PROVIDER}-propagation-seconds 60 \
    "${domain_args[@]}" \
    --cert-name "${cert_name}" \
    --deploy-hook "/etc/ssl-automation/deploy-${cert_name}.sh" < /dev/null; then
    echo "Success: $domain"
  else
    echo "Failed: $domain"
    # Send alert
    echo "Certificate issuance failed for $domain" | \
      mail -s "DNS-01 Certificate Failure" YOUR_ALERT_EMAIL_ADDRESS
  fi
  # Rate limit: space out requests
  sleep 5
done < "$DOMAINS_FILE"
echo "=== DNS-01 Certificate Automation Completed: $(date) ==="
Domains Configuration (/etc/ssl-automation/domains.conf)
example.com|example-wildcard|*.example.com
api.company.com|api-wildcard|*.api.company.com
internal.corp.net|internal-services|*.internal.corp.net
Multi-Environment DNS Management
Terraform DNS Provider Integration
# Terraform configuration for DNS-01 prerequisites
provider "cloudflare" {
api_token = var.cloudflare_api_token
}
resource "cloudflare_zone" "example" {
zone = "example.com"
}
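# Permission group IDs referenced by the token policy
data "cloudflare_api_token_permission_groups" "all" {}
# API token for Certbot with limited scope
resource "cloudflare_api_token" "certbot_dns" {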
name = "Certbot DNS-01 Challenge"
policy {
permission_groups = [
data.cloudflare_api_token_permission_groups.all.zone["DNS Write"],
]
resources = {
"com.cloudflare.api.account.zone.${cloudflare_zone.example.id}" = "*"
}
}
}
output "certbot_dns_token" {
value = cloudflare_api_token.certbot_dns.value
sensitive = true
}
Ansible Playbook for Multi-Server Deployment
---
- name: Issue certificates via DNS-01 across environments
  hosts: certificate_servers
  become: yes
  vars:
    ssl_certificates:
      - name: production-wildcard
        domain: "*.prod.example.com"
        sans: "prod.example.com"
        env: production
      - name: staging-wildcard
        domain: "*.staging.example.com"
        sans: "staging.example.com"
        env: staging
  tasks:
    - name: Install Certbot DNS plugin
      apt:
        name: python3-certbot-dns-cloudflare
        state: present
    - name: Deploy DNS credentials
      template:
        src: cloudflare-credentials.ini.j2
        dest: /etc/letsencrypt/cloudflare/credentials.ini
        mode: '0600'
        owner: root
        group: root
    - name: Issue certificates
      command: >
        certbot certonly
        --dns-cloudflare
        --dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini
        --dns-cloudflare-propagation-seconds 30
        -d {{ item.domain }}
        -d {{ item.sans }}
        --cert-name {{ item.name }}
        --non-interactive
        --agree-tos
        --email YOUR_EMAIL_ADDRESS
      loop: "{{ ssl_certificates }}"
      when: inventory_hostname == groups['certificate_servers'][0]
    - name: Deploy certificates to application servers
      synchronize:
        src: "/etc/letsencrypt/live/{{ item.name }}/"
        dest: "/etc/ssl/{{ item.name }}/"
        mode: push
        # live/ holds symlinks into archive/; copy the files they point to
        copy_links: yes
      delegate_to: "{{ groups['certificate_servers'][0] }}"
      loop: "{{ ssl_certificates }}"
Kubernetes cert-manager DNS-01
ClusterIssuer with DNS-01
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: YOUR_EMAIL_ADDRESS
    privateKeySecretRef:
      name: letsencrypt-dns01-account-key
    solvers:
      # Cloudflare DNS-01 solver
      - dns01:
          cloudflare:
            # An account email is only needed with the legacy global API key
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
        selector:
          # A zone entry also matches names below it, including wildcards
          dnsZones:
            - 'example.com'
      # Route53 DNS-01 solver for different domain
      - dns01:
          route53:
            region: us-east-1
            hostedZoneID: Z1234567890ABC
        selector:
          dnsZones:
            - 'aws-hosted.example.com'
Wildcard Certificate Resource
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-example-com
  namespace: default
spec:
  secretName: wildcard-example-com-tls
  issuerRef:
    name: letsencrypt-dns01
    kind: ClusterIssuer
  dnsNames:
    - '*.example.com'
    - 'example.com'
Common Pitfalls
1. DNS Propagation Timing Issues
Problem: ACME validation fails because DNS changes haven't propagated globally
# WRONG - insufficient propagation wait time
certbot certonly --dns-cloudflare \
--dns-cloudflare-propagation-seconds 10 # Too short!
# Error: Incorrect TXT record
Solution: Verify DNS propagation before validation
# Check DNS propagation across multiple nameservers
dig TXT _acme-challenge.example.com @8.8.8.8 # Google DNS
dig TXT _acme-challenge.example.com @1.1.1.1 # Cloudflare DNS
dig TXT _acme-challenge.example.com @208.67.222.222 # OpenDNS
# Use appropriate propagation delay for your DNS provider
# Cloudflare: 20-30 seconds
# Route53: 30-60 seconds
# Traditional DNS: 60-120 seconds
certbot certonly --dns-cloudflare \
--dns-cloudflare-propagation-seconds 60 # Safe default
DNS Propagation Check Script
#!/bin/bash
RECORD="_acme-challenge.example.com"
EXPECTED_VALUE="validation-token"
NAMESERVERS=("8.8.8.8" "1.1.1.1" "208.67.222.222")
for ns in "${NAMESERVERS[@]}"; do
  result=$(dig +short TXT "$RECORD" @"$ns")
  # Match line by line: wildcard + apex requests publish two values under one name
  if grep -Fxq -- "\"$EXPECTED_VALUE\"" <<< "$result"; then
    echo "✓ Propagated to $ns"
  else
    echo "✗ Not yet on $ns (got: $result)"
  fi
done
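The check above is one-shot. In automation it is often more useful to poll until the record appears or a deadline passes. The following is a sketch under assumptions: wait_for_txt and lookup_txt are hypothetical helpers, the RESOLVER variable is invented for this example, and dig must be installed for the default lookup.

```shell
#!/usr/bin/env bash
# Sketch: poll for a TXT value until it appears or a timeout expires.

# lookup_txt is kept separate so it can be swapped out (for example in tests).
lookup_txt() {
  dig +short TXT "$1" @"${RESOLVER:-8.8.8.8}"
}

# wait_for_txt NAME VALUE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_txt() {
  local name="$1" expected="$2" timeout="${3:-120}" interval="${4:-10}"
  local waited=0
  while :; do
    if lookup_txt "$name" | grep -Fxq -- "\"$expected\""; then
      return 0
    fi
    # Give up once the time budget is spent
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example call (performs live DNS queries, so it is left commented out):
# wait_for_txt _acme-challenge.example.com "validation-token" 120 10
```

A public resolver is only a proxy for what the CA sees. Pointing RESOLVER at one of the zone's authoritative nameservers gives a closer approximation.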
2. DNS Plugin Installation and Version Conflicts
Problem: Missing or incompatible DNS plugin versions
# Check installed plugins
certbot plugins
# Output: No DNS plugins found
# Common issue: Plugin not installed
sudo certbot certonly --dns-cloudflare ...
# Error: certbot: error: unrecognized arguments: --dns-cloudflare
Solution: Install and verify DNS plugins
# Install specific plugin
sudo apt install python3-certbot-dns-cloudflare
# Or via pip (for latest version)
pip install --upgrade pip
pip install certbot-dns-cloudflare --force-reinstall
# Verify plugin is available
certbot plugins | grep cloudflare
# Output: dns-cloudflare
# Check plugin version
pip show certbot-dns-cloudflare
Version Compatibility Matrix
3. DNS API Permission and Credential Issues
Problem: DNS API credentials lack necessary permissions
# Error messages indicating permission issues:
# - "Authentication failed"
# - "Forbidden: You do not have permission"
# - "Access denied to zone"
Solution: Verify and scope DNS API permissions correctly
Cloudflare API Token Permissions:
Required:
- Zone:DNS:Edit (for specific zones)
Optional but recommended:
- Zone:Zone:Read (to list zones)
AWS Route53 IAM Policy (Minimum permissions):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"route53:GetChange",
"route53:ListHostedZones"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"route53:ChangeResourceRecordSets"
],
"Resource": "arn:aws:route53:::hostedzone/ZXXXXX"
}
]
}
Credential File Security:
# WRONG - world-readable credentials
chmod 644 /etc/letsencrypt/dns-credentials.ini
# Security risk: Any user can read DNS API credentials
# CORRECT - restricted permissions
chmod 600 /etc/letsencrypt/dns-credentials.ini
chown root:root /etc/letsencrypt/dns-credentials.ini
# Verify permissions
ls -la /etc/letsencrypt/dns-credentials.ini
# -rw------- 1 root root ... dns-credentials.ini
4. Rate Limiting from DNS Provider APIs
Problem: DNS provider API rate limits exceeded
# Cloudflare: 1200 requests/5 minutes per user
# Route53: 5 API requests/second (steady state)
# Google Cloud DNS: 400 write requests/minute per project
Solution: Implement rate limiting and backoff
#!/bin/bash
# Rate-limited certificate issuance
DOMAINS=("site1.example.com" "site2.example.com" "site3.example.com")
DELAY_BETWEEN_REQUESTS=10 # seconds
for domain in "${DOMAINS[@]}"; do
echo "Issuing certificate for $domain"
certbot certonly --dns-cloudflare ... -d "$domain"
# Wait between requests to avoid rate limits
sleep $DELAY_BETWEEN_REQUESTS
done
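The fixed delay above spaces requests out but does not react to failures. One common complement is a retry wrapper with exponential backoff for transient API errors. This is a generic sketch; retry_with_backoff is a hypothetical helper and the attempt count and delays are arbitrary.

```shell
#!/usr/bin/env bash
# Sketch: retry a command, doubling the wait after each failed attempt.
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
  return 0
}

# Example (commented out because it would contact the CA):
# retry_with_backoff 4 30 certbot certonly --dns-cloudflare -d site1.example.com
```

With 4 attempts and a 30-second initial delay, the waits are 30, 60, and 120 seconds. Keep the attempt count low when retrying against the production CA, since failed validations count toward its own limits.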
5. Split-Horizon DNS Challenges
Problem: Internal DNS returns different TXT records than external DNS
Scenario:
- Internal DNS: Used by internal applications
- External DNS: Authoritative for internet
- ACME CA validates against external DNS
- Internal systems may see stale or different records
Solution: Ensure ACME challenge records propagate to external DNS
# Verify external DNS resolution
dig TXT _acme-challenge.example.com @8.8.8.8 # External resolver
# If using split-horizon, ensure challenge records exist in BOTH:
# 1. External zone (for ACME validation)
# 2. Internal zone (for consistency)
# Or configure internal DNS to forward _acme-challenge queries externally
6. Let's Encrypt Rate Limits
Problem: Exceeding Let's Encrypt rate limits during testing
Rate Limits (per domain per week):
- 50 certificates per registered domain
- 5 duplicate certificates (same exact SANs)
Solution: Use staging environment for testing
# WRONG - testing against production
for i in {1..10}; do
certbot certonly --dns-cloudflare -d test$i.example.com
done
# Risk: Approaching rate limit of 50 certs/week
# CORRECT - test with staging first
certbot certonly \
--staging \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini \
-d test.example.com
# Staging server URL (manual specification)
--server https://acme-staging-v02.api.letsencrypt.org/directory
# Once validated, switch to production
certbot certonly \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini \
-d production.example.com
7. Wildcard + Apex Domain Confusion
Problem: Misunderstanding wildcard certificate scope
# Common misconception:
# "*.example.com" covers both "example.com" AND all subdomains
# Reality:
# "*.example.com" covers: foo.example.com, bar.example.com
# "*.example.com" does NOT cover: example.com (apex domain)
Solution: Explicitly request both wildcard and apex
# WRONG - apex domain not covered
certbot certonly --dns-cloudflare -d '*.example.com'
# CORRECT - explicitly include both the apex domain and the wildcard
# (a comment after a trailing backslash would break the line continuation)
certbot certonly --dns-cloudflare \
-d example.com \
-d '*.example.com'
# Also works for multiple levels
# *.api.example.com covers foo.api.example.com but not api.example.com
certbot certonly --dns-cloudflare \
-d api.example.com \
-d '*.api.example.com'
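The matching rule behind this pitfall can be stated in a few lines. The sketch below is a simplified model of wildcard matching (exactly one extra label in place of the asterisk), not the full certificate name matching logic of any TLS library; wildcard_covers is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: does a certificate name (possibly a wildcard) cover a hostname?
# wildcard_covers CERT_NAME HOSTNAME
wildcard_covers() {
  local pattern="$1" host="$2" base label
  case "$pattern" in
    \*.*)
      base="${pattern#\*.}"   # "*.example.com" -> "example.com"
      label="${host%%.*}"     # first label of the hostname
      # The wildcard stands for exactly one non-empty label
      [ -n "$label" ] && [ "$host" = "$label.$base" ]
      ;;
    *)
      [ "$host" = "$pattern" ]
      ;;
  esac
}

for host in foo.example.com example.com a.b.example.com; do
  if wildcard_covers '*.example.com' "$host"; then
    echo "$host: covered by *.example.com"
  else
    echo "$host: NOT covered by *.example.com"
  fi
done
```

Only foo.example.com is reported as covered: the apex has no extra label, and a.b.example.com has one too many.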
Best Practices
1. Security Hardening
Principle of Least Privilege for DNS API Access
# WRONG - global API key with full account access
dns_cloudflare_api_key = your-global-api-key
# CORRECT - scoped API token for specific zones only
dns_cloudflare_api_token = token-with-dns-edit-for-example-com-only
Cloudflare Scoped Token:
1. Create token with ONLY Zone:DNS:Edit permission
2. Scope to specific zones: example.com, api.example.com
3. Set IP restrictions if possible (certificate server IPs)
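4. Set an expiration date that matches your rotation schedule (see Credential Rotation)
AWS Route53 Resource-Based Policy: restrict route53:ChangeResourceRecordSets to the ARN of the specific hosted zone, as in the IAM policy shown earlier, rather than granting it on all resources.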
Credential Rotation
# Rotate DNS API credentials quarterly
# 1. Generate new API token
# 2. Update credential files
# 3. Test certificate renewal
# 4. Revoke old token
# 5. Document rotation in audit log
Audit Trail for DNS Changes
# Log all DNS-01 operations
logger -t certbot-dns01 "Certificate requested for $DOMAIN by $USER"
# Enable DNS provider audit logging
# Cloudflare: Audit Logs in Dashboard
# Route53: CloudTrail for route53:ChangeResourceRecordSets
# Azure DNS: Activity Log for Microsoft.Network/dnszones/TXT/write
2. Automation Excellence
Robust Renewal Script with Error Handling
#!/bin/bash
set -euo pipefail
# Production DNS-01 renewal automation
LOG_FILE="/var/log/certbot/dns01-renewal-$(date +%Y%m%d).log"
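ERROR_EMAIL="YOUR_ALERT_EMAIL_ADDRESS"
SUCCESS_WEBHOOK="https://monitoring.example.com/webhook/cert-renewal"
# Fail early if the paging token used in the failure path is missing (set -u would otherwise abort mid-alert)
: "${PAGERDUTY_TOKEN:?PAGERDUTY_TOKEN must be set}"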
exec > >(tee -a "$LOG_FILE") 2>&1
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*"
}
log "Starting DNS-01 certificate renewal"
# Pre-flight checks
if ! command -v certbot &> /dev/null; then
log "ERROR: certbot not found"
exit 1
fi
if ! certbot plugins | grep -q dns-cloudflare; then
log "ERROR: dns-cloudflare plugin not installed"
exit 1
fi
if [ ! -f /etc/letsencrypt/cloudflare/credentials.ini ]; then
log "ERROR: DNS credentials file missing"
exit 1
fi
# Attempt renewal
if certbot renew \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini \
--deploy-hook "/usr/local/bin/certificate-deploy.sh" \
--quiet; then
log "Renewal successful"
# Notify monitoring system
curl -X POST "$SUCCESS_WEBHOOK" \
-H "Content-Type: application/json" \
-d "{\"status\":\"success\",\"timestamp\":\"$(date -Iseconds)\"}"
else
log "ERROR: Renewal failed"
# Send alert email
echo "DNS-01 certificate renewal failed. Check logs: $LOG_FILE" | \
mail -s "ALERT: Certificate Renewal Failed" "$ERROR_EMAIL"
# Page on-call engineer
curl -X POST "https://pagerduty.example.com/api/incidents" \
-H "Authorization: Token token=$PAGERDUTY_TOKEN" \
-d "{\"incident\":{\"type\":\"incident\",\"title\":\"DNS-01 renewal failure\"}}"
exit 1
fi
log "DNS-01 renewal completed successfully"
Idempotent Renewal Automation
# Design renewals to be idempotent (safe to run multiple times)
# Certbot automatically skips certificates >30 days from expiry
# Safe to run frequently
0 */12 * * * /usr/local/bin/certbot-dns01-renewal.sh
3. Monitoring and Observability
Certificate Expiration Monitoring
#!/bin/bash
# Monitor certificate expiration
DOMAINS=(
"example.com"
"*.example.com"
"api.example.com"
)
for domain in "${DOMAINS[@]}"; do
cert_path="/etc/letsencrypt/live/${domain/\*./wildcard.}/cert.pem"
if [ -f "$cert_path" ]; then
expiry=$(openssl x509 -in "$cert_path" -noout -enddate | cut -d= -f2)
expiry_epoch=$(date -d "$expiry" +%s)
now_epoch=$(date +%s)
days_left=$(( ($expiry_epoch - $now_epoch) / 86400 ))
echo "$domain: $days_left days until expiry"
if [ $days_left -lt 30 ]; then
echo "WARNING: $domain expires in $days_left days"
# Trigger alert
fi
else
echo "WARNING: Certificate not found for $domain"
fi
done
Prometheus Metrics Export
# Export certificate metrics for Prometheus
cat > /var/lib/node_exporter/textfile_collector/certificates.prom << EOF
# HELP ssl_certificate_expiry_days Days until SSL certificate expires
# TYPE ssl_certificate_expiry_days gauge
ssl_certificate_expiry_days{domain="example.com",type="dns01"} 89
ssl_certificate_expiry_days{domain="*.example.com",type="dns01"} 89
EOF
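The heredoc above writes fixed numbers. In a real exporter the values would be computed per certificate and the file replaced atomically, so the textfile collector never reads a half-written file. A sketch under assumptions follows: days_left, metric_line, and write_metrics are hypothetical helpers, and the caller supplies expiry times as epoch seconds (for example from the expiration monitoring script).

```shell
#!/usr/bin/env bash
# Sketch: build the metrics file from computed values and swap it in atomically.

# days_left EXPIRY_EPOCH NOW_EPOCH -> whole days remaining (integer division)
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# metric_line DOMAIN DAYS -> one Prometheus exposition line
metric_line() {
  printf 'ssl_certificate_expiry_days{domain="%s",type="dns01"} %d\n' "$1" "$2"
}

# write_metrics OUTPUT_FILE: reads "domain days" pairs on stdin
write_metrics() {
  local out="$1" tmp domain days
  tmp="$(mktemp "${out}.XXXXXX")"
  {
    echo '# HELP ssl_certificate_expiry_days Days until SSL certificate expires'
    echo '# TYPE ssl_certificate_expiry_days gauge'
    while read -r domain days; do
      metric_line "$domain" "$days"
    done
  } > "$tmp"
  chmod 644 "$tmp"   # mktemp creates the file as 0600; the exporter must be able to read it
  mv "$tmp" "$out"   # rename within one directory replaces the file in a single step
}

# Illustration: a certificate expiring 89 days from now
now=$(date +%s)
metric_line "example.com" "$(days_left $((now + 89 * 86400)) "$now")"
```

A cron job could pipe one "domain days" line per certificate into write_metrics with the collector path used above as the output file.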
Grafana Dashboard Queries
# Alert when certificates expire within 7 days
ssl_certificate_expiry_days{type="dns01"} < 7
# Renewal success rate
rate(certbot_renewal_success_total[1h]) /
rate(certbot_renewal_attempts_total[1h])
4. DNS Provider Selection Criteria
Evaluation Matrix:
| Provider | API Quality | Propagation Speed | Rate Limits | Cost | Enterprise Features |
|---|---|---|---|---|---|
| Cloudflare | Excellent | 15-30s | 1200 req/5min | Free | DNSSEC, API tokens |
| Route53 | Excellent | 30-60s | 5 req/s | ~$0.50/zone/mo | IAM integration |
| Google DNS | Good | 30-60s | 400 req/min | $0.20/zone/mo | GCP integration |
| Azure DNS | Good | 60-120s | 500 req/5min | $0.50/zone/mo | AD integration |
| Traditional DNS | Varies | 120-300s | Varies | Varies | Often limited APIs |
Selection Criteria:
1. API Reliability: 99.9%+ uptime for API endpoints
2. Propagation Speed: < 60 seconds preferred for automated workflows
3. Rate Limits: Support for expected certificate volume
4. Security Features: API token scoping, audit logs, DNSSEC
5. Cost: Free tier availability for small deployments
Multi-Provider Strategy:
# Primary: Cloudflare (fast, reliable)
certbot certonly --dns-cloudflare -d primary.example.com
# Backup: Route53 (if Cloudflare unavailable)
certbot certonly --dns-route53 -d backup.example.com
# Document failover procedures in runbook
5. Production Deployment Patterns
1. Test in Staging Environment First
# Stage 1: Staging server with Let's Encrypt staging
certbot certonly \
--staging \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials-staging.ini \
-d staging.example.com
# Stage 2: Production server with Let's Encrypt staging (validate DNS setup)
certbot certonly \
--staging \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini \
-d production.example.com
# Stage 3: Production server with Let's Encrypt production
certbot certonly \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/letsencrypt/cloudflare/credentials.ini \
-d production.example.com
2. Gradual Rollout to Non-Production First
# Week 1: Development environment
# Week 2: Staging environment
# Week 3: Production (canary - 10% of domains)
# Week 4: Production (full rollout)
3. Rollback Plan
#!/bin/bash
# Certificate rollback procedure
DOMAIN="$1"
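BACKUP_DIR="/etc/letsencrypt/backup-$(date +%Y%m%d)"
# Backup current certificate before renewal
# -L copies the files behind the live/ symlinks instead of the symlinks themselves
mkdir -p "$BACKUP_DIR"
cp -rL "/etc/letsencrypt/live/$DOMAIN" "$BACKUP_DIR/"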
# If renewal fails or causes issues, restore:
# cp -r $BACKUP_DIR/$DOMAIN /etc/letsencrypt/live/
# systemctl reload nginx
4. Documentation and Runbooks
# DNS-01 Renewal Runbook
## Normal Operations
- Automated renewal via cron (daily at 2 AM)
- DNS-01 challenge via Cloudflare API
- Certificates deployed to /etc/letsencrypt/live/
- Services auto-reloaded via deploy hooks
## Manual Renewal (Emergency)
1. SSH to cert-server.example.com
2. Run: sudo /usr/local/bin/certbot-dns01-manual.sh example.com
3. Verify DNS propagation: dig TXT _acme-challenge.example.com @8.8.8.8
4. Validate certificate: openssl x509 -in /etc/letsencrypt/live/example.com/cert.pem -noout -text
## Troubleshooting
### DNS Propagation Issues
- Check Cloudflare DNS dashboard for TXT records
- Query multiple nameservers (8.8.8.8, 1.1.1.1, 208.67.222.222)
- Increase --dns-cloudflare-propagation-seconds to 120
### API Authentication Failures
- Verify API token in /etc/letsencrypt/cloudflare/credentials.ini
- Check token permissions in Cloudflare dashboard
- Rotate token if compromised
### Rate Limit Exceeded
- Wait 1 week for rate limit reset
- Use staging server for testing
- Review automation frequency
## Escalation
- Primary: YOUR_ONCALL_EMAIL_ADDRESS
- Secondary: Platform team Slack channel
- PagerDuty: DNS-01 renewal failure alerts
Operational Checklist
Before deploying DNS-01 automation to production:
- [ ] Select DNS provider with reliable API and fast propagation
- [ ] Create scoped API credentials (minimum required permissions)
- [ ] Install appropriate Certbot DNS plugin or acme.sh
- [ ] Test manual DNS-01 challenge with single domain
- [ ] Verify DNS propagation across multiple nameservers
- [ ] Test wildcard certificate issuance (if needed)
- [ ] Configure appropriate propagation delay for DNS provider
- [ ] Secure credential files (chmod 600, root ownership)
- [ ] Implement automated renewal with error handling
- [ ] Set up monitoring for certificate expiration
- [ ] Document DNS provider API limits and behavior
- [ ] Create rollback procedures for failed renewals
- [ ] Test renewal in staging environment first
- [ ] Configure alerting for renewal failures
- [ ] Document emergency manual renewal procedures
- [ ] Implement credential rotation schedule
- [ ] Enable DNS provider audit logging
- [ ] Test rate limiting behavior
- [ ] Validate certificates after issuance
- [ ] Update runbook with DNS-01 specific troubleshooting
Related Documentation
ACME Operations:
- Operating ACME Clients Overview - Section navigation
- X.509 Certificate Verification - Certificate validation
- Certbot Renewal Automation - Renewal patterns
- ACME Challenge Validation (coming) - HTTP-01, TLS-ALPN-01 alternatives
Broader Operations:
- Certificate Lifecycle Management - Complete lifecycle
- Renewal Automation - Platform-agnostic strategies
- Monitoring and Alerting - Monitoring frameworks
Implementation:
- Multi-Cloud PKI - DNS-01 in multi-cloud environments
- ACME Protocol Implementation - Building ACME servers
Protocol:
- ACME Protocol - Protocol specification (RFC 8555)
- TLS Protocol - TLS and certificate usage
Troubleshooting:
- Common Misconfigurations - Configuration issues
- Chain Validation Errors - Certificate chain problems
This comprehensive guide provides production-ready DNS-01 challenge implementation patterns for wildcard certificates, private networks, and enterprise automation scenarios across diverse DNS providers and infrastructure environments.