Done correctly, an SSL renewal looks like nothing. Connections continue. Long-polling clients don't drop. The new certificate quietly takes over while the old one is still valid, and visitors never know it happened. Done incorrectly, you get either an expired-cert outage on day 91, or a 30-second window where every new connection fails because the server hasn't reloaded its config.
This guide covers the real techniques: how to automate, how to reload without dropping connections, how to keep an overlap window so you have time to roll back, and how to verify the new cert actually took.
The Three Failure Modes
Before fixing anything, it helps to know what can break:
- Forgotten renewal. No automation, the human responsible left, calendar reminder didn't fire, certificate expires. The most common mode and the easiest to prevent.
- Issuance failure. Automation runs, but the CA challenge fails — DNS provider changed APIs, port 80 got firewalled, the ACME client broke during a server update. Cert isn't renewed and you don't notice unless you have monitoring.
- Bad reload. New cert was issued and saved correctly, but the running web server is still using the old cert in memory. The fix is a reload — but a hard restart drops connections. Without a graceful reload, you get a brief window of failed connections.
The safe-renewal strategy below addresses all three.
Step 1: Automate the Issuance
The single biggest reduction in downtime risk is moving from manual renewal to automated. Manual renewals get forgotten; automated renewals, paired with monitoring, fail loudly and predictably.
For Let's Encrypt: install Certbot (or acme.sh, or your platform's equivalent) and let it manage renewal on a timer. In a typical Certbot install the timer runs twice per day and renews any certificate that is within 30 days of expiry. If a renewal fails on Tuesday morning, it tries again that evening and twice more on Wednesday, so by the time the cert is genuinely close to expiring you've had roughly 60 attempts.
For paid CAs: many now support ACME or have automation tools. Check your CA's docs — DigiCert, Sectigo, GlobalSign, and Entrust all have automation options today. If your CA doesn't, switch CAs or wrap manual renewal with calendar reminders at 60, 30, 14, and 7 days before expiry.
Either way, after every successful renewal, the script should run a quick sanity check: a simple openssl x509 -checkend against the new cert file to confirm it has the expected lifetime left, plus a check against the live site (for example, a curl against SSL Checker's endpoint) to confirm it is actually serving the new certificate.
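The file-level part of that sanity check could be sketched roughly as follows. It wraps openssl x509 -checkend in a small function; the name cert_valid_for_days and the example path are illustrative assumptions, not part of Certbot or OpenSSL.

```shell
# Sketch: succeed only if the certificate file stays valid for at least N more days.
# cert_valid_for_days is a hypothetical helper name; adjust the path to your layout.
cert_valid_for_days() {     # $1 = PEM cert file, $2 = days of validity required
    local cert="$1" days="$2"
    if [ ! -r "$cert" ]; then
        echo "cannot read certificate: $cert" >&2
        return 2
    fi
    # -checkend exits 0 if the cert is still valid N seconds from now, 1 otherwise.
    openssl x509 -in "$cert" -noout -checkend "$(( days * 86400 ))" >/dev/null
}

# Example: a freshly renewed 90-day cert should comfortably pass a 60-day check.
#   cert_valid_for_days /etc/letsencrypt/live/example.com/cert.pem 60 \
#       || echo "renewal did not produce a fresh cert" >&2
```

This only inspects the file on disk. Whether the server is serving that file is a separate question, covered in Step 4.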
Step 2: Use Graceful Reloads, Not Restarts
Modern web servers can swap to a new certificate without dropping in-flight connections. The technique is called a graceful reload.
nginx: sudo nginx -s reload or sudo systemctl reload nginx. nginx forks new worker processes that pick up the new config, while the old workers finish handling their existing connections and then exit. No connection is dropped.
Apache: sudo apachectl graceful or sudo systemctl reload apache2. Apache's event and worker MPMs (mpm_event, mpm_worker) handle this similarly: old connections are served by old children, new connections by new children.
HAProxy: uses a soft reload via the -sf flag, signalling old workers to drain. Most distros' systemd unit files use this automatically on reload.
Caddy: reloads automatically. Caddy's ACME handling is built in — when it issues a new cert, it swaps it into memory without restarting.
Avoid systemctl restart for renewal. Restart kills the process and starts fresh, dropping all in-flight connections. Reload is what you want, every time.
If your ACME client doesn't trigger a reload after issuance, add it as a deploy hook. Certbot has a --deploy-hook flag that runs a command after every successful renewal:
sudo certbot renew --deploy-hook "systemctl reload nginx"
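A reload against a broken config does nothing useful, so a slightly more defensive hook tests the config first. The sketch below assumes nginx under systemd; the function name and the NGINX_TEST_CMD / NGINX_RELOAD_CMD variables are hypothetical knobs added here so the logic can be dry-run without touching a real server.

```shell
# Sketch of a deploy hook body: test the nginx config, then reload gracefully.
# NGINX_TEST_CMD and NGINX_RELOAD_CMD are hypothetical overrides for dry runs.
reload_nginx_safely() {
    # Left unquoted on purpose so "nginx -t" splits into a command and its argument.
    if ! ${NGINX_TEST_CMD:-nginx -t}; then
        echo "nginx config test failed; leaving the running workers alone" >&2
        return 1
    fi
    ${NGINX_RELOAD_CMD:-systemctl reload nginx}
}

# Dry run without touching the real server:
#   NGINX_TEST_CMD=true NGINX_RELOAD_CMD="echo would reload" reload_nginx_safely
```

One way to use it is to save a script that defines and then calls reload_nginx_safely, and pass that script's path to --deploy-hook in place of the bare systemctl command.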
Step 3: Keep an Overlap Window
Issue the new certificate well before the old one expires. Let's Encrypt with Certbot defaults to renewing at 30 days before expiry — that's a 30-day window where both the old and new cert technically exist (the old in your previous backups and any caches that captured it, the new on the live server). If something goes wrong with the new cert, you have a long runway to fix it.
Don't wait until day 89 of a 90-day cert to renew. Some teams renew with 60 days still remaining, leaving themselves an enormous safety margin. The cost is more frequent issuance (roughly twelve certificates a year instead of six); the benefit is sleeping through any single failed renewal attempt.
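The window arithmetic is simple enough to sketch in a few lines. The helper names below are hypothetical, and cert_expiry_epoch assumes GNU date (for the -d option); the thresholds of 30 and 60 days are the ones discussed above.

```shell
# Sketch: decide whether a cert is inside the renewal window.
# All three helper names are hypothetical.

# Whole days between two epoch timestamps.
days_left() {               # $1 = expiry epoch, $2 = now epoch
    echo $(( ($1 - $2) / 86400 ))
}

# Succeeds when the remaining days are at or below the threshold.
needs_renewal() {           # $1 = expiry epoch, $2 = now epoch, $3 = threshold in days
    [ "$(days_left "$1" "$2")" -le "$3" ]
}

# Expiry of a PEM file as an epoch timestamp. Assumes GNU date (-d).
cert_expiry_epoch() {       # $1 = PEM cert file
    local end
    end=$(openssl x509 -in "$1" -noout -enddate) || return 1
    date -d "${end#notAfter=}" +%s
}

# Example:
#   expiry=$(cert_expiry_epoch /etc/letsencrypt/live/example.com/cert.pem)
#   needs_renewal "$expiry" "$(date +%s)" 30 && echo "inside the 30-day window"
```

With Certbot you would not normally run this yourself, since the client makes the same decision internally; a check like this is more useful in monitoring, to flag a cert that has drifted past the point where renewal should already have happened.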
Step 4: Verify After Every Renewal
An automated renewal is only useful if you know it actually worked. Three checks:
- Filesystem check: the renewed cert file has a recent modification time, and openssl x509 -in cert.pem -noout -dates shows a future notAfter date.
- Process check: the running web server has actually loaded the new cert. echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null | openssl x509 -noout -dates hits the live socket and shows the dates of the cert currently being served. If this still shows the old expiry, the reload didn't take.
- Browser-perspective check: run SSL Checker against the domain. It connects exactly the way a browser would and confirms grade, expiry, chain, and protocols. If your monitoring only checks the filesystem, this catches the case where the cert is on disk but not in use.
Bake all three into the renewal script. Fail loudly if any check disagrees. A renewal that "succeeded" but isn't being served is worse than one that failed visibly — you find out two months later when the cert finally does expire.
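One way to make the filesystem and process checks disagree loudly is to compare fingerprints rather than dates. The sketch below is a minimal version under stated assumptions: the three function names are hypothetical, example.com stands in for your domain, and the live lookup uses the same openssl s_client invocation shown in the list.

```shell
# Sketch: compare the cert on disk with the cert the live socket serves.
# Helper names are hypothetical; example.com stands in for your domain.

# SHA-256 fingerprint of a PEM certificate read from stdin.
fingerprint_of() {
    local out
    out=$(openssl x509 -noout -fingerprint -sha256 2>/dev/null) || return 1
    echo "${out#*=}"        # drop the "... Fingerprint=" label
}

# Fingerprint of the leaf cert served at host:port (port defaults to 443).
live_fingerprint() {        # $1 = host, $2 = optional port
    openssl s_client -connect "$1:${2:-443}" -servername "$1" </dev/null 2>/dev/null \
        | fingerprint_of
}

# Exit 0 when disk and live agree, 1 on mismatch, 2 if either lookup fails.
verify_served_cert() {      # $1 = cert file, $2 = host, $3 = optional port
    local disk live
    disk=$(fingerprint_of <"$1") || return 2
    live=$(live_fingerprint "$2" "$3") || return 2
    if [ "$disk" != "$live" ]; then
        echo "MISMATCH: $2 is not serving $1; the reload may not have taken" >&2
        return 1
    fi
    echo "OK: $2 is serving the cert at $1"
}

# Example:
#   verify_served_cert /etc/letsencrypt/live/example.com/cert.pem example.com
```

A fingerprint comparison catches the case the dates can miss: a server that is serving some valid certificate, just not the one the renewal wrote.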
Step 5: Plan a Rollback
You should have a way back if a renewal goes badly. Two simple approaches:
- Keep the previous cert and key. Don't overwrite the existing files; symlink them: cert.pem → cert-2026-04.pem. If the new cert misbehaves, repoint the symlink and reload. Certbot's directory layout already does this: /etc/letsencrypt/archive/ keeps every previous cert, and /etc/letsencrypt/live/ contains symlinks to the current ones.
- Snapshot before renewal. Some hosts (particularly cloud platforms) make it easy to snapshot a server before any change. For high-stakes renewals, take a snapshot, run the renewal, verify, and only delete the snapshot once everything's healthy.
The new cert is technically valid for the full 90 days from issuance — so even if you roll back to the old one, you can re-attempt the new install at any time before the old one expires.
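For a hand-managed layout, the symlink approach might look like the sketch below. It is not how Certbot manages its own live/ directory; point_cert_at is a hypothetical helper, and cert-2026-01.pem is an assumed name for the previous certificate, following the cert-YYYY-MM.pem pattern used above.

```shell
# Sketch: switch the live symlink between dated cert files, for install or rollback.
# point_cert_at is a hypothetical helper. Run it from the directory holding the files,
# since a relative target is resolved relative to the symlink's location.
point_cert_at() {           # $1 = dated cert file, $2 = symlink the server reads
    if [ ! -f "$1" ]; then
        echo "refusing to link to missing file: $1" >&2
        return 1
    fi
    # -n replaces the link itself instead of following it into its target.
    ln -sfn "$1" "$2"
}

# Roll forward, then back again if the new cert misbehaves:
#   point_cert_at cert-2026-04.pem cert.pem && systemctl reload nginx
#   point_cert_at cert-2026-01.pem cert.pem && systemctl reload nginx
```

If the renewal also generated a new private key, the key symlink has to be switched in the same step, otherwise the server will refuse to load a mismatched pair.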
Cluster and Load Balancer Considerations
Single-server setups are easy. Clusters need more care.
If you have multiple web servers behind a load balancer, the certificate needs to be on every backend (or terminated at the LB itself). Common patterns:
- Terminate TLS at the load balancer. Only the LB has the cert. Backend servers receive plaintext (over a private network) and only need to handle HTTP. Renewal happens once at the LB. This is the simplest model and the most common in modern cloud setups.
- Distribute via configuration management. Renew on one node, push the new cert to the rest via Ansible, Puppet, or a similar tool, then trigger a graceful reload on each. Works but has more moving parts.
- Shared filesystem. All nodes mount the same NFS or object storage volume containing the cert. One renewal updates everyone — but inotify-based reloads can be flaky on network filesystems.
- Built-in clustering. Caddy and Traefik in cluster mode coordinate renewal among themselves — only one node performs the ACME issuance, then the cert is shared via the cluster's storage backend.
Whichever pattern you use, after renewal verify that every backend is serving the new cert, not just the one that performed the renewal. Run SSL Checker a couple of times over a few minutes — if the load balancer round-robins between backends, repeated checks will hit different ones.
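If the backends terminate TLS themselves and are reachable directly, you can check each one rather than relying on the load balancer's rotation. The sketch below assumes exactly that; the function names and the backend addresses in the example are hypothetical.

```shell
# Sketch: confirm every backend serves the expected cert, not just one of them.
# Assumes each backend terminates TLS on port 443 and is directly reachable.

# Fingerprint of the cert a single backend serves for the given host name.
served_fingerprint() {      # $1 = backend address, $2 = SNI host name
    local out
    out=$(openssl s_client -connect "$1:443" -servername "$2" </dev/null 2>/dev/null \
          | openssl x509 -noout -fingerprint -sha256 2>/dev/null) || return 1
    echo "${out#*=}"
}

# Prints one line per backend; returns 1 if any backend is stale or unreachable.
check_backends() {          # $1 = expected fingerprint, $2 = host name, rest = backends
    local expected="$1" host="$2" stale=0 backend fp
    shift 2
    for backend in "$@"; do
        fp=$(served_fingerprint "$backend" "$host") || fp="unreachable"
        if [ "$fp" = "$expected" ]; then
            echo "ok    $backend"
        else
            echo "STALE $backend ($fp)"
            stale=1
        fi
    done
    return "$stale"
}

# Example (addresses are placeholders):
#   check_backends "$expected_fp" example.com 10.0.0.11 10.0.0.12 10.0.0.13
```

The expected fingerprint can be taken from the renewed file with openssl x509 -noout -fingerprint -sha256. When TLS is terminated at the load balancer, there is only one endpoint to check and this loop is unnecessary.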
OCSP Stapling and Renewal
If you have OCSP stapling enabled, your server periodically fetches a "status proof" from the CA showing the certificate hasn't been revoked, and includes it in the TLS handshake. Stapled responses are typically valid for a few days.
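After a renewal, the new certificate has a new serial number, so the stapled response cached for the old certificate no longer applies, and your server should fetch a fresh OCSP response promptly. nginx 1.3.7+ (the first release with stapling support) does this automatically once the cert is reloaded, although the first few handshakes after a reload may go out without a staple while the fetch completes; older setups may serve a stale or missing OCSP response for a while, which can cause SSL errors on clients that fail hard on OCSP. Modern nginx (1.18+) handles this well; if you're on something older, consider upgrading.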
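To see whether a staple is actually being sent after a reload, openssl s_client can request one with its -status option. The sketch below only classifies that output; staple_status is a hypothetical helper name, and the string it looks for is the status line openssl prints when a stapled response is present.

```shell
# Sketch: report whether a TLS handshake included a stapled OCSP response.
# staple_status is a hypothetical helper; it reads `openssl s_client -status`
# output on stdin.
staple_status() {
    if grep -q "OCSP Response Status: successful"; then
        echo "stapled"
    else
        echo "not stapled"
    fi
}

# Usage against a live host:
#   openssl s_client -connect example.com:443 -servername example.com -status \
#       </dev/null 2>/dev/null | staple_status
```

Running it a few times shortly after a reload shows how long the server takes to start stapling again.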
The 90-Day Question
Let's Encrypt certificates are valid for 90 days. Why so short? Because short lifetimes force automation, and automation eliminates the human error that causes the vast majority of certificate outages. Browsers have been quietly pushing the same direction for paid certs too — maximum lifetimes were cut from three years to two years to one year over the past decade, and there are active proposals to drop further.
The implication for your setup: if your renewal process requires a human to do anything, you'll eventually fail at it. The right answer isn't to renew less often (you can't, the CA won't issue longer certs anymore) — it's to remove humans from the routine path entirely, and have them step in only when monitoring tells them to.
The Practical Checklist
- Renewal is automated. (Certbot timer running, or equivalent.)
- Renewal runs at least 30 days before expiry, ideally 60.
- A successful renewal triggers a graceful reload of the web server.
- A post-renewal script verifies via the live socket and external tools that the new cert is being served.
- Failures alert someone. Email, Slack, monitoring tool — anything that gets a human's attention before day 89.
- The previous cert is kept on disk for at least one cert cycle as a rollback option.
- Every cert in your portfolio is monitored externally as well as locally — a periodic SSL Checker scan catches what your local monitoring misses.
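The "failures alert someone" item can be as small as a wrapper around each check. The sketch below is one possible shape; notify is a hypothetical hook that only writes to stderr here, to be replaced with whatever actually reaches a human (mail, a chat webhook, a monitoring API).

```shell
# Sketch: run one check and notify a human if it fails.
# notify is a hypothetical hook; swap its body for mail, a chat webhook, etc.
notify() {
    echo "ALERT: $*" >&2
}

run_check() {               # $1 = description, rest = command to run
    local desc="$1"
    shift
    if "$@"; then
        echo "pass: $desc"
    else
        # $? here is still the exit status of the failed command.
        notify "$desc failed (exit $?)"
        return 1
    fi
}

# Example: alert if the cert on disk has fewer than 30 days left.
#   run_check "cert valid for 30+ days" \
#       openssl x509 -in cert.pem -noout -checkend 2592000
```

Because run_check returns non-zero on failure, a renewal script can chain several checks and still exit with a status that cron or systemd will record.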
Most certificate outages aren't caused by the technology being hard. They're caused by a process gap — automation that wasn't set up, a renewal that wasn't verified, an alert that wasn't received. Fixing the process is cheaper than fixing the outage.