Website Monitoring Checklist

How to Use This Checklist

This is a comprehensive, actionable checklist for setting up monitoring that covers every dimension of your website's health. Work through it section by section. Each item is something you should configure, verify, or document.

Not every item applies to every site. A personal blog does not need vendor dependency monitoring. A SaaS product handling payments needs every item on this list. Use judgment, but err on the side of monitoring too much rather than too little. The cost of one missed outage exceeds the cost of an extra monitoring check by orders of magnitude.

The monitoring sections are organized from most critical (uptime) to supplementary (security headers and performance), followed by alert configuration and documentation. If you can only set up one section today, start with uptime. Then come back for the rest. For a broader perspective, see our Website Maintenance and Monitoring Guide.

Uptime Monitoring

Uptime is the foundation. If your site is not reachable, nothing else matters.

Monitor Your Primary Domain

Add your main website URL (e.g., https://www.example.com) as a monitored target. Use HTTPS, not HTTP, so the check also verifies SSL connectivity. Set the check interval to 30-60 seconds for production sites.

Monitor Application Endpoints

Add your login page, dashboard, or primary application URL. The homepage is often served from cache and stays up while the application server is down. Monitoring a page that the application server itself must render catches application-level failures that homepage monitoring misses.

Monitor API Endpoints

If you have a public or internal API, monitor its health endpoint (e.g., /api/health or /api/v1/status). Configure the check to validate the response body, not just the status code. A 200 response with an error message is still a failure.
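
As a minimal sketch of body validation, the check below passes only when the status code is 200 and the body reports a healthy state. The response shape {"status": "ok"} is an assumption for illustration; substitute whatever your health endpoint actually returns.

```python
import json
import urllib.request


def evaluate_health_response(status_code: int, body: str) -> bool:
    """Return True only if the status code is 200 AND the body reports healthy."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A 200 with a non-JSON body (for example an HTML error page) is a failure.
        return False
    # Assumed response shape: {"status": "ok"}. Adjust to your API's contract.
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(url: str, timeout: float = 10.0) -> bool:
    """Fetch a health endpoint and validate both the status code and the body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return evaluate_health_response(response.status, body)
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses (4xx/5xx).
        return False
```

Keeping the evaluation separate from the network call makes the pass/fail rule easy to test without a live endpoint.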

Monitor Critical User Flows

Identify the pages that generate revenue or are essential to your product: checkout pages, signup forms, payment confirmation pages. Add each as a separate monitored target with keyword validation to ensure the page content is correct.

Monitor Subdomains

Add monitoring for every active subdomain: blog.example.com, docs.example.com, app.example.com, api.example.com, staging.example.com. Each may run on different infrastructure and can fail independently.

Enable Multi-Location Checks

Configure uptime checks from at least three geographic locations. This distinguishes between global outages and regional network issues, and prevents false positives from transient routing problems at a single monitoring location.

Set Response Time Thresholds

Define acceptable response times for each target. Web pages should respond within 2-5 seconds. API endpoints should respond within 500ms-2 seconds. Alert when response times consistently exceed these thresholds, even if the site is technically up.

Configure Confirmation Checks

Require at least two consecutive failures from multiple locations before triggering an alert. This eliminates false positives from momentary network blips without significantly delaying real outage detection.
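
The confirmation rule can be sketched as a small decision function. This is one possible reading of the rule, assuming each location keeps its own list of recent results (newest last); the location names and default thresholds are illustrative.

```python
def consecutive_failures(results: list[bool]) -> int:
    """Count failed checks at the end of a result list (True = check passed)."""
    count = 0
    for passed in reversed(results):
        if passed:
            break
        count += 1
    return count


def should_alert(results_by_location: dict[str, list[bool]],
                 min_consecutive: int = 2,
                 min_locations: int = 2) -> bool:
    """Alert only when enough locations each see enough consecutive failures."""
    failing = [
        location
        for location, results in results_by_location.items()
        if consecutive_failures(results) >= min_consecutive
    ]
    return len(failing) >= min_locations
```

With a 30-second interval and two required failures, a real outage is still confirmed in about a minute, while a single failed probe from one region is ignored.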

Start with your revenue-critical pages. A monitored checkout page is more valuable than a monitored blog post. You can always expand coverage later, but critical paths should be monitored from day one.

SSL Certificate Monitoring

An expired SSL certificate is functionally equivalent to a complete site outage. Browsers replace the page with a full-screen security warning, and most API clients refuse the connection outright.

Monitor Certificate Expiry for All Domains

Add every domain and subdomain to SSL monitoring. Each may have a different certificate with a different expiration date. A wildcard certificate for *.example.com covers one level of subdomains but not the apex domain (unless the apex is listed as an additional name), so monitor both.

Set Multi-Stage Expiry Alerts

Configure alerts at 30 days, 14 days, and 7 days before certificate expiration. The 30-day alert gives you time to investigate calmly. The 7-day alert is your emergency backstop.
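
A rough sketch of the staged thresholds, using only Python's standard library. The stage values come from the text above; the function names are illustrative, and fetch_not_after assumes the host serves a currently valid certificate (an already expired one fails the TLS handshake and raises instead).

```python
import socket
import ssl

ALERT_STAGES_DAYS = (30, 14, 7)


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the certificate's notAfter string, e.g. 'Jan  5 09:34:43 2018 GMT'."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until(not_after: str, now: float) -> int:
    """Whole days from `now` (epoch seconds) until the notAfter timestamp."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    return int((expires_at - now) // 86400)


def alert_stage(days_left: int, stages: tuple[int, ...] = ALERT_STAGES_DAYS):
    """Return the tightest threshold that has been crossed, or None if none has."""
    crossed = [stage for stage in stages if days_left <= stage]
    return min(crossed) if crossed else None
```

For example, alert_stage(days_until(fetch_not_after("www.example.com"), time.time())) would yield 30, 14, 7, or None for a live host, given an import of the time module.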

Verify Certificate Chain Validity

Ensure your monitoring tool checks the full certificate chain, not just the leaf certificate. A missing intermediate certificate can cause failures on some mobile browsers and API clients while the site appears fine in desktop browsers that cache or fetch the missing intermediate.

Document Your Renewal Process

For each certificate, record how it is provisioned and renewed: Let's Encrypt via Certbot, paid certificate via registrar, CDN-managed certificate, or cloud provider auto-provisioned. When an alert fires, the responder needs to know where to look.

Verify After Every Server Change

After any server migration, web server configuration change, or certificate renewal, verify that the correct certificate is being served with a complete chain. Add this to your deployment checklist.

One Dashboard for Your Entire Checklist

Site Watcher monitors uptime, SSL, domain expiry, DNS, and vendor dependencies from one place. $39/mo unlimited. Free for up to 3 targets.

Domain Expiry Monitoring

Losing your domain is catastrophic and often irreversible. Auto-renewal is a convenience, not a guarantee.

List All Owned Domains

Audit every domain your organization owns. Include primary domains, regional variants (.co.uk, .de), defensive registrations (common misspellings), legacy domains, and domains used for internal services or email.

Enable Expiry Monitoring for Each Domain

Add every domain to expiry monitoring. Configure alerts at 90, 60, 30, 14, and 7 days before expiration. The 90-day alert gives you lead time to resolve billing or account access problems before the domain lapses.

Verify Auto-Renew Status

For each domain, log into the registrar and confirm auto-renewal is enabled, the payment method is current, and the contact email is monitored by an active team member. Document this verification.

Consolidate Registrar Access

Ensure at least two team members have access to every registrar account. Store account credentials in your team's password manager. A domain managed by a single person's personal account is a ticking time bomb.

Enable Registrar Lock

For every domain, enable the registrar lock (clientTransferProhibited). This prevents unauthorized domain transfers. Include lock status in your domain monitoring so you are alerted if the lock is ever removed.
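
If your monitoring exposes a domain's status values, the lock check can be as small as the sketch below. It assumes the input is a list of bare status strings and accepts both the EPP spelling (clientTransferProhibited) and the spaced form commonly seen in RDAP responses ("client transfer prohibited"); how you obtain the list is left to your tooling.

```python
def normalize_status(status: str) -> str:
    """Reduce 'clientTransferProhibited' and 'client transfer prohibited' to one form."""
    return "".join(ch for ch in status.lower() if ch.isalnum())


def is_transfer_locked(statuses: list[str]) -> bool:
    """True if the registrar lock appears among the domain's status values."""
    normalized = {normalize_status(status) for status in statuses}
    return "clienttransferprohibited" in normalized
```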

DNS Record Monitoring

DNS is the invisible layer that everything depends on. Unauthorized or accidental changes can redirect traffic, break email, and enable phishing.

Baseline All DNS Records

Export a complete snapshot of your DNS zone for each domain. Review every record to confirm it is correct and intentional. This snapshot becomes the reference point that monitoring compares against.
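
Comparing a live lookup against the baseline can be sketched as a set difference per record. The data layout here, (name, record type) mapped to a set of values, is an assumption for illustration, and the addresses in the usage note come from the reserved documentation ranges.

```python
# (record name, record type) -> set of record values
RecordSet = dict[tuple[str, str], set[str]]


def diff_records(baseline: RecordSet, current: RecordSet) -> list[str]:
    """List every value added to or removed from the baseline, in a stable order."""
    changes = []
    for key in sorted(baseline.keys() | current.keys()):
        name, record_type = key
        before = baseline.get(key, set())
        after = current.get(key, set())
        for value in sorted(after - before):
            changes.append(f"ADDED {name} {record_type} {value}")
        for value in sorted(before - after):
            changes.append(f"REMOVED {name} {record_type} {value}")
    return changes
```

If the baseline holds A 192.0.2.10 for example.com and the current lookup returns 203.0.113.5, the function reports one ADDED and one REMOVED line. An empty list means the zone still matches the snapshot.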

Monitor A and AAAA Records

Track the IP addresses your domains resolve to. Any change you did not plan means your traffic may be going to a different server. This is the most direct indicator of DNS hijacking.

Monitor CNAME Records

Track all CNAME aliases, especially for subdomains pointing to third-party services (CDNs, hosting platforms, SaaS tools). Watch for dangling CNAMEs that could enable subdomain takeover attacks.

Monitor MX Records

Track your email routing records. Unauthorized MX changes redirect your email to an attacker's server. This is one of the highest-impact DNS attacks because it is silent and provides access to sensitive communications.

Monitor TXT Records

Track SPF, DKIM, and DMARC records. These control email authentication and domain verification. A weakened DMARC policy or altered SPF record enables phishing attacks using your domain.
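
One way to detect a weakened DMARC policy is to compare the p= tag of the current record against the baseline. The sketch below handles only the p= tag and ranks the three standard policies (none, quarantine, reject); it ignores subdomain policy (sp=) and percentage (pct=) tags, which a fuller check would also compare.

```python
from typing import Optional

DMARC_STRENGTH = {"none": 0, "quarantine": 1, "reject": 2}


def dmarc_policy(record: str) -> Optional[str]:
    """Extract the p= policy from a DMARC TXT record, or None if absent or invalid."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip().lower()
    if tags.get("v") != "dmarc1":
        return None
    policy = tags.get("p")
    return policy if policy in DMARC_STRENGTH else None


def is_weakened(baseline_record: str, current_record: str) -> bool:
    """True if the current policy is weaker than the baseline, or has disappeared."""
    before = dmarc_policy(baseline_record)
    after = dmarc_policy(current_record)
    if before is None:
        return False  # no valid baseline policy, so nothing to weaken
    if after is None:
        return True  # record removed or no longer valid
    return DMARC_STRENGTH[after] < DMARC_STRENGTH[before]
```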

Monitor NS Records

Track your nameserver delegation. An NS record change hands control of your entire DNS to different nameservers. This should be exceedingly rare. Any unexpected NS change is a critical security event.

Remove Stale Records

Identify and remove DNS records that point to decommissioned services, old IP addresses, or unused subdomains. Stale records are attack vectors. If a CNAME points to a service you no longer use, remove it.

Vendor and Dependency Monitoring

Your site depends on services you do not control. When they go down, your functionality breaks even if your infrastructure is perfect.

Identify All Third-Party Dependencies

List every external service your site depends on: CDN (Cloudflare, Fastly, AWS CloudFront), payment processor (Stripe, PayPal), email service (SendGrid, Mailgun, Postmark), analytics (Google Analytics, Mixpanel), authentication (Auth0, Okta), database (PlanetScale, Supabase, MongoDB Atlas), and hosting (Vercel, Netlify, AWS, GCP).

Monitor Status Pages

Add each vendor's status page to your monitoring. Most major services publish status on a dedicated page, often at an address like status.vendor.com. When they report an incident, you need to know immediately to assess impact on your users.

Monitor Vendor API Endpoints

For critical integrations, monitor the vendor's API health endpoint directly. Status pages are sometimes updated slowly. A direct health check detects issues faster than waiting for the vendor to acknowledge them.

Document Impact Mapping

For each vendor, document what user-facing functionality breaks when they go down. Stripe outage = checkout broken. SendGrid outage = transactional emails delayed. Cloudflare outage = entire site unreachable. This mapping speeds up incident response.

Establish Fallback Plans

For the most critical dependencies, document fallback procedures. Can you bypass the CDN and serve directly? Can you switch to a backup payment processor? Can you queue emails for retry? Having a plan before a vendor outage saves critical minutes during an incident.

You are only as reliable as your least reliable dependency. If your site has 99.99% uptime but your payment processor has 99.9%, your effective checkout availability is 99.9% at best.
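
The arithmetic behind "at best" is worth a quick look. If a checkout request needs both your site and the payment processor, and their failures are independent (an assumption), the availabilities multiply, so the result is slightly worse than the weaker of the two.

```python
import math


def serial_availability(availabilities: list[float]) -> float:
    """Availability of a flow that needs every listed component, assuming independent failures."""
    return math.prod(availabilities)


def downtime_hours_per_year(availability: float) -> float:
    """Expected unavailable hours in a 365-day year."""
    return (1.0 - availability) * 365 * 24


checkout = serial_availability([0.9999, 0.999])  # site, payment processor
print(f"{checkout:.5%}")                          # a little under 99.9%
print(f"{downtime_hours_per_year(checkout):.1f} hours/year")
```

Here the combined figure is 0.9999 × 0.999 = 0.9989001, roughly 9.6 hours of checkout downtime a year compared with under an hour for the site alone. Every additional hard dependency lowers the product further.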

Redirect Health

Misconfigured redirects silently break user experience and SEO without triggering traditional uptime alerts.

Verify HTTP to HTTPS Redirect

Confirm that http://yourdomain.com redirects to https://yourdomain.com with a 301 (permanent) redirect. Test both the apex domain and www subdomain. A missing redirect means some users access your site without encryption.

Verify WWW Canonicalization

Confirm that your chosen canonical URL (www or non-www) is enforced via 301 redirect. Both versions should work, with one redirecting to the other. Inconsistent behavior splits your SEO equity.

Check for Redirect Chains

Identify URLs that redirect through multiple hops before reaching the final destination. Each hop adds latency and increases the chance of a broken link. Chains of three or more redirects should be consolidated to a single redirect.
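
Counting hops can be sketched independently of any HTTP library. In the example below, fetch is a hypothetical callable you supply: it takes a URL and returns the status code plus the absolute Location target (or None), with automatic redirect following disabled in whatever client sits behind it.

```python
from typing import Callable, Optional

Fetch = Callable[[str], tuple[int, Optional[str]]]
REDIRECT_CODES = {301, 302, 303, 307, 308}


def trace_redirects(url: str, fetch: Fetch, max_hops: int = 10) -> list[tuple[str, int]]:
    """Return [(url, status), ...] for every response on the way to the final page."""
    chain: list[tuple[str, int]] = []
    seen: set[str] = set()
    while len(chain) <= max_hops:
        if url in seen:
            raise ValueError(f"redirect loop at {url}")
        seen.add(url)
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            return chain  # reached a non-redirect response
        url = location
    raise ValueError(f"more than {max_hops} redirects")


def hop_count(chain: list[tuple[str, int]]) -> int:
    """Number of redirects followed before the final response."""
    return len(chain) - 1
```

The same trace also covers the HTTP to HTTPS and canonicalization checks: the first entry should carry status 301, and the last URL in the chain should be your canonical HTTPS address.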

Monitor Key Landing Page Redirects

If you have redirects from old URLs to new ones (after a site restructure or rebrand), monitor that these redirects continue to work. A deployment that accidentally removes redirect rules can break hundreds of inbound links overnight.

Security Headers

Security headers protect your users from cross-site scripting, clickjacking, and other client-side attacks. They are configured once but should be monitored continuously.

Verify Content-Security-Policy

Check that your CSP header is present and correctly configured. A missing or overly permissive CSP leaves your users vulnerable to XSS attacks. A CSP that is too restrictive breaks your own site's functionality.

Verify Strict-Transport-Security

Check that your HSTS header is present with an appropriate max-age (at least 31536000 seconds / one year). Include the includeSubDomains directive if all subdomains use HTTPS. This prevents SSL stripping attacks.

Verify X-Frame-Options or CSP frame-ancestors

Ensure your site cannot be embedded in iframes on other domains unless intentionally allowed. This prevents clickjacking attacks, where an attacker loads your site in a hidden or transparent frame on their own page and tricks users into clicking on it.

Verify X-Content-Type-Options

Check that this header is set to "nosniff" to prevent browsers from MIME-type sniffing responses, which can be exploited to execute malicious content.

Monitor for Header Regressions

After deployments, verify that security headers have not been accidentally removed or weakened. A common pattern is a new web server configuration or CDN rule that strips custom headers without the team noticing.
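
A post-deployment regression check can be sketched as a function over the response headers. It tests only for presence and the specific values named in this section; it does not judge whether a CSP is too permissive, which needs a policy-aware review.

```python
import re

ONE_YEAR_SECONDS = 31536000


def audit_security_headers(headers: dict[str, str]) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    h = {name.lower(): value for name, value in headers.items()}
    problems = []

    csp = h.get("content-security-policy", "")
    if not csp:
        problems.append("Content-Security-Policy missing")

    hsts = h.get("strict-transport-security", "")
    match = re.search(r'max-age="?(\d+)', hsts, re.IGNORECASE)
    if not match:
        problems.append("Strict-Transport-Security missing or has no max-age")
    elif int(match.group(1)) < ONE_YEAR_SECONDS:
        problems.append("Strict-Transport-Security max-age below one year")

    if "x-frame-options" not in h and "frame-ancestors" not in csp.lower():
        problems.append("no X-Frame-Options or CSP frame-ancestors")

    if h.get("x-content-type-options", "").strip().lower() != "nosniff":
        problems.append("X-Content-Type-Options is not nosniff")

    return problems
```

Running this against the headers of a fresh response after each deployment, and alerting on any non-empty result, catches a header silently dropped by a new server configuration or CDN rule.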

Performance Baseline

While not traditional monitoring, establishing performance baselines helps you detect degradation before it causes user impact.

Record Baseline Response Times

Document the typical response time for your key pages under normal load. When monitoring detects a sustained increase above this baseline, investigate before it becomes a user-facing issue.

Monitor Time to First Byte (TTFB)

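TTFB measures the time from sending a request to receiving the first byte of the response. It combines network latency with server processing time and is largely independent of page size or client-side rendering. Measured from a fixed monitoring location, a TTFB increase usually indicates server-side performance degradation: database slowdowns, inefficient queries, or resource contention.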

Set Response Time Alerts

Configure alerts when response times exceed 2x your baseline for more than 5 consecutive checks. This catches performance degradation that has not yet caused a full outage but is affecting user experience.
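
The rule above reads as: alert once the most recent run of slow checks is longer than five. A minimal sketch, assuming response times are kept in a list in milliseconds with the newest sample last:

```python
def sustained_degradation(samples_ms: list[float],
                          baseline_ms: float,
                          factor: float = 2.0,
                          consecutive: int = 5) -> bool:
    """True if more than `consecutive` of the latest checks in a row exceed factor x baseline."""
    threshold = baseline_ms * factor
    run = 0
    for sample in reversed(samples_ms):
        if sample <= threshold:
            break  # the run of slow checks ends at the first acceptable sample
        run += 1
    return run > consecutive
```

With a 400 ms baseline, six checks in a row above 800 ms trigger the alert, while five slow checks, or a slow streak that has already recovered, do not.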

Alert Configuration

Monitoring without proper alerting is just logging. Alerts need to reach the right people through the right channels at the right urgency.

Configure Multiple Alert Channels

Set up at least two channels: email plus a real-time channel (Slack, Microsoft Teams, or SMS). Email alone is not fast enough for production outages. Real-time channels alone risk being missed if the person is away from their desk.

Define Alert Routing

Route alerts to the people who can act on them. Uptime alerts go to the on-call engineer. SSL alerts go to the infrastructure team. Domain alerts go to the team that manages registrar accounts. Sending all alerts to everyone creates noise.

Set Up Escalation

If the primary responder does not acknowledge an alert within 15 minutes, escalate to a secondary contact. If no one responds within 30 minutes, escalate to management. Alerts that go unacknowledged are alerts that do not exist.
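
Routing and escalation are easier to review when written down as data. The sketch below encodes the examples from this section; the team names, alert type keys, and the fallback to the on-call engineer are placeholders to replace with your own.

```python
# Assumed routing table, following the examples in the text; names are placeholders.
ROUTES = {
    "uptime": "on-call engineer",
    "ssl": "infrastructure team",
    "domain": "registrar account owners",
}

# (minutes without acknowledgement, who is notified from that point on)
ESCALATION_POLICY = [
    (0, "primary"),
    (15, "secondary"),
    (30, "management"),
]


def route_alert(alert_type: str) -> str:
    """Pick the owning team for an alert type, falling back to the on-call engineer."""
    return ROUTES.get(alert_type, "on-call engineer")


def escalation_targets(minutes_unacknowledged: float) -> list[str]:
    """Everyone who should have been notified after this long without acknowledgement."""
    return [who for after, who in ESCALATION_POLICY if minutes_unacknowledged >= after]
```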

Tune for Signal, Not Noise

After your first week of monitoring, review all alerts. Eliminate false positives by adjusting thresholds and confirmation requirements. Every alert should require action. If you are dismissing alerts habitually, your thresholds are wrong.

Test Your Alert Pipeline

Deliberately trigger a test alert through each channel to verify it reaches the right people. An alert pipeline you have never tested is an alert pipeline you cannot trust during a real incident.

Documentation and Runbooks

Monitoring tells you something is wrong. Documentation tells you how to fix it.

Create an Incident Response Runbook

Document the step-by-step process for responding to each alert type: who to contact, where to look, what to check first, and how to escalate. A 3 AM outage is not the time to figure out who has server access.

Document All Monitored Targets

Maintain a list of everything you monitor, why it is monitored, and what team owns it. When a new team member joins, they should be able to understand your monitoring setup in 15 minutes.

Schedule Quarterly Reviews

Set a recurring calendar event to review your monitoring setup every quarter. Add new targets for new services. Remove monitoring for decommissioned services. Update alert routing for team changes. Monitoring that does not evolve with your infrastructure becomes stale.

Quick-Reference Summary

Category	What to Monitor	Check Frequency	Alert Urgency
Uptime	Primary domain, app endpoints, API health	Every 30-60 seconds	Critical: immediate
SSL	Certificate expiry, chain validity	Daily	High: 30/14/7 day warnings
Domain	Registration expiry, registrar lock	Daily	High: 90/60/30/14/7 day warnings
DNS	A, AAAA, CNAME, MX, TXT, NS records	Every 5-15 minutes	Critical: any change
Vendors	Status pages, API health endpoints	Every 1-5 minutes	High: when vendor reports incident
Redirects	HTTP→HTTPS, WWW canonical, key redirects	Every 5-15 minutes	Medium: on failure
Security Headers	CSP, HSTS, X-Frame-Options, X-Content-Type-Options	Daily	Medium: on removal or change
Performance	Response time, TTFB	Every 1-5 minutes	Medium: on sustained degradation

For related guidance, see what is website monitoring, uptime monitoring explained, SSL certificate monitoring guide, and DNS monitoring explained. Forrester Research reports that proactive monitoring reduces mean time to detection by up to 70%. For specialized monitoring needs, see uptime monitoring and vendor status monitoring.

A monitoring checklist is only useful if you work through it. Pick the sections relevant to your site, set up the checks, and do not stop until you can answer the question: "If anything breaks right now, will I know within two minutes?"
