The Impact of 404 Errors
404 errors hurt your website in multiple ways:
- Lost traffic: Visitors hitting 404s often leave immediately
- Wasted crawl budget: Googlebot wastes time on dead pages
- Lost link equity: Backlinks to 404 pages pass no value
- Poor user experience: Frustrates visitors and hurts trust
- Lower rankings: Too many 404s can signal poor site quality
🔑 Priority Rule: Fix 404s that receive the most traffic and have the most backlinks first. Not all 404s are equal.
Types of 404 Errors
Real 404s (Legitimate)
Content that was deleted intentionally:
- Discontinued products
- Expired events
- Removed blog posts
- Old campaigns
Soft 404s (Problematic)
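Pages that return 200 OK but show "not found" content. Because the status code says the page exists, crawlers keep fetching these URLs and may index them, which wastes crawl budget. Google Search Console reports them as "Soft 404".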
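One way to spot a soft 404 is to compare the status code with what the page body actually says. The following is only a sketch: `is_soft_404` is a hypothetical helper, and the phrases it greps for are assumptions that should be swapped for the wording of your own "not found" template.

```shell
#!/bin/sh
# is_soft_404 STATUS BODY_FILE
# Succeeds (exit 0) when the status is 200 but the body reads like an error page.
# The phrases below are assumptions; match them to your own error template.
is_soft_404() {
    status=$1
    body_file=$2
    # Anything other than 200 is not a *soft* 404 by definition
    [ "$status" = "200" ] || return 1
    grep -qiE 'page not found|404 error|no longer available' "$body_file"
}
```

To try it against a live URL, fetch the body and status with curl first: `status=$(curl -s -o body.html -w '%{http_code}' "https://yourdomain.com/some-url")`, then run `is_soft_404 "$status" body.html && echo "soft 404"`.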
Broken 404s (Most Damaging)
Content that should exist but doesn't:
- Typos in internal links
- Changed URLs without redirects
- Accidentally deleted pages
- Migration errors
Finding 404s in Server Logs
Basic 404 Discovery
# Find all 404 errors
awk '$9 == 404' access.log
# Count 404s by URL
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn | head -50
# 404s with referrer (find broken internal links)
awk '$9 == 404' access.log | awk -F'"' '{print $4, "->", $2}' | grep -v "^-"
Find 404s Hit by Googlebot
# Googlebot 404s (priority - affects SEO)
grep "Googlebot" access.log | awk '$9 == 404 {print $7}' | sort | uniq -c | sort -rn
# Googlebot 404s over time (trending)
grep "Googlebot" access.log | awk '$9 == 404 {print $4}' | cut -d: -f1 | sort | uniq -c
Find Referrers (Source of Broken Links)
# Where are 404 clicks coming from?
awk '$9 == 404' access.log | awk -F'"' '{print $4}' | sort | uniq -c | sort -rn | head -30
# Internal referrers only (swap in your domain; add -v to grep to list external ones instead)
awk '$9 == 404' access.log | awk -F'"' '{print $4}' | grep -F "yourdomain.com" | sort | uniq -c
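To see the internal/external split in a single pass, the referrer host can be compared against your own domain. This is a rough sketch that assumes the combined log format; `referrer_breakdown` is a hypothetical helper, and it treats subdomains other than `www` as external, which may not suit every site.

```shell
#!/bin/sh
# referrer_breakdown LOG_FILE DOMAIN
# Tallies 404 hits by referrer type: internal, external, or direct (no referrer).
# Assumes the combined log format, split on double quotes:
#   $2 = request line, $3 = " status bytes ", $4 = referrer, $6 = user agent
referrer_breakdown() {
    awk -F'"' -v domain="$2" '
        {
            split($3, resp, " ")            # resp[1] is the status code
            if (resp[1] != 404) next

            # Reduce the referrer URL to a bare host name
            host = $4
            sub("^[a-zA-Z]+://", "", host)  # drop the scheme
            sub("[/:?#].*$", "", host)      # drop port, path, and query
            sub(/^www\./, "", host)         # treat www as the same site

            if ($4 == "-" || $4 == "") kind = "direct"
            else if (host == domain)   kind = "internal"
            else                       kind = "external"
            count[kind]++
        }
        END {
            printf "internal %d\nexternal %d\ndirect %d\n",
                   count["internal"], count["external"], count["direct"]
        }
    ' "$1"
}
```

Run it as `referrer_breakdown access.log yourdomain.com`. A high internal count points at links you can fix yourself; a high external count points at URLs that probably need redirects.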
Prioritizing What to Fix
Priority Matrix
- High traffic + backlinks: Fix immediately with 301 redirect
- High traffic, no backlinks: Redirect or recreate content
- Low traffic + backlinks: Redirect to relevant page
- Low traffic, no backlinks: Leave as 404 or redirect to category
Calculating Impact
# Combine with traffic data
# Most hit 404s = highest priority
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn | head -20
# Check if important pages link to 404s
# (high referring page authority = high priority)
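The priority matrix can be roughly approximated from the log alone. The sketch below makes two loud assumptions: hits arriving from external referrers stand in for backlinks (a backlink tool will be more complete), and "high traffic" means at least `MIN_HITS` hits, with a default of 10 that is arbitrary. `prioritize_404s` and the action labels are hypothetical names, not output of any existing tool.

```shell
#!/bin/sh
# prioritize_404s LOG_FILE DOMAIN [MIN_HITS]
# Prints tab-separated: hits, googlebot_hits, external_referrals, url, suggested_action
# Assumes the combined log format; external referrals are only a proxy for backlinks.
prioritize_404s() {
    awk -F'"' -v domain="$2" -v min_hits="${3:-10}" '
        {
            split($3, resp, " ")            # $3 looks like " 404 512 "
            if (resp[1] != 404) next
            split($2, req, " ")             # $2 looks like "GET /path HTTP/1.1"
            url = req[2]

            hits[url]++
            if ($6 ~ /Googlebot/) bot[url]++

            # Reduce the referrer to a host and count it if it is not ours
            host = $4
            sub("^[a-zA-Z]+://", "", host)
            sub("[/:?#].*$", "", host)
            sub(/^www\./, "", host)
            if ($4 != "-" && $4 != "" && host != domain) ext[url]++
        }
        END {
            for (url in hits) {
                busy   = (hits[url] >= min_hits)
                linked = (ext[url] > 0)
                if (busy && linked) action = "fix-now-301"
                else if (busy)      action = "redirect-or-recreate"
                else if (linked)    action = "redirect-to-relevant-page"
                else                action = "leave-404-or-category"
                printf "%d\t%d\t%d\t%s\t%s\n", hits[url], bot[url], ext[url], url, action
            }
        }
    ' "$1" | sort -t "$(printf '\t')" -k1,1nr -k4,4
}
```

For example, `prioritize_404s access.log yourdomain.com 25 | head -20` lists the twenty most-hit 404 URLs with a suggested action for each, treating 25 or more hits as high traffic.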
💡 Pro Tip: Use LogBeast to automatically prioritize 404s by traffic volume, Googlebot hits, and referrer importance. Get a fix-first list without manual analysis.
How to Fix 404 Errors
1. Redirect (Most Common)
# .htaccess (Apache)
Redirect 301 /old-page /new-page
# Or with pattern matching
RedirectMatch 301 ^/products/old-(.*)$ /products/new-$1
# nginx
location = /old-page {
    return 301 /new-page;
}
# Or in server block
rewrite ^/old-page$ /new-page permanent;
2. Restore Content
If the page was deleted accidentally:
- Restore from backup
- Check Wayback Machine for content
- Recreate if valuable
3. Update Internal Links
# Find pages linking to a 404 URL (match the requested path, not any line that mentions it)
awk '$7 == "/old-broken-url"' access.log | awk -F'"' '{print $4}' | grep -v '^-$' | sort -u
# Then update those pages to use correct URLs
4. Custom 404 Page
For legitimate 404s, create a helpful error page:
- Clear message that page doesn't exist
- Search box
- Links to popular pages
- Contact option
Preventing Future 404s
Before Deleting Content
- Check for incoming traffic in analytics
- Check for backlinks (Ahrefs, Search Console)
- Set up redirect before deletion
- Update internal links
During Site Migrations
- Map all old URLs to new URLs
- Set up 301 redirects
- Test thoroughly before launch
- Monitor 404s after migration
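For a migration with many URLs, the redirect rules can be generated from the URL map rather than typed by hand. The sketch below assumes a plain-text map file with one `old-path new-path` pair per line; both that file format and the `generate_redirects` helper are illustrative assumptions. It emits Apache `Redirect 301` lines in the same form used earlier in this article.

```shell
#!/bin/sh
# generate_redirects MAP_FILE
# MAP_FILE holds one "old-path new-path" pair per line, separated by whitespace.
# Blank lines and lines starting with # are ignored.
generate_redirects() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }       # skip blanks and comments
        NF != 2 || $1 !~ /^\// {            # report malformed lines on stderr
            printf "line %d skipped: %s\n", NR, $0 > "/dev/stderr"
            next
        }
        $1 == $2 { next }                   # a self-redirect would loop
        seen[$1]++ { next }                 # keep the first mapping per old URL
        { printf "Redirect 301 %s %s\n", $1, $2 }
    ' "$1"
}
```

Something like `generate_redirects url-map.txt > redirects.conf` gives you a file to review before launch. Skipped lines are printed to stderr so that gaps in the map are not silently lost.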
Ongoing Monitoring
#!/bin/bash
# Weekly 404 report (assumes access.log is rotated weekly, so it covers one week of requests)
echo "=== 404 Report ==="
echo "Total 404s this week:"
awk '$9 == 404' access.log | wc -l
echo ""
echo "Top 404 URLs:"
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn | head -20
echo ""
echo "Googlebot 404s:"
grep "Googlebot" access.log | awk '$9 == 404 {print $7}' | sort | uniq -c | sort -rn | head -10
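A report is only useful if someone reads it, so a threshold check that can fail a cron job may help. This is a minimal sketch: `check_404_rate` is a hypothetical helper, and the 5% default is an arbitrary starting point rather than a recommended value.

```shell
#!/bin/sh
# check_404_rate LOG_FILE [MAX_PERCENT]
# Prints the share of requests that returned 404 and exits non-zero
# when that share exceeds MAX_PERCENT (default 5, an arbitrary assumption).
check_404_rate() {
    awk -v max="${2:-5}" '
        $9 == 404 { missing++ }
        END {
            if (NR == 0) { print "no requests logged"; exit 0 }
            rate = 100 * missing / NR
            printf "404 rate: %.1f%% (%d of %d requests)\n", rate, missing, NR
            exit (rate > max ? 1 : 0)
        }
    ' "$1"
}
```

In a cron script, `check_404_rate access.log 5 || echo "404 rate above threshold"` could feed whatever alerting you already use.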
🎯 Recommendation: Set up weekly 404 monitoring. Catching broken links early prevents traffic loss and maintains link equity. LogBeast includes automated 404 alerts and prioritized fix lists.