Complete Guide to Server Log Analysis

Master server log analysis from beginner to advanced. Learn the Apache and Nginx log formats and the essential commands, then use them to extract actionable insights.

Introduction to Server Logs

Server logs are the black box of your website. Every request, every error, and every visitor is recorded. Learning to read logs is like learning to read your website's diary.

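There are two main types of logs you'll work with: access logs, which record every request the server handles, and error logs, which record server-side problems and diagnostics.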

📍 Common Log Locations:
Apache: /var/log/apache2/access.log (Debian/Ubuntu) or /var/log/httpd/access_log (RHEL/CentOS/Fedora)
Nginx: /var/log/nginx/access.log

Log Formats Explained

Apache Combined Log Format

This is the most common format. It extends the Common Log Format with the referrer and user-agent:

192.168.1.1 - - [10/Jan/2025:13:55:36 +0000] "GET /page.html HTTP/1.1" 200 2326 "https://google.com" "Mozilla/5.0..."
Field      | Example                      | Meaning
IP Address | 192.168.1.1                  | Client's IP
Identity   | -                            | RFC 1413 identity (usually -)
User       | -                            | HTTP auth user (usually -)
Timestamp  | [10/Jan/2025:13:55:36 +0000] | Request time
Request    | "GET /page.html HTTP/1.1"    | Method, URL, protocol
Status     | 200                          | HTTP response code
Size       | 2326                         | Response size in bytes
Referrer   | "https://google.com"         | Where they came from
User-Agent | "Mozilla/5.0..."             | Browser/bot identifier
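
The awk one-liners later in this guide refer to these fields by position, so it helps to see how whitespace splitting maps onto a Combined line. Here is a minimal sketch using the sample line above; the function name show_fields is made up for illustration.

```shell
# Print the whitespace-separated awk fields that matter in a Combined log line.
# show_fields is a hypothetical helper name, not part of any tool.
show_fields() {
  awk '{
    printf "ip=%s\n", $1
    printf "time=%s\n", substr($4, 2)    # drop the leading bracket
    printf "method=%s\n", substr($6, 2)  # drop the leading double quote
    printf "url=%s\n", $7
    printf "status=%s\n", $9
    printf "bytes=%s\n", $10
  }'
}

echo '192.168.1.1 - - [10/Jan/2025:13:55:36 +0000] "GET /page.html HTTP/1.1" 200 2326 "https://google.com" "Mozilla/5.0..."' | show_fields
```

The referrer and user-agent can contain spaces, so whitespace splitting is unreliable for them. Splitting on double quotes with awk -F'"' puts the request in $2, the referrer in $4, and the user-agent in $6.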

Nginx Default Format

$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"

This layout is nearly identical to Apache Combined, so the same analysis techniques work on both.

Apache Log Analysis

Configuring Apache Logs

# In apache2.conf or httpd.conf
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Enable in virtual host
CustomLog ${APACHE_LOG_DIR}/access.log combined

Adding Response Time

# Add %D for microseconds or %T for seconds
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_with_time
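
Once the response time is logged, it can be summarized with awk. The sketch below assumes the combined_with_time format above, where %D (microseconds) is the last field on each line; slow_report is an illustrative name, not an existing tool.

```shell
# Summarize response times from a log whose LAST field is %D (microseconds).
# Assumes the combined_with_time format; slow_report is an illustrative name.
slow_report() {
  awk '
    $NF ~ /^[0-9]+$/ {          # skip lines without a numeric last field
      t = $NF + 0
      n++; total += t
      if (t > max) { max = t; slow = $7 }
    }
    END {
      if (n == 0) { print "no timed requests"; exit }
      printf "requests=%d avg_ms=%.1f max_ms=%.1f slowest=%s\n", n, total / n / 1000, max / 1000, slow
    }' "$1"
}

# Usage: slow_report /var/log/apache2/access.log
```

If the format logs %T instead, the last field is in seconds, so the division by 1000 would need to become a multiplication.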

Nginx Log Analysis

Configuring Nginx Logs

# In nginx.conf
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" $request_time';

access_log /var/log/nginx/access.log main;

JSON Format (easier to parse)

log_format json_combined escape=json '{"time":"$time_iso8601",'
    '"ip":"$remote_addr",'
    '"method":"$request_method",'
    '"uri":"$request_uri",'
    '"status":$status,'
    '"size":$body_bytes_sent,'
    '"referer":"$http_referer",'
    '"ua":"$http_user_agent",'
    '"rt":$request_time}';
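
With one JSON object per line, fields can be selected by name instead of by position. The sketch below assumes the json_combined format above and that jq is installed; the function names are illustrative.

```shell
# Requires jq. Works on a log with one JSON object per line (json_combined above).
# Function names are illustrative.

# Count requests per status code, most frequent first.
json_status_counts() {
  jq -r '.status' "$1" | sort | uniq -c | sort -rn
}

# List URIs that returned a 5xx status, most frequent first.
json_server_errors() {
  jq -r 'select(.status >= 500) | .uri' "$1" | sort | uniq -c | sort -rn
}

# Usage: json_server_errors /var/log/nginx/access.log
```

Selecting by key name means the commands keep working if fields are later added to or reordered in the log_format definition.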

Essential Commands

Basic Analysis

# Count total requests
wc -l access.log

# View last 100 entries
tail -n 100 access.log

# Follow log in real-time
tail -f access.log

# Search for specific pattern
grep "Googlebot" access.log

# Case-insensitive search
grep -i "error" error.log

Traffic Analysis

# Top 20 IPs
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -20

# Top 20 requested URLs
awk '{print $7}' access.log | sort | uniq -c | sort -rn | head -20

# Requests per hour of day (all dates combined)
awk '{print $4}' access.log | cut -d: -f2 | sort | uniq -c

# Requests per day (strip the leading "[" from the timestamp)
awk '{print $4}' access.log | cut -d: -f1 | tr -d '[' | sort | uniq -c
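
The per-hour command above groups by hour of day, so 13:00 on different dates is counted together. When a log spans several days, a variant that keeps the date in the key may be more useful. This is a sketch; hourly_counts is an illustrative name.

```shell
# Count requests per calendar hour, keeping the date so days are not merged.
# hourly_counts is an illustrative name.
hourly_counts() {
  awk '{
    # $4 looks like [10/Jan/2025:13:55:36 ; characters 2-15 are 10/Jan/2025:13
    key = substr($4, 2, 14)
    if (!(key in count)) order[++n] = key   # remember first-seen order
    count[key]++
  }
  END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }' "$1"
}

# Usage: hourly_counts access.log
```

Rows are printed in the order each hour first appears in the file, which is chronological for a normally written log and avoids sorting month abbreviations alphabetically.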

Response Code Analysis

# Count by status code
awk '{print $9}' access.log | sort | uniq -c | sort -rn

# Find all 404 errors
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn

# Find all 5xx server errors (500, 502, 503, ...)
awk '$9 >= 500 {print $7, $9}' access.log | sort | uniq -c | sort -rn
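
Raw counts are easier to judge as a share of total traffic. The following sketch groups status codes into classes (2xx, 3xx, 4xx, 5xx) and prints each class with its percentage; status_classes is an illustrative name, and lines whose ninth field is not a three-digit status are ignored.

```shell
# Print each status class with its request count and share of all requests.
# status_classes is an illustrative name.
status_classes() {
  awk '
    $9 ~ /^[1-5][0-9][0-9]$/ { count[substr($9, 1, 1) "xx"]++; n++ }
    END {
      for (k in count) printf "%s %d %.1f%%\n", k, count[k], 100 * count[k] / n
    }' "$1" | sort
}

# Usage: status_classes access.log
```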

Bot Analysis

# Find all bots (match "bot" in the User-Agent only, not in URLs such as /robots.txt)
awk -F'"' 'tolower($6) ~ /bot/ {print $6}' access.log | sort | uniq -c | sort -rn

# Googlebot requests
grep -c "Googlebot" access.log

# All unique User-Agents
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -rn | head -30
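
Counting a crawler's requests says little about what it actually fetched. A possible next step is to list the URLs requested by one user-agent, sketched below; bot_urls is an illustrative name, and the match is a plain substring test against the user-agent field.

```shell
# List the URLs most requested by clients whose User-Agent contains a given string.
# bot_urls is an illustrative name. $1 = user-agent substring, $2 = log file.
bot_urls() {
  awk -F'"' -v bot="$1" '
    index($6, bot) {          # $6 is the user-agent when splitting on double quotes
      split($2, req, " ")     # $2 is the request: method, URL, protocol
      print req[2]
    }' "$2" | sort | uniq -c | sort -rn | head -20
}

# Usage: bot_urls "Googlebot" access.log
```

User-agent strings are supplied by the client and can be faked, so these counts describe requests that claim to be the crawler.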

Extracting Insights

Bandwidth Usage

# Total bandwidth (MB)
awk '{sum += $10} END {print sum/1024/1024 " MB"}' access.log

# Bandwidth by URL
awk '{url[$7] += $10} END {for (u in url) print url[u]/1024/1024 " MB", u}' access.log | sort -rn | head -20

Peak Traffic Times

# Busiest hours
awk '{print $4}' access.log | cut -d: -f2 | sort | uniq -c | sort -rn

# Busiest days
awk '{print $4}' access.log | cut -d'[' -f2 | cut -d: -f1 | sort | uniq -c | sort -rn

Automating Analysis

Daily Report Script

#!/bin/bash
# Summarizes everything currently in $LOG, so point it at a log that rotates daily.
LOG="/var/log/nginx/access.log"
DATE=$(date +%Y-%m-%d)

echo "=== Daily Report for $DATE ==="
echo ""
echo "Total Requests: $(wc -l < "$LOG")"
echo ""
echo "Top 10 IPs:"
awk '{print $1}' "$LOG" | sort | uniq -c | sort -rn | head -10
echo ""
echo "Response Codes:"
awk '{print $9}' "$LOG" | sort | uniq -c | sort -rn
echo ""
echo "Top 10 URLs:"
awk '{print $7}' "$LOG" | sort | uniq -c | sort -rn | head -10
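
The script above reports on the whole file. If the log holds more than one day, the lines for a single date can be selected first and piped into the same commands. This is a sketch; day_lines is an illustrative name, and the date argument uses the log's own DD/Mon/YYYY form.

```shell
# Print only the log lines for one calendar day.
# day_lines is an illustrative name.
# Usage: day_lines LOGFILE [DD/Mon/YYYY]   (defaults to today)
day_lines() {
  # LC_ALL=C keeps the month abbreviation in English, as it appears in the log.
  # The timestamp field starts with "[" followed by the date and a colon.
  awk -v d="[${2:-$(LC_ALL=C date +%d/%b/%Y)}:" 'index($4, d) == 1' "$1"
}

# Example: response codes for 10 January 2025 only
# day_lines /var/log/nginx/access.log 10/Jan/2025 | awk '{print $9}' | sort | uniq -c | sort -rn
```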

🎯 Want automated analysis? LogBeast generates comprehensive reports with one click: traffic analysis, bot detection, SEO insights, security alerts, and more. No command line needed.
