Configuring Logrotate for High Traffic Sites: Optimization & Troubleshooting

High-traffic environments generate massive access and error logs that can rapidly exhaust disk I/O and storage, directly disrupting Server Log Fundamentals & Compliance pipelines. Improper rotation causes dropped entries, delayed log shipping, and inaccurate crawl budget calculations. This guide provides precise configuration parameters, script optimization techniques, and validation steps to maintain continuous log capture without service interruption.

Disk I/O Saturation During Peak Hours

Symptom: Server response latency spikes and 5xx errors coincide precisely with cron execution times.

Root Cause: Synchronous compression and file movement saturate disk I/O while the web server continues writing to the active log descriptor, starving request handling of bandwidth.

Exact Fix: Implement delaycompress, missingok, and notifempty alongside create with explicit postrotate signals. Schedule rotation during off-peak windows using systemd timers instead of legacy cron. Reference Log Rotation Strategies for advanced scheduling patterns that decouple archival from active I/O.

Validation: Monitor iostat -x 1 during rotation windows; confirm iowait remains below 10% and no timestamp gaps appear in access logs.
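Where iostat is unavailable, the same check can be approximated directly from /proc/stat. The sketch below (assuming a Linux host; the 10% threshold comes from the guidance above) samples the kernel's iowait counter twice and reports the delta as a percentage of total CPU time:

```shell
#!/bin/sh
# Portable stand-in for watching `iostat -x 1` during a rotation window:
# sample the iowait counter from /proc/stat twice and report the delta as a
# percentage of total CPU time. Fields: cpu user nice system idle iowait ...
read_cpu() { awk '/^cpu /{print $2+$3+$4+$5+$6+$7+$8, $6}' /proc/stat; }

if [ -r /proc/stat ]; then
    set -- $(read_cpu); t1=$1; w1=$2
    sleep 1
    set -- $(read_cpu); t2=$1; w2=$2
    dt=$((t2 - t1)); [ "$dt" -gt 0 ] || dt=1   # guard against an idle container
    pct=$(( (w2 - w1) * 100 / dt ))
else
    pct=0   # non-Linux fallback for this sketch
fi
echo "iowait during window: ${pct}%"
```

Run it in a loop during the rotation window; sustained readings above 10% indicate the rotation is saturating the disk.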

Log Gaps & Missing Crawl Bot Entries

Symptom: Googlebot and Bingbot requests disappear from daily analysis reports immediately after rotation cycles.

Root Cause: File descriptor mismatch occurs when the web server continues writing to the renamed .1 file instead of the newly created log.

Exact Fix: Replace copytruncate with create 0640 www-data adm and enforce a mandatory postrotate block executing systemctl reload nginx or apache2ctl graceful. Ensure sharedscripts wraps the reload command to prevent duplicate signals across multiple log files.

Validation: Grep recent timestamps in the new log file immediately post-rotation; verify continuous bot user-agent strings without chronological gaps or duplicate entries.
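This validation can be scripted. The sketch below uses a synthetic sample log (the timestamps, paths, and the 03:00:00 rotation time are illustrative; a real check would point at the live access log) and counts bot entries on either side of the rotation boundary. Zero bot hits after rotation is the classic stale-descriptor signature:

```shell
#!/bin/sh
# Synthetic access log standing in for /var/log/nginx/access.log.
log=$(mktemp)
cat > "$log" <<'EOF'
127.0.0.1 - - [10/May/2025:02:59:58 +0000] "GET / HTTP/1.1" 200 612 "-" "Googlebot/2.1"
127.0.0.1 - - [10/May/2025:03:00:01 +0000] "GET /a HTTP/1.1" 200 100 "-" "Mozilla/5.0"
127.0.0.1 - - [10/May/2025:03:00:02 +0000] "GET /b HTTP/1.1" 200 100 "-" "Googlebot/2.1"
EOF

# Split on [ and ] so $2 is the timestamp; the string comparison is valid
# here because all entries share the same date prefix.
after=$(awk -F'[][]' '$2 >= "10/May/2025:03:00:00"' "$log" | grep -c Googlebot)
echo "bot entries after rotation: $after"
rm -f "$log"
```

A count of zero immediately after the rotation timestamp means the server is still writing to the renamed file and the postrotate signal did not take effect.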

Compression Overhead & CPU Throttling

Symptom: High CPU utilization and thermal throttling during log archival, impacting real-time request processing.

Root Cause: Default gzip compression runs synchronously on large multi-gigabyte files, consuming excessive CPU cycles and blocking worker threads.

Exact Fix: Switch to compresscmd /usr/bin/zstd with compressext .zst and compressoptions -3 --long. Set maxsize 500M to trigger rotation before files become unmanageable, and enable delaycompress to defer compression to the next cycle.

Validation: Execute top -p $(pgrep logrotate) during execution; confirm CPU usage remains below 15% and archive completion time drops by >60%.
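To compare archive sizes before committing to zstd, a self-contained benchmark can be run against representative data. The sketch below generates ~1 MB of repetitive access-log-like lines (on a real host you would point it at a rotated .1 file instead) and compresses with gzip and, if installed, zstd -3:

```shell
#!/bin/sh
# Rough archive-size comparison on synthetic, highly repetitive log data.
sample=$(mktemp)
yes '192.0.2.1 - - "GET /index.html HTTP/1.1" 200 612' | head -n 30000 > "$sample"

raw=$(wc -c < "$sample")
gz=$(gzip -c "$sample" | wc -c)
echo "raw: $raw bytes, gzip: $gz bytes"

# zstd is optional here; skip the comparison if it is not installed.
if command -v zstd >/dev/null 2>&1; then
    zs=$(zstd -3 -c "$sample" 2>/dev/null | wc -c)
    echo "zstd -3: $zs bytes"
fi
rm -f "$sample"
```

Real access logs compress less dramatically than this synthetic sample, so treat the numbers as relative, not absolute.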


Production Configuration Templates

Nginx High-Traffic Logrotate Configuration

/var/log/nginx/*.log {
    daily
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
    maxsize 500M
    create 0640 www-data adm
    sharedscripts
    postrotate
        [ -f /var/run/nginx.pid ] && kill -USR1 $(cat /var/run/nginx.pid)
    endscript
}

Context: Optimized for high-concurrency environments; uses USR1 signal for zero-downtime log switching.
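Before deploying, the template can be syntax-checked with logrotate's debug mode, which parses the config and prints planned actions without rotating anything. The sketch below targets a scratch config so it is safe anywhere; on the real host you would run logrotate -d /etc/logrotate.d/nginx instead:

```shell
#!/bin/sh
# Dry-run a minimal logrotate config; -d parses and reports, never rotates.
conf=$(mktemp); log=$(mktemp)
cat > "$conf" <<EOF
$log {
    daily
    rotate 14
    missingok
    notifempty
}
EOF

if command -v logrotate >/dev/null 2>&1; then
    # -s points at a private state file so the system state is never touched.
    logrotate -d -s "$conf.state" "$conf" >/dev/null 2>&1 && parse=ok || parse=failed
else
    parse=skipped
fi
echo "config parse check: $parse"
rm -f "$conf" "$conf.state" "$log"
```

A non-zero exit from the dry run usually means a directive typo or an unreadable log path; fix it before the next scheduled window rather than discovering it in production.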

Apache Graceful Reload with Shared Scripts

/var/log/apache2/*.log {
    weekly
    rotate 52
    compress
    delaycompress
    missingok
    notifempty
    create 640 root adm
    sharedscripts
    postrotate
        if invoke-rc.d apache2 status > /dev/null 2>&1; then
            invoke-rc.d apache2 reload > /dev/null
        fi
    endscript
}

Context: Prevents multiple graceful reloads when rotating multiple vhost logs simultaneously.

Systemd Timer Override for Off-Peak Rotation

[Unit]
Description=Run logrotate for high-traffic logs

[Timer]
OnCalendar=*-*-* 03:00:00
AccuracySec=1m
Persistent=true

[Install]
WantedBy=timers.target

Context: Installed as a drop-in override for logrotate.timer (for example via systemctl edit logrotate.timer), this replaces the cron entry to ensure precise execution windows and prevent overlapping I/O during peak traffic.
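The OnCalendar expression can be validated before the override goes live; systemd-analyze prints the next elapse time for a well-formed spec and exits non-zero on a malformed one. The check is guarded so this sketch degrades gracefully on hosts without systemd:

```shell
#!/bin/sh
# Validate a timer calendar spec without touching any unit files.
spec='*-*-* 03:00:00'
if command -v systemd-analyze >/dev/null 2>&1; then
    systemd-analyze calendar "$spec" >/dev/null 2>&1 && check=valid || check=invalid
else
    check=skipped
fi
echo "OnCalendar '$spec' -> $check"
```

After installing the override, systemctl list-timers logrotate.timer confirms the next scheduled run lands in the intended window.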


Common Mistakes to Avoid

  • Relying on daily without maxsize for burst traffic, causing uncontrolled file growth.
  • Omitting sharedscripts, triggering multiple service reloads and temporary connection drops.
  • Using copytruncate on high-concurrency APIs, resulting in race conditions and partial writes.
  • Ignoring inode exhaustion from uncompressed .1 files on small-block filesystems.
  • Failing to test postrotate scripts in staging, leading to silent log pipeline failures.
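The last mistake above is cheap to avoid: a postrotate hook can be exercised against a scratch log with a private state file, so nothing under /var/log is touched. A minimal staging harness might look like this (the file names and the touch-based hook are illustrative):

```shell
#!/bin/sh
# Force a rotation in an isolated directory and confirm the hook fired.
work=$(mktemp -d)
printf 'entry\n' > "$work/app.log"
cat > "$work/app.conf" <<EOF
$work/app.log {
    rotate 3
    create
    postrotate
        touch $work/hook-ran
    endscript
}
EOF

if command -v logrotate >/dev/null 2>&1; then
    # -f forces rotation regardless of schedule; -s isolates the state file.
    logrotate -f -s "$work/state" "$work/app.conf"
    [ -f "$work/hook-ran" ] && result=hook-ok || result=hook-missing
else
    result=skipped
fi
echo "postrotate staging check: $result"
rm -rf "$work"
```

Swap the touch for the real reload command in staging, and a silent postrotate failure surfaces as a missing marker file instead of a missing day of logs.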

Frequently Asked Questions

Should I use copytruncate or create for high-traffic Nginx servers?
Always use create with a postrotate reload signal. copytruncate introduces a race condition where milliseconds of traffic are lost during the copy operation, corrupting crawl budget metrics and log analysis accuracy.
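The descriptor behavior behind this answer is easy to demonstrate in a scratch directory: a writer holding an open file descriptor follows the inode across a rename, so the newly created log stays empty until the writer reopens it, which is exactly what the postrotate signal triggers.

```shell
#!/bin/sh
# Simulate rename-based rotation against a writer that never reopens its fd.
dir=$(mktemp -d)
exec 3>>"$dir/access.log"                  # writer holds fd 3, like nginx does
echo "before rotation" >&3
mv "$dir/access.log" "$dir/access.log.1"   # logrotate's rename step
touch "$dir/access.log"                    # logrotate's `create` step
echo "after rotation" >&3                  # still lands in the *renamed* file
exec 3>&-                                  # close the writer's descriptor

new=$(wc -c < "$dir/access.log")
old=$(grep -c "" "$dir/access.log.1")
echo "new log bytes: $new, rotated file lines: $old"
rm -rf "$dir"
```

The new log stays at zero bytes while both writes accumulate in the rotated file, which is precisely the log-gap symptom described earlier.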

How do I prevent logrotate from blocking during peak hours?
Implement delaycompress to defer CPU-intensive compression to the next cycle, use maxsize to trigger rotation before files exceed 500MB, and schedule execution via systemd timers during verified low-traffic windows.

Why are my log analysis pipelines missing bot traffic after rotation?
Missing bot traffic indicates a file descriptor mismatch. The web server is writing to the old rotated file. Enforce create with proper permissions and ensure the postrotate block successfully signals the web server to open a new file descriptor.

Is zstd compression safe for logrotate in production?
Yes. Zstandard (zstd) delivers compression ratios comparable to or better than gzip at a fraction of the CPU cost. Use compresscmd /usr/bin/zstd with compressoptions -3 (and a matching compressext .zst) for a solid balance between storage savings and I/O performance.