Setting Up a GoAccess Real-Time Dashboard on Ubuntu

A GoAccess real-time dashboard turns a raw Nginx or Apache access log into a live HTML view that updates as requests arrive, so you can watch crawl activity, status-code spikes, and bot traffic the moment they happen instead of after a batch report. The catch is that a careless setup either silently parses nothing — empty graphs — or exposes a WebSocket port to the public internet. This guide deploys a persistent, hardened GoAccess WebSocket dashboard on Ubuntu 22.04/24.04 that survives log rotation and service crashes.

The objective is a systemd-managed daemon that ingests the combined log format correctly, streams live updates over a WebSocket bound to localhost, and isolates search-engine crawler traffic so you can read crawl budget signals directly. It builds on the broader log parsing workflows and CLI toolchains and pairs naturally with the Node.js and GoAccess integration patterns for alerting on top of the same stream.

Diagnosis: Empty Graphs and Port Conflicts

The two failure modes you hit on day one are an all-zero dashboard (the log format string does not match your log) and a refused WebSocket connection (something already owns the port). Confirm both before you write a service unit.

First, look at a real line so you can match the format exactly rather than guessing:

head -n 1 /var/log/nginx/access.log

Expected Output:

66.249.66.1 - - [15/Mar/2024:10:12:00 +0000] "GET /products HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

That +0000 timezone field and the bracketed timestamp are exactly where naive format strings break. Note the shape now; you will mirror it in log-format below.
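To make the shape concrete, the sketch below splits the sample line into the pieces the combined-format tokens capture. The regular expression is an approximation written for this guide, not GoAccess's own parser, and the function name split_combined is hypothetical.

```shell
#!/bin/sh
# Split one combined-format line into the fields GoAccess tokens capture.
# Illustrative regex only; GoAccess uses its own parser, not this pattern.
split_combined() {
  sed -nE 's/^([^ ]+) [^ ]+ [^ ]+ \[([^:]+):([^ ]+) ([^]]+)\] "([^"]*)" ([0-9]{3}) ([0-9]+|-) .*$/host=\1 date=\2 time=\3 tz=\4 status=\6 bytes=\7/p'
}

line='66.249.66.1 - - [15/Mar/2024:10:12:00 +0000] "GET /products HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
printf '%s\n' "$line" | split_combined
```

The date and time pieces are what date-format and time-format have to decode; the tz piece is the part the format string throws away. A line that prints nothing here has a different shape and needs a different format string.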

Next, confirm the default WebSocket port 7890 is free. A bound port produces a dashboard that loads but never receives live frames:

ss -tulpn | grep 7890

Expected Output:

(no output)

No output means the port is free. If a row prints, terminate the conflicting process or pick another port in the config. You can pair this quick check with the broader sweep techniques in the CLI one-liners for quick audits guide.
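A plain grep for 7890 also matches ports such as 17890 or 78901. As a stricter alternative, the sketch below compares the port field exactly. It assumes the column layout of ss -Htln (local address in the fourth column), and the function name listeners_on_port is hypothetical.

```shell
#!/bin/sh
# Read `ss -Htln` output on stdin and print the local address of every
# listener on the given port. Prints nothing if the port is free.
listeners_on_port() {
  awk -v port="$1" '{
    n = split($4, parts, ":")          # $4 is Local Address:Port
    if (parts[n] == port) print $4
  }'
}

# Typical use on the server (ss ships with iproute2):
#   ss -Htln | listeners_on_port 7890
```

An address of 127.0.0.1:7890 in the output after deployment is what you want to see; 0.0.0.0:7890 or *:7890 means the listener is reachable from other hosts.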

Concept: Why Real-Time Mode Needs a Persistent Process

Static GoAccess runs once and exits. Real-time mode (real-time-html) keeps the process alive, following the log as new lines are appended and pushing updated data to connected browsers over a WebSocket. That changes two things. First, the process must be supervised: if it dies, the dashboard freezes at the last frame with no error. Second, the WebSocket endpoint is a network listener, and anyone who can reach it can read your traffic data. The correct architecture is a supervised daemon bound to 127.0.0.1, fronted by an authenticated reverse proxy, never a raw port on 0.0.0.0.

The parsing side hinges on three directives working together: date-format, time-format, and log-format. GoAccess matches each line against the log-format template; a line that does not match is counted as a failed request and never reaches a panel, so a wrong template leaves the graphs empty. The %^ token ignores a field, which is how the template skips the timezone offset and the identity/auth fields that the combined format reserves but rarely populates.
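Before involving GoAccess at all, you can estimate how many lines would miss the template by counting lines that do not look like combined format. This is a rough sketch: the pattern stands in for the log-format template and is not GoAccess's matcher (it will, for example, reject user agents containing escaped quotes), and count_unmatched is a hypothetical helper name.

```shell
#!/bin/sh
# Count lines that do NOT look like combined format. The pattern is a rough
# stand-in for the log-format template, not GoAccess's actual matcher.
COMBINED_RE='^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'

count_unmatched() {
  # -v inverts the match, -c prints the number of selected lines
  grep -Evc "$COMBINED_RE" "$1"
}

# Typical use on the server:
#   count_unmatched /var/log/nginx/access.log
```

A result close to the total line count suggests the log is not in combined format at all, for instance because a custom log_format is configured in Nginx.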

Step-by-Step: Deploy the Dashboard

Step 1: Install GoAccess from the official Ubuntu repository. The distro package is recent enough for real-time HTML on 22.04 and 24.04.

sudo apt-get update && sudo apt-get install -y goaccess
goaccess --version | head -n 1

Expected Output (the version depends on your Ubuntu release):

GoAccess - 1.6.5

Step 2: Write a strict parsing config. Create /etc/goaccess/goaccess.conf. The log-format below matches the standard Nginx combined format. The addr and port directives bind the WebSocket server to localhost only, and ws-url tells the browser where to connect.

time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
real-time-html true
ws-url ws://127.0.0.1:7890
addr 127.0.0.1
port 7890

Explanation: date-format and time-format decode the date and time inside [15/Mar/2024:10:12:00 +0000]. The first %^ skips everything between the client IP and the opening bracket (the identity and user fields, usually "- -"), and the second %^ inside the brackets discards the +0000 offset. addr 127.0.0.1 is the single most important hardening line, because addr is the directive that decides which interface the WebSocket server listens on. ws-url binds nothing; it is the address the report page hands to the browser, so ws://127.0.0.1:7890 only works when the browser runs on the server itself or reaches it through an SSH tunnel.

Production Warning: Never set addr to 0.0.0.0 or a public IP, and do not leave it unset, since GoAccess then listens on all interfaces. An exposed GoAccess WebSocket port serves your full traffic stream (IPs, URLs, user agents) to anyone who connects, with no authentication. Always bind to 127.0.0.1 and expose the dashboard only through an authenticated HTTPS reverse proxy (covered in Verification).
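The warning above can be turned into a small lint step that runs before the service is started. The sketch below is an assumption-laden helper, not a GoAccess feature: config_is_loopback_only is a hypothetical name, and it only inspects the addr and ws-url lines of the file it is given.

```shell
#!/bin/sh
# Exit 0 only if the config sets addr to 127.0.0.1 and no addr/ws-url line
# mentions 0.0.0.0. A lint sketch for this guide, not part of GoAccess.
config_is_loopback_only() {
  conf="$1"
  # Require an explicit loopback bind.
  grep -Eq '^[[:space:]]*addr[[:space:]]+127\.0\.0\.1[[:space:]]*$' "$conf" || return 1
  # Reject any wildcard address on the listener or the advertised URL.
  if grep -Eq '^[[:space:]]*(addr|ws-url)[[:space:]].*0\.0\.0\.0' "$conf"; then
    return 1
  fi
  return 0
}

# Typical use on the server:
#   config_is_loopback_only /etc/goaccess/goaccess.conf || echo "unsafe bind"
```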

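Step 3: Validate the format against a sample before daemonizing. Run GoAccess once in batch mode and read the parsed-line counter from a JSON report. Pass the format on the command line instead of loading goaccess.conf: that file sets real-time-html true, which would start the WebSocket server and never exit.

goaccess /var/log/nginx/access.log \
  --log-format='%h %^[%d:%t %^] "%r" %s %b "%R" "%u"' \
  --date-format='%d/%b/%Y' --time-format='%H:%M:%S' \
  --no-global-config -o /tmp/test-report.json
grep -oE '"valid_requests": *[0-9]+' /tmp/test-report.json

Expected Output:

"valid_requests": <number of lines that parsed>

A count of 0, or a GoAccess error saying the format does not match, means the log-format is wrong for your log. Recheck the timestamp shape from Diagnosis before continuing. A successful match here is also where you confirm status codes parse, which feeds the understanding of HTTP status codes in server logs that the dashboard surfaces.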
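For scripted deployments, the same validation can act as a gate that stops the rollout when nothing parsed. The sketch below assumes a report written with a .json filename (GoAccess picks the output format from the extension) and assumes the valid_requests key of that JSON output; the helper names are hypothetical.

```shell
#!/bin/sh
# Print the valid_requests counter from a GoAccess JSON report, or nothing
# if the key is absent. Assumes the "valid_requests" key of the JSON output.
valid_requests() {
  grep -oE '"valid_requests": *[0-9]+' "$1" | head -n 1 | grep -oE '[0-9]+$'
}

# Succeed only when at least one log line parsed.
report_has_data() {
  n=$(valid_requests "$1")
  [ -n "$n" ] && [ "$n" -gt 0 ]
}

# Typical use on the server:
#   report_has_data /tmp/test-report.json || { echo "format mismatch"; exit 1; }
```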

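Step 4: Create a supervised systemd service. Write /etc/systemd/system/goaccess-realtime.service so the daemon restarts automatically whenever it exits, running as the unprivileged www-data user. That user must be able to read the access log and write the report. On a stock Ubuntu install /var/www/html is typically owned by root, so create /var/www/html/report.html in advance and chown it to www-data.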

[Unit]
Description=GoAccess Real-Time Log Dashboard
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/goaccess /var/log/nginx/access.log \
  --config-file=/etc/goaccess/goaccess.conf \
  -o /var/www/html/report.html
Restart=always
RestartSec=5
User=www-data

[Install]
WantedBy=multi-user.target
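Because the unit runs as www-data, a permissions problem on the report path shows up only as a restart loop. A pre-flight check along these lines can catch it earlier. This is a minimal sketch; can_write_report is a hypothetical helper, and it tests the permissions of whichever user runs it.

```shell
#!/bin/sh
# Succeed if the current user could create or overwrite the given file.
can_write_report() {
  target="$1"
  if [ -e "$target" ]; then
    [ -w "$target" ]                    # existing file must be writable
  else
    [ -w "$(dirname "$target")" ]       # otherwise the directory must be
  fi
}

# To test as the service account, save this as a script (the filename is
# up to you) and source it under sudo -u www-data before calling:
#   can_write_report /var/www/html/report.html
```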

Step 5: Enable and start the service. daemon-reload picks up the new unit; enable --now starts it and sets it to boot automatically.

sudo systemctl daemon-reload
sudo systemctl enable --now goaccess-realtime
systemctl is-active goaccess-realtime

Expected Output:

active

Once live, you can bridge these metrics into the Node.js and GoAccess integration layer for alert thresholds, or correlate the hourly view against a full measurement of crawl rate by hour from your logs.

Edge-Case Handling

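Log rotation severs the file descriptor. When logrotate renames the access log, a reader that keeps its descriptor open stays attached to the renamed file and the dashboard stalls. Make sure both Nginx and GoAccess pick up the new file after each rotation. Ubuntu's nginx package already ships /etc/logrotate.d/nginx, and logrotate rejects two stanzas that cover the same file, so adjust the existing stanza rather than appending a duplicate. A minimal stanza looks like this:

/var/log/nginx/access.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    postrotate
        [ -f /run/nginx.pid ] && kill -USR1 "$(cat /run/nginx.pid)"
        systemctl try-restart goaccess-realtime 2>/dev/null || true
    endscript
}

SIGUSR1 tells Nginx to reopen its log files so new requests land in the fresh access.log. It must go to Nginx, not to GoAccess: GoAccess does not document a SIGUSR1 handler, and a process without one is terminated by that signal. systemctl try-restart then restarts the dashboard only if it is already running, which makes it open the new file; the unit defines no ExecReload, so systemctl reload would fail. Recent GoAccess releases may also detect a replaced file on their own, but an explicit restart is the predictable option. Note that a restarted dashboard begins counting from the current log again. The delaycompress flag keeps the previous file readable for one cycle so no lines are lost mid-rotation. For deeper rotation tuning, see the log rotation strategies cluster.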
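The stall comes from the reader and the writer ending up on different inodes. The sketch below reproduces that in a temporary directory: a rename keeps the inode, while the recreated access.log gets a new one. It assumes GNU stat (the -c %i form shipped on Ubuntu), and the file names are only a demo.

```shell
#!/bin/sh
# Show that a rename keeps the inode while a recreated file gets a new one.
# Uses GNU stat (-c %i), as shipped on Ubuntu.
inode_of() {
  stat -c %i "$1"
}

demo_dir=$(mktemp -d)
echo "line 1" > "$demo_dir/access.log"
before=$(inode_of "$demo_dir/access.log")

mv "$demo_dir/access.log" "$demo_dir/access.log.1"   # what logrotate does
echo "line 2" > "$demo_dir/access.log"               # the server recreates the path

rotated=$(inode_of "$demo_dir/access.log.1")
current=$(inode_of "$demo_dir/access.log")
echo "before=$before rotated=$rotated current=$current"
rm -rf "$demo_dir"
```

The before and rotated values are equal, and current differs. A process that opened access.log before the rename is still reading the inode now named access.log.1, which is why it has to reopen the path.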

Crawler noise drowns the panels. A heavily crawled site fills the dashboard with bot hits that obscure human traffic. To read crawl budget in isolation, pre-filter the stream rather than parsing everything:

grep -i 'googlebot' /var/log/nginx/access.log | \
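  goaccess --log-format=COMBINED --no-global-config -o /var/www/html/googlebot.html -

This produces a static Googlebot-only report; piping a pre-filtered stream also cuts CPU on high-traffic hosts. The command names the predefined COMBINED format instead of loading goaccess.conf, because that file enables real-time mode and would try to start a second WebSocket server on port 7890.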
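For a quick crawl-budget reading without generating a report at all, the same filter can feed a status-code tally. This is a sketch under two assumptions: the log is in combined format, so the status code is the ninth space-separated field, and a user agent containing "googlebot" is taken at face value (the string can be spoofed). The function name is hypothetical.

```shell
#!/bin/sh
# Tally HTTP status codes for lines whose user agent mentions Googlebot.
# Field 9 is the status code in combined format when split on spaces.
googlebot_status_counts() {
  grep -i 'googlebot' "$1" \
    | awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' \
    | sort
}

# Typical use on the server:
#   googlebot_status_counts /var/log/nginx/access.log
```

Each output line is a status code followed by its hit count, so a growing share of 404 or 5xx lines points to crawl budget being spent on dead or failing URLs.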

Verification: Confirm the Live Stream and Secure Access

Confirm the WebSocket handshake completes and the dashboard receives live frames. Generate a request, then watch for the update:

curl -s -o /dev/null -w "%{http_code}\n" http://localhost/

Expected Output:

200

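Open the report in a browser running on the server itself (or through an SSH tunnel that forwards port 7890), open developer tools, and filter the Network tab by WS. A single WebSocket connection to 127.0.0.1:7890 should show inbound frames carrying JSON payloads each time a new request hits the log. To expose the dashboard safely to other machines, front it with an authenticated Nginx reverse proxy over HTTPS and change ws-url to the proxied public address, for instance wss://dashboard.example.com:443/goaccess/ (a placeholder hostname; check the ws-url notes in the GoAccess manual for your version):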

location /goaccess/ {
    auth_basic "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass http://127.0.0.1:7890/;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}

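Explanation: the location proxies the WebSocket upgrade to the GoAccess port and enforces HTTP basic auth, while the port itself stays on localhost. Place it inside the server block that terminates TLS, and protect the location that serves report.html with the same auth_basic settings, so that neither the page nor the live data is reachable without credentials.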

Common Mistakes

  • Binding the WebSocket to a public address. Setting addr to 0.0.0.0, or leaving it unset so that GoAccess listens on all interfaces, exposes the raw traffic stream with no authentication. Always bind to 127.0.0.1 and reverse-proxy with auth.
  • A log-format that silently does not match. GoAccess drops non-matching lines without error, so a wrong timestamp or quoting produces empty graphs. Validate in batch mode (Step 3) before daemonizing, and mirror the exact line shape from the Diagnosis output.
  • No rotation hook. Without a postrotate step that makes GoAccess reopen the log, it can keep reading the renamed file after rotation and the dashboard freezes until a manual restart. Wire that step into logrotate.

Frequently Asked Questions

How do I restrict GoAccess WebSocket access to internal users only?
Bind the WebSocket server to localhost with addr 127.0.0.1 and port 7890, then expose the dashboard exclusively through an Nginx reverse proxy that enforces HTTPS and auth_basic. The GoAccess port itself should never be reachable from outside the host.

Does real-time mode hurt server performance during traffic spikes?
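Impact is usually small. In real-time mode GoAccess parses only the lines appended since its last read rather than re-reading the whole file, so the cost follows the request rate, not the size of the log. CPU overhead is typically low, on the order of 2% on a standard VPS, though it rises with traffic. Pre-filtering with grep before piping further reduces work during peaks.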

How can I isolate Googlebot traffic for crawl budget analysis?
Pipe the log through grep -i googlebot before feeding GoAccess for a targeted view. For an all-bots view, GoAccess also offers the --crawlers-only option, which restricts the report to user agents it classifies as crawlers. The isolated view makes crawl-rate and status-code signals legible without noise from other traffic.

Part of the Node.js and GoAccess Integration series.