Parsing Malformed JSON Logs with Vector VRL

JSON access logs are only structured until they aren't. A truncated write during log rotation, a proxy that injects a raw IP prefix, or an application that occasionally logs a bare string all produce lines that break a naive parser. In Vector, a single unparseable line passed to the abort-on-error parse_json! fails the remap program for that event, so the line is either dropped or forwarded unparsed and never reaches your sinks as clean data. This guide shows how to parse JSON access logs defensively in Vector Remap Language (VRL), routing failures to a dead-letter sink instead of losing them, as part of the broader Vector.dev Pipeline Configuration workflow.

You will replace the abort-on-error parse_json! with the non-aborting parse_json and the , err = pattern, reroute dropped events to a dead-letter stream, and coerce fields (status to integer, timestamp to a real timestamp) so every event reaching your SEO analytics is clean and typed.

Diagnosis: One Bad Line Aborts the Transform

Your pipeline runs fine until a malformed line arrives, then vector top shows the transform's error counter climbing and downstream sinks starved. Tap the transform to see the raw culprit.

vector tap parse_json_logs --format json | head -n 2

Expected Output:

{"message":"{\"remote_addr\":\"66.249.66.1\",\"status\":\"200\",\"path\":\"/sitemap.xml\"}"}
{"message":"66.249.66.1 - - GET /robots.txt 200"}

The first line is valid JSON; the second is a plain Combined-style line that slipped into the same stream. Confirm the failure rate with the metrics endpoint:

curl -s http://localhost:8686/metrics | grep component_errors_total | grep parse_json_logs

Expected Output:

vector_component_errors_total{component_id="parse_json_logs",error_type="conversion_failed"} 137

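A non-zero conversion_failed count means parse_json! is raising an error on those 137 lines. What happens to each event then depends on the transform's drop_on_error setting: when it is true the event is discarded, and when it is false (the default in current Vector releases) the original event is forwarded unmodified, so the raw line reaches downstream sinks unparsed. Either way the error counter increments and your analytics never receive a clean, typed event for that line.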

Concept: Fallible vs Infallible VRL Functions

VRL marks functions that can fail as fallible. parse_json is fallible because input may not be valid JSON. You have three ways to handle it:

Form | Behavior on bad input | Use when
parse_json!(.message) | Aborts the whole remap program for that event | You are certain every line is valid JSON
parsed, err = parse_json(.message) | Returns the error in err; you branch on it | You want to recover or reroute
parse_json(.message) ?? {} | Falls back to a default on error | A sane default exists and loss is acceptable

The , err = pattern is the workhorse for malformed-log handling: it never aborts, so you keep full control over what happens to the failed event.
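The three forms can be compared side by side in a short VRL sketch. The variable names strict, parsed, and lenient are illustrative only and do not appear in the pipeline configured later; in a real program you would pick one form rather than run all three.

```vrl
# Form 1: abort-on-error. If .message is not valid JSON, the program
# fails for this event at this line.
strict = parse_json!(.message)

# Form 2: error capture. The program keeps running; err is null on
# success and holds the error message on failure.
parsed, err = parse_json(.message)
if err != null {
  # Decide here: log, tag the event, or abort deliberately.
  log("parse failed: " + err, level: "warn")
}

# Form 3: error coalescing. On failure the right-hand side is used.
lenient = parse_json(.message) ?? {}
```

Form 2 costs a few more lines than the others, but it is the only one that lets the program see why the parse failed before choosing what to do with the event.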

Step-by-Step Fix

Step 1: Replace the abort form with error capture. Swap parse_json! for the two-return-value parse_json so a bad line yields an err instead of aborting.

[transforms.parse_json_logs]
type = "remap"
inputs = ["web_access"]
reroute_dropped = true
source = '''
  parsed, err = parse_json(.message)
  if err != null {
    log("dropping unparseable line: " + err, level: "warn")
    abort
  }
  . = object!(parsed)
'''

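Explanation: When parse_json fails, err is non-null and we abort. Because reroute_dropped = true is set, aborted events are not discarded; they are emitted on the transform's .dropped output for a dead-letter sink. The object!(parsed) assertion guarantees the new event root is an object, not a bare string or array. Note that object! raises a runtime error rather than an abort when the line is valid JSON but not an object; such events reach .dropped only if drop_on_error = true is also set on the transform.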

Step 2: Route dropped events to a dead-letter sink. Aborted events flow to the <transform-id>.dropped port. Wire it to a separate sink so you can inspect and replay malformed lines instead of losing them.

[sinks.dead_letter]
type = "file"
inputs = ["parse_json_logs.dropped"]
path = "/var/log/vector/malformed-%Y-%m-%d.log"
encoding.codec = "json"

Expected Output (a line written to the dead-letter file):

{"message":"66.249.66.1 - - GET /robots.txt 200","metadata":{"dropped":{"reason":"abort","component_id":"parse_json_logs"}}}

The non-JSON line is preserved verbatim with drop metadata, so you can fix the upstream source or re-parse it later.

Production Warning: Without reroute_dropped = true, an abort permanently deletes the event. On a crawl-analysis pipeline that means silently discarding the exact Googlebot or Bingbot hits the malformed lines may contain, skewing crawl-budget totals with no recoverable trace. Always pair abort with reroute_dropped and a dead-letter sink, and alert on the dead-letter file's growth rate so a format change upstream does not quietly erode your data.

Step 3: Coerce and normalize fields after a successful parse. Valid JSON still arrives with strings where you need numbers and timestamps. Cast status to an integer and parse the timestamp to a real timestamp type so downstream filters and time-window aggregations work.

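source = '''
  # Parse defensively; abort sends the raw event to the .dropped output.
  parsed, err = parse_json(.message)
  if err != null { abort }
  . = object!(parsed)

  # "200" becomes 200; fall back to 0 when the value is not numeric.
  .status = to_int(.status) ?? 0
  # Fall back to the current time when the field is missing or malformed.
  .timestamp = parse_timestamp(.timestamp, format: "%Y-%m-%dT%H:%M:%S%z") ?? now()
  # Treat a missing or non-string user_agent as empty instead of erroring.
  .is_bot = match(string(.user_agent) ?? "", r'(?i)googlebot|bingbot|duckduckbot')
'''

Explanation: to_int(.status) ?? 0 converts the string "200" to the integer 200, defaulting to 0 on a non-numeric value so a status filter never fails. parse_timestamp(...) ?? now() yields a usable timestamp even when the field is missing or does not match the format. The match flags crawler user agents for routing; string(.user_agent) ?? "" replaces the stricter string!(.user_agent), which would raise an error on any event that lacks a user_agent field.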
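If a silent 0 default hides too much, the same error-capture pattern can record that the coercion failed. The following is a minimal sketch, not part of the original pipeline; the .status_valid field is a hypothetical name introduced here for illustration.

```vrl
# Capture the coercion error instead of coalescing it away.
status, status_err = to_int(.status)
if status_err != null {
  # Keep the numeric default so filters still work, but mark the event.
  .status = 0
  .status_valid = false   # hypothetical field, not in the original guide
} else {
  .status = status
  .status_valid = true
}
```

Downstream queries can then exclude events where .status_valid is false rather than counting them as status 0.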

Step 4: Validate the typed output. Tap the transform and confirm types and routing.

vector tap parse_json_logs --format json | head -n 1

Expected Output:

{"remote_addr":"66.249.66.1","status":200,"path":"/sitemap.xml","is_bot":true,"timestamp":"2026-06-19T08:14:02Z"}

status is now the integer 200 (no quotes) and timestamp is a real timestamp, while the malformed line has been diverted to the dead-letter file.

Edge-Case Handling

Double-encoded JSON (a JSON string inside the message). Some shippers wrap the access log in an outer envelope, so parse_json yields a string that is itself JSON. Detect it and parse again: if is_string(parsed) { parsed, err = parse_json(string!(parsed)) }. Guard the second parse with its own err check so a bare string that is not JSON is rerouted deliberately instead of failing later at object!.
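Put together, the double-parse could look like the sketch below. It assumes at most one extra layer of encoding and the reroute_dropped = true setting from Step 1; the is_object check is an addition for illustration so that arrays and other non-object values also take the abort path.

```vrl
parsed, err = parse_json(.message)
if err != null { abort }

# A string result suggests an inner JSON document; parse once more.
if is_string(parsed) {
  parsed, err = parse_json(string!(parsed))
  if err != null { abort }
}

# Anything that is still not an object (array, number, plain string)
# is sent to the dropped output rather than failing at object!.
if !is_object(parsed) { abort }
. = object!(parsed)
```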

Partial objects missing required fields. A truncated write can produce valid JSON that lacks status or path. Rather than abort, default the field and tag the event: if !exists(.status) { .status = 0; .partial = true }. Run this check before the to_int coercion from Step 3, which would otherwise mask a missing field with its own 0 default. Routing events where .partial == true to the dead-letter sink keeps your primary index clean while preserving the salvageable data for review. For deeper field-mapping patterns, see how to parse JSON access logs with jq, which mirrors the same normalization logic at the command line.
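A sketch of the tagging step for both required fields follows. It assumes the block runs right after . = object!(parsed) and before any coercion; setting .partial = false on complete events is a choice made here so the field is always present, not something the original pipeline requires.

```vrl
# Start from "complete" and flip the flag for each missing field.
.partial = false

if !exists(.status) {
  .status = 0
  .partial = true
}

# No default is invented for path; the event is only tagged.
if !exists(.path) {
  .partial = true
}
```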

Verification

Confirm malformed lines are rerouted and not lost by comparing the event count sent on the transform's dropped output against the dead-letter sink's received count.

curl -s http://localhost:8686/metrics | grep -E 'parse_json_logs.*dropped|received_events_total.*dead_letter'

Expected Output:

vector_component_sent_events_total{component_id="parse_json_logs",output="dropped"} 137
vector_component_received_events_total{component_id="dead_letter"} 137

The 137 previously-erroring lines now match exactly between the transform's dropped output and the dead-letter sink's received count: every malformed line is accounted for, none silently discarded.

Common Mistakes

  • Using parse_json! in production. The bang form fails the program for every bad line it meets. Under a traffic spike with even a small malformed fraction, those events are dropped or forwarded unparsed while the error counter climbs. Use parse_json with , err = and handle the error explicitly.
  • Calling abort without reroute_dropped = true. This deletes the event permanently with no trace. Always enable rerouting and attach a dead-letter sink so failures are inspectable and replayable.
  • Leaving status as a string. Numeric comparisons and range filters (.status >= 400) silently misbehave on string status codes. Coerce with to_int(...) ?? 0 immediately after parsing.

Frequently Asked Questions

What is the difference between parse_json and parse_json! in VRL?
parse_json! is the fallible-abort form: on invalid input it aborts the entire remap program for that event. parse_json returns two values, the result and an error, so you can branch on the error and recover or reroute instead of crashing the transform.

How do I keep malformed lines instead of dropping them?
Set reroute_dropped = true on the remap transform and connect a sink to its .dropped output. Events that hit abort are then emitted on that port and written to your dead-letter sink rather than discarded, so you can review or replay them.

Why coerce status codes to integers in VRL?
Many JSON log formats emit status as a quoted string, as in the "status":"200" sample shown in the diagnosis. Range filters and numeric aggregations like .status >= 400 behave incorrectly on strings. Casting with to_int(.status) ?? 0 guarantees a numeric type and a safe default for any non-numeric value.

Part of the Vector.dev Pipeline Configuration series.