WebPerf Snippets agent-first: structured returns and minimal tokens

In the previous article about WebPerf Snippets and Agent SKILLs I explained how the diagnostic scripts are organized as deterministic tools that the agent reads and executes in the browser. The next logical question is: what does the agent do with the result?

The answer, until recently, was: parse console text.

The problem with writing for humans when the reader is an agent

WebPerf Snippets write messages like this to the console:

🟢 LCP: 1.24s (good)
   Element: img.hero
   URL: hero.avif

For someone opening DevTools manually, that’s perfect. Colors, emojis, visual hierarchy. But when the consumer is an AI agent executing the script via evaluate_script() and receiving the output as plain text, that format stops being an advantage.

The agent has to locate the numeric value within the string, interpret the rating from the emoji, and extract additional attributes from variably indented lines. That is fragile parsing on top of a format designed for human reading.
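A hypothetical sketch of what that agent-side parsing looks like (the regex and field names are illustrative, not from the repository):

```javascript
// Illustrative only: the kind of brittle parsing an agent would need
// to recover structured data from the human-oriented console output.
function parseLcpLine(text) {
  // Depends on the exact spacing, unit suffix, and parenthesized rating —
  // any cosmetic tweak to the console format breaks it.
  const match = text.match(/LCP:\s*([\d.]+)s\s*\((\w[\w-]*)\)/);
  if (!match) return null;
  return {
    valueMs: Math.round(parseFloat(match[1]) * 1000),
    rating: match[2],
  };
}

parseLcpLine("🟢 LCP: 1.24s (good)"); // → { valueMs: 1240, rating: "good" }
```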

This situation is well documented in Rewrite your CLI for AI agents: interfaces for humans and interfaces for agents have different requirements. What makes a CLI readable for people (color, emojis, column alignment) generates unnecessary noise when the consumer is an LLM.

Structured returns from evaluate_script()

The core change is that each script now returns a JSON object from the IIFE itself:

// Before: agent had to parse "🟢 LCP: 1.24s (good)" from console
// After:
{
  script: "LCP",
  status: "ok",
  metric: "LCP",
  value: 1240,
  unit: "ms",
  rating: "good",
  thresholds: { good: 2500, needsImprovement: 4000 },
  details: {
    element: "img.hero",
    url: "hero.avif",
    sizePixels: 756000
  }
}

The agent reads the return value directly from evaluate_script(). No parsing. No dependency on console format.

The schema is documented in the repository and distinguishes three types of script.

Console output doesn’t change. The JSON return is additive: scripts remain useful in manual DevTools and are now also directly consumable by agents.
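A minimal sketch of the dual-interface pattern. The `rate()` helper and the hardcoded `value` are illustrative (in the real script the value would come from a `PerformanceObserver`); the returned fields follow the schema shown above:

```javascript
// Sketch: console output for humans, structured return for agents.
// evaluate_script() receives the return value of the IIFE directly.
const report = (() => {
  const thresholds = { good: 2500, needsImprovement: 4000 };
  const rate = (ms) =>
    ms <= thresholds.good ? "good"
    : ms <= thresholds.needsImprovement ? "needs-improvement"
    : "poor";

  const value = 1240; // illustrative; measured via PerformanceObserver in practice

  // Human-facing output — stripped by drop_console in the agent build.
  console.log(`LCP: ${value}ms (${rate(value)})`);

  // Agent-facing output — no parsing required.
  return {
    script: "LCP",
    status: "ok",
    metric: "LCP",
    value,
    unit: "ms",
    rating: rate(value),
    thresholds,
  };
})();
```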

Agent-first build with terser

If agents don’t need the console output, there’s no reason for that code to consume tokens in the scripts they load. The solution: strip it at build time.

The skills generator, scripts/generate-skills.js, now minifies each script with terser instead of copying it as-is:

// Before: copyFileSync(src, dest)
// After:
const result = await minify(source, {
  compress: {
    drop_console: true,
    pure_getters: true,
    passes: 2,
  },
  mangle: true,
  format: { comments: false },
});

drop_console: true eliminates all console.* calls. A snippet of 80 lines that spends half of them on console formatting becomes a single line of ~300 characters.

If minification fails, the build falls back to the literal copy. The process never breaks.
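The fallback can be sketched as a small wrapper. This is a sketch of the design, not the generator's actual code; `minifyFn` stands in for a call to terser's `minify` with the options above:

```javascript
// Sketch of a never-fail build step: try to minify, fall back to the
// literal source if the minifier throws or returns nothing usable.
async function buildScript(source, minifyFn) {
  try {
    const result = await minifyFn(source);
    if (result && result.code) return { code: result.code, minified: true };
  } catch (err) {
    // Swallow the error: a broken minifier must not break the build.
  }
  return { code: source, minified: false };
}
```

With terser, this would be called as `buildScript(src, (s) => minify(s, options))`, writing `result.code` to the skill directory either way.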

Dead code elimination: pure_getters + passes: 2

drop_console: true removes the console calls but leaves behind constants that were only used by those calls. Several scripts have a RATING object with emoji icons and colors for console.group():

const RATING = {
  good: { icon: "🟢", color: "#0CCE6A" },
  "needs-improvement": { icon: "🟡", color: "#FFA400" },
  poor: { icon: "🔴", color: "#FF4E42" },
};
const { icon, color } = RATING[rating];
console.group(`${icon} LCP: ${value}ms (${rating})`); // ← removed by drop_console

After the console.group() is removed, icon and color become unused. But terser, in its first pass, can’t determine that accessing RATING[rating] has no side effects, so it keeps the destructuring.

Two additional options solve this. pure_getters: true tells terser to assume that property accesses have no side effects, so RATING[rating] can be dropped once its result is unused. passes: 2 runs the compressor a second time, eliminating code that only became dead after the first pass removed the console calls.

The result: the entire RATING object, its destructuring, and its references disappear from the generated script.

// ❌ Without pure_getters + passes: 2 — RATING survives
const t = { good: { icon: "🟢", color: "#0CCE6A" }, ... }
const { icon: c, color: m } = t[l]  // c and m unused, but terser keeps them

// ✅ With pure_getters + passes: 2 — fully eliminated
(e(n), o.element)  // only the side-effecting parts remain

Token impact

As noted above, an 80-line snippet with elaborate console output becomes a single line of ~300 characters. With ~50 scripts distributed across 6 SKILLs, the cumulative reduction is significant when the agent loads several scripts in an analysis session.

Eliminating the RATING objects reduces each script further, removing constants that existed exclusively for visual formatting.

Traceability: every generated script points to its source

Each generated script includes a one-line header:

// snippets/CoreWebVitals/LCP.js | sha256:a3f1b2c9d4e5f678 | https://github.com/nucliweb/webperf-snippets/blob/main/snippets/CoreWebVitals/LCP.js
(()=>{...minified code...})();

The SHA-256 is computed from the original source file, not the minified output. This allows verifying at any time that the script installed in ~/.claude/skills/ corresponds exactly to a specific version of the snippet:

shasum -a 256 snippets/CoreWebVitals/LCP.js | cut -c1-16
# should match the hash in .claude/skills/webperf-core-web-vitals/scripts/LCP.js

Conclusion

The changes to returns and build are small in code but relevant in design. Diagnostic scripts now have two interfaces: the visual one (in the console) for manual auditing; and the structured one (in the return value) for agents consuming them programmatically. The build strips the first when generating the skills, leaving only what the agent needs. Fewer tokens, directly actionable data, traceability back to the source.
