I sometimes come across pages calling external AI services for features the browser could handle natively. The Chrome Summarizer API generates text summaries directly on the device, with no external server requests, no cost per token, and full privacy.
This post covers how I implemented it on this blog: the “AI Summary” button you can see just above this text.
What is the Chrome Summarizer API
The Summarizer API is part of the Chrome Built-in AI APIs, a set of APIs that expose language models running locally in the browser using Gemini Nano.
Unlike integrating the OpenAI API or similar, there is no network request: the model lives on the user’s device. The first use may require downloading the model (a few hundred MB), but subsequent runs are instant.
The API is available in Chrome 131+ and requires enabling the #summarizer-api-for-gemini-nano flag in chrome://flags.
Feature detection
The first step is checking whether the browser supports the API. This is pure progressive enhancement: if there’s no support, nothing is rendered.
// ❌ Bad: assume the API is available
const summarizer = await Summarizer.create();
// ✅ Good: check availability first
if (!("Summarizer" in window)) return;
const availability = await Summarizer.availability({ outputLanguage: "en" });
if (availability === "unavailable") return;
availability() returns one of four possible values:
- "available": the model is downloaded and ready
- "downloadable": the model needs to be downloaded before first use
- "downloading": a model download is already in progress
- "unavailable": not supported on this device or browser
Creating the summarizer
const summarizer = await Summarizer.create({
type: "key-points", // tl;dr | teaser | headline | key-points
format: "markdown",
length: "medium",
outputLanguage: "en",
sharedContext:
"Technical blog post about web performance and frontend development",
});
The outputLanguage parameter matters: without it, the browser logs a console warning asking you to specify it.
If the model needs downloading, create() accepts a monitor callback to track progress:
const summarizer = await Summarizer.create({
// ...options,
monitor(m) {
m.addEventListener("downloadprogress", e => {
const pct = Math.round((e.loaded / e.total) * 100);
console.log(`Downloading model: ${pct}%`);
});
},
});
Streaming for better UX
The API supports streaming, which lets you display the summary as it’s generated rather than waiting for the full result. Chunks are incremental (deltas), so they need to be accumulated:
// ❌ Bad: treat the last chunk as the complete result
for await (const chunk of stream) {
setSummary(chunk); // loses everything except the last delta
}
// ✅ Good: accumulate deltas
let result = "";
for await (const chunk of stream) {
result += chunk;
setSummary(result); // update UI progressively
}
Handling the token limit
The model has an input token limit (inputQuota). For long articles, check it before summarizing and truncate if needed:
const inputUsage = await summarizer.measureInputUsage(markdown);
let textToSummarize = markdown;
if (inputUsage > summarizer.inputQuota) {
const ratio = summarizer.inputQuota / inputUsage;
// 0.9 buffer to stay safely under the limit
textToSummarize = markdown.slice(
0,
Math.floor(markdown.length * ratio * 0.9)
);
}
In the implementation, a notice is shown when the summary covers only part of the article.
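The check and truncation can be isolated in a pure helper whose truncated flag drives that notice. A sketch (the function name and return shape are illustrative; usage and quota come from measureInputUsage() and inputQuota):

```typescript
// Truncate input so its estimated token usage fits under the quota.
// The 0.9 buffer compensates for tokenization not scaling exactly
// linearly with character count.
function truncateToQuota(
  text: string,
  usage: number,
  quota: number,
  buffer = 0.9
): { text: string; truncated: boolean } {
  if (usage <= quota) return { text, truncated: false };
  const ratio = quota / usage;
  const end = Math.floor(text.length * ratio * buffer);
  return { text: text.slice(0, end), truncated: true };
}
```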
Markdown as the data source
Instead of extracting plain text from the DOM, the component fetches the raw markdown at the moment the reader requests the summary:
<!-- PostDetails.astro: only a short URL in the hydration payload -->
<AISummary client:idle markdownUrl={markdownUrl} lang={lang} />
// Fetch happens only when the user requests the summary
const markdown = await fetch(markdownUrl).then(r => r.text());
This adds no cost to the initial load, and the file is served cached from the CDN. The model also receives structured markdown instead of plain text, which improves summary quality.
Exposing markdown for AI agents
The endpoint already needed for the component also lets AI agents access content in structured format without parsing HTML:
// src/pages/posts/[slug].md.ts
import type { APIRoute } from "astro";

// props.body (the post's raw markdown) is supplied by getStaticPaths
export const GET: APIRoute = ({ props }) => {
return new Response(props.body, {
headers: { "Content-Type": "text/markdown; charset=utf-8" },
});
};
Each post has its markdown version at /posts/slug.md. In the <head> of each post I include the discovery link so agents can find it:
<link rel="alternate" type="text/markdown" href="/posts/slug.md" />
On this blog
The full implementation uses React with Astro (client:idle to avoid blocking the initial load).
A few UX details worth noting:
- Summary cache: once generated, hiding and showing the panel again does not regenerate the summary
- No download message when the model is already available: the monitor is only attached if availability === "downloadable"
- Retry on error: the error state shows a button to regenerate without reloading the page
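The cache-plus-retry behavior can be sketched framework-free: a successful generation is memoized, while a failure leaves the cache empty so the retry button can simply call get() again. Names here are hypothetical, not taken from the actual component:

```typescript
// Memoize the first successful generation; a thrown error leaves the
// cache empty, so the next call (triggered by the retry button)
// generates again instead of returning a stale failure.
function createSummaryCache(generate: () => Promise<string>) {
  let cached: string | null = null;
  return {
    async get(): Promise<string> {
      if (cached !== null) return cached; // show/hide panel: no regeneration
      cached = await generate(); // on throw, cached stays null -> retry works
      return cached;
    },
  };
}
```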
Conclusion
The Chrome Summarizer API shows how browser-native AI APIs can improve the experience without adding external dependencies, at no cost and respecting privacy. The feature only appears where there is real support: progressive enhancement applied to AI.