I sometimes come across pages calling external AI services for features the browser could handle natively. The Chrome Summarizer API generates text summaries directly on the device, with no external server requests, no cost per token, and full privacy.
This post covers how I implemented it on this blog: the “AI Summary” button you can see just above this text.
What is the Chrome Summarizer API
The Summarizer API is part of the Chrome Built-in AI APIs, a set of APIs that expose language models running locally in the browser using Gemini Nano.
Unlike integrating the OpenAI API or similar, there is no network request: the model lives on the user’s device. The first use may require downloading the model (a few hundred MB), but subsequent runs are instant.
The API is available in Chrome 131+ and requires enabling the #summarizer-api-for-gemini-nano flag in chrome://flags.
Feature detection
The first step is checking whether the browser supports the API. This is pure progressive enhancement: if there’s no support, nothing is rendered.
// ❌ Bad: assume the API is available
const summarizer = await Summarizer.create();
// ✅ Good: check availability first
if (!("Summarizer" in window)) return;
const availability = await Summarizer.availability({ outputLanguage: "en" });
if (availability === "unavailable") return;
availability() returns one of four possible values:
- "available": the model is downloaded and ready
- "downloadable": the model needs to be downloaded before first use
- "downloading": a model download is already in progress
- "unavailable": not supported on this device or browser
Creating the summarizer
const summarizer = await Summarizer.create({
type: "key-points", // tl;dr | teaser | headline | key-points
format: "markdown",
length: "medium",
outputLanguage: "en",
sharedContext:
"Technical blog post about web performance and frontend development",
});
The outputLanguage parameter matters: without it, the browser logs a console warning asking you to specify it.
If the model needs downloading, create() accepts a monitor callback to track progress:
const summarizer = await Summarizer.create({
// ...options,
monitor(m) {
m.addEventListener("downloadprogress", e => {
const pct = Math.round((e.loaded / e.total) * 100);
console.log(`Downloading model: ${pct}%`);
});
},
});
Streaming for better UX
The API supports streaming, which lets you display the summary as it’s generated rather than waiting for the full result. Chunks are incremental (deltas), so they need to be accumulated:
// ❌ Bad: treat the last chunk as the complete result
for await (const chunk of stream) {
setSummary(chunk); // loses everything except the last delta
}
// ✅ Good: accumulate deltas
let result = "";
for await (const chunk of stream) {
result += chunk;
setSummary(result); // update UI progressively
}
Handling the token limit
The model has an input token limit (inputQuota). For long articles, check it before summarizing and truncate if needed:
const inputUsage = await summarizer.measureInputUsage(markdown);
let textToSummarize = markdown;
if (inputUsage > summarizer.inputQuota) {
const ratio = summarizer.inputQuota / inputUsage;
// 0.9 buffer to stay safely under the limit
textToSummarize = markdown.slice(
0,
Math.floor(markdown.length * ratio * 0.9)
);
}
In the implementation, a notice is shown when the summary covers only part of the article.
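The check and truncation can be isolated in a pure helper whose truncated flag drives that notice. A sketch (the function name and return shape are illustrative; usage and quota come from measureInputUsage() and inputQuota):

```typescript
// Truncate input so its estimated token usage fits under the quota.
// The 0.9 buffer compensates for tokenization not scaling exactly
// linearly with character count.
function truncateToQuota(
  text: string,
  usage: number,
  quota: number,
  buffer = 0.9
): { text: string; truncated: boolean } {
  if (usage <= quota) return { text, truncated: false };
  const ratio = quota / usage;
  const end = Math.floor(text.length * ratio * buffer);
  return { text: text.slice(0, end), truncated: true };
}
```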
Markdown as the data source
Instead of extracting plain text from the DOM, the component fetches the raw markdown at the moment the reader requests the summary:
<!-- PostDetails.astro: only a short URL in the hydration payload -->
<AISummary client:idle markdownUrl={markdownUrl} lang={lang} />
// Fetch happens only when the user requests the summary
const markdown = await fetch(markdownUrl).then(r => r.text());
This adds no cost to the initial load, and the file is served cached from the CDN. The model also receives structured markdown instead of plain text, which improves summary quality.
Exposing markdown for AI agents
The endpoint already needed for the component also lets AI agents access content in structured format without parsing HTML:
// src/pages/posts/[slug].md.ts
import type { APIRoute } from "astro";

// props.body (the post's raw markdown) is supplied by getStaticPaths
export const GET: APIRoute = ({ props }) => {
return new Response(props.body, {
headers: { "Content-Type": "text/markdown; charset=utf-8" },
});
};
Each post has its markdown version at /posts/slug.md. In the <head> of each post I include the discovery link so agents can find it:
<link rel="alternate" type="text/markdown" href="/posts/slug.md" />
On this blog
The full implementation uses React with Astro (client:idle to avoid blocking the initial load).
A few UX details worth noting:
- Summary cache: once generated, hiding and showing the panel again does not regenerate the summary
- No download message when the model is already available: the monitor is only attached if availability === "downloadable"
- Retry on error: the error state shows a button to regenerate without reloading the page
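The cache-plus-retry behavior can be sketched framework-free: a successful generation is memoized, while a failure leaves the cache empty so the retry button can simply call get() again. Names here are hypothetical, not taken from the actual component:

```typescript
// Memoize the first successful generation; a thrown error leaves the
// cache empty, so the next call (triggered by the retry button)
// generates again instead of returning a stale failure.
function createSummaryCache(generate: () => Promise<string>) {
  let cached: string | null = null;
  return {
    async get(): Promise<string> {
      if (cached !== null) return cached; // show/hide panel: no regeneration
      cached = await generate(); // on throw, cached stays null -> retry works
      return cached;
    },
  };
}
```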
Conclusion
The Chrome Summarizer API shows how browser-native AI APIs can improve the experience without adding external dependencies, at no cost and respecting privacy. The feature only appears where there is real support: progressive enhancement applied to AI.