
In an audit I ran into a pattern that looks harmless until you look at the network waterfall: two heavy images, **608 KB** and **403 KB**, served over **HTTP/1.1**, with the `Connection: Keep-Alive` header present and **no `Cache-Control` at all**. At first glance someone might think `Keep-Alive` is already "caching" something. It caches nothing. They are two different layers that solve different problems, and mixing them up costs you on every visit.

Let's look at what each header actually does, why `Keep-Alive` helps but is not enough on HTTP/1.1, how HTTP/2 changes things with multiplexing, and where `<link rel="preconnect">` fits in to get ahead of the handshake. At the end there is an interactive demo where you can walk through the four scenarios and see what changes, and what does not, in each one.

## The finding in the audit

Both responses shared the same headers; the one for the heavier image looked like this:

```http
❌ Bad: real audited response
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Length: 622592
Connection: Keep-Alive
```

Two things stand out:

1. **There is `Connection: Keep-Alive`**, so the connection is reused. Useful, but that is the connection; it does not store a single byte for the next visit.
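2. **There is no `Cache-Control` anywhere.** The response shown carries no `Expires`, `ETag` or `Last-Modified` either, so the browser has no freshness lifetime to rely on and no validator to send in a conditional request. In practice, visitors re-download about **1 MB** of images (608 KB + 403 KB) on **every visit**, and potentially again when moving between pages. A megabyte that could live in cache for months gets downloaded over and over.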
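
Both observations can be read straight off the headers. Here is a minimal sketch, assuming the headers are available as a plain object. The `auditResponse` function and its field names are hypothetical, invented for this illustration:

```js
// Split a comma-separated header value into lowercase tokens.
function tokens(value) {
  return (value || "").toLowerCase().split(",").map((t) => t.trim()).filter(Boolean);
}

// Report the connection side and the cache side independently.
function auditResponse(headers) {
  const h = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = String(value); // header names are case-insensitive
  }
  const cacheDirectives = tokens(h["cache-control"]);
  const maxAge = cacheDirectives.find((d) => d.startsWith("max-age="));
  return {
    keepsConnectionOpen: tokens(h["connection"]).includes("keep-alive"),
    maxAgeSeconds: maxAge ? Number(maxAge.slice("max-age=".length)) : null,
    immutable: cacheDirectives.includes("immutable"),
  };
}

const audited = {
  "Content-Type": "image/jpeg",
  "Content-Length": "622592",
  "Connection": "Keep-Alive",
};
console.log(auditResponse(audited));
```

For the audited response, the connection check passes while both cache fields come back empty, which is the whole finding in one object.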

The header that really matters for repeat visits was nowhere to be found:

```http
✅ Good: what the origin should serve
HTTP/2 200
content-type: image/jpeg
content-length: 622592
cache-control: public, max-age=31536000, immutable
```

The mental mistake behind this is usually thinking that `Keep-Alive` and `Cache-Control` are about the same thing. They are not. They operate on different layers.

## Network layer vs storage layer

Here is the distinction:

- **`Connection: Keep-Alive` manages the _pipe_.** It lives on the network layer. It decides whether the TCP connection we have already opened stays alive so we can reuse it for the next request, instead of closing it and doing the handshake again. Its effect lasts only **while that connection stays open**: within a page load and for whatever requests follow before the idle timeout closes it.
- **`Cache-Control` manages the _data_.** It lives on the client storage layer. It decides whether the response is stored in the browser cache (memory or disk) and for how long, so we do not have to request it again. Its effect is measured **across loads, across pages and across visits**.

An analogy: `Keep-Alive` is keeping the water pipe open so you do not have to reconnect the tap to the mains for every glass. `Cache-Control` is filling a jug and leaving it in the fridge so you do not even open the tap. They are complementary optimizations, not substitutes.

|                | `Connection: Keep-Alive`   | `Cache-Control`                 |
| -------------- | -------------------------- | ------------------------------- |
| Layer          | Network (TCP connection)   | Storage (client cache)          |
| What it reuses | The open pipe              | The bytes already downloaded    |
| Scope          | While the connection is open | Across loads and visits       |
| If missing     | Re-handshake per request   | Full re-download of the resource |

In the audited case we had the first but not the second. We were reusing the pipe, but throwing the data away after every visit. The cheapest, highest-impact fix was to add `Cache-Control` with a long `max-age` and `immutable` for versioned assets.
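
How the header gets added depends on the server or CDN in front of the origin, which the audit does not cover. As a minimal sketch, assuming versioned assets carry a content hash in the file name (for example `hero.3f9a1c2b.jpg`, a hypothetical name), a helper like the following could pick the policy. The `cacheControlFor` function, the hash pattern and the one-hour fallback for unversioned files are illustrative assumptions, not part of the audited setup:

```js
// Hypothetical naming convention: name.<8 or more hex chars>.ext
const HASHED_ASSET = /\.[0-9a-f]{8,}\.[a-z0-9]+$/i;

// One year in seconds: 31536000.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

function cacheControlFor(fileName) {
  if (HASHED_ASSET.test(fileName)) {
    // The URL changes whenever the content changes, so a cached copy never goes stale.
    return `public, max-age=${ONE_YEAR_SECONDS}, immutable`;
  }
  // Assumed fallback for unversioned files: cache briefly, since the URL can be reused.
  return "public, max-age=3600";
}

console.log(cacheControlFor("hero.3f9a1c2b.jpg"));
console.log(cacheControlFor("logo.jpg"));
```

The long `max-age` with `immutable` is only safe for the hashed branch; applying it to a file whose URL never changes would pin visitors to an old copy.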

## The impact of Keep-Alive on HTTP/1.1

Leaving caching aside, let's look at what `Keep-Alive` does during that first load, because there is a nuance here that many people take for granted and it is false.

The core problem with HTTP/1.1 is that **a connection handles one request at a time**: there is no multiplexing. To download the two images at once, the browser **opens two connections in parallel** to the assets domain (the limit is around 6 per origin). Each connection pays its own toll: a TCP handshake (SYN, SYN-ACK, ACK) and, over HTTPS, the TLS handshake too. That is two handshakes, in parallel, before it can download.

Here is the nuance: with two concurrent images, **`Connection: close` and `Connection: keep-alive` take the same time on this first load**. In both cases the browser opens two new connections and negotiates two handshakes; there is no previously open connection to assets to reuse. The difference is the lifecycle: with `close` the connections are closed when they finish; with `keep-alive` they stay open in the pool.

So where does `keep-alive` show its value? On the **next** request to that same origin: another image on scroll, a navigation, an API call. That request reuses an open connection and skips the handshake. It is exactly the same misunderstanding as with caching: just as `Keep-Alive` caches nothing, **it does not speed up the first load either**; it only helps the requests that come after it, for as long as the connection stays open.

Summing up HTTP/1.1 behavior:

- No multiplexing: one connection, one request at a time. Concurrency capped at around 6 connections per origin.
- `Connection: close`: the connection closes after each response. Zero reuse.
- `Connection: keep-alive`: same cost on the first load, but the connection stays open to reuse later.
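
The difference between the two headers can be sketched with a toy connection pool. This is a simplified model, not how a browser is implemented: it uses a fixed 100 ms handshake and download blocks of 220 ms and 150 ms (the didactic figures used by the demo further down), assumes each batch of requests is issued concurrently and stays under the 6-connection cap, and assumes kept-alive connections have not hit an idle timeout. The `simulate` function and the third 150 ms request are hypothetical:

```js
const HANDSHAKE_MS = 100;

// batches: list of batches; each batch lists download times (ms) requested concurrently.
function simulate(connectionHeader, batches) {
  const persistent = connectionHeader.toLowerCase() === "keep-alive";
  let idle = 0; // open connections waiting in the pool
  return batches.map((batch) => {
    let opened = 0;
    let reused = 0;
    let elapsedMs = 0;
    for (const downloadMs of batch) {
      let cost = downloadMs;
      if (idle > 0) {
        idle -= 1; // take an open connection: no handshake
        reused += 1;
      } else {
        opened += 1; // new connection: pay the handshake first
        cost += HANDSHAKE_MS;
      }
      elapsedMs = Math.max(elapsedMs, cost); // requests in a batch run in parallel
    }
    if (persistent) idle += batch.length; // connections go back to the pool
    return { opened, reused, elapsedMs };
  });
}

// First load (two images), then one more image requested later.
const firstLoadThenScroll = [[220, 150], [150]];
console.log(simulate("close", firstLoadThenScroll));
console.log(simulate("keep-alive", firstLoadThenScroll));
```

In this model both runs report 320 ms for the first batch; only the second batch differs (250 ms with `close`, 150 ms with `keep-alive`).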

## The move to HTTP/2 and the superpower of preconnect

With HTTP/2 this changes. It introduces **multiplexing** over **a single connection**: each resource travels as an independent _stream_, and many streams share the same connection at the same time. The two images no longer need two connections with two handshakes: they travel over the same one, in parallel, and the 6-connection-per-origin limit stops being the bottleneck. Concurrency is bounded instead by the server's maximum number of concurrent streams, which the specification recommends be no lower than 100.

This has a direct consequence: **`Connection: Keep-Alive` stops making sense on HTTP/2**. The protocol forbids connection-specific header fields such as `Connection` and `Keep-Alive`; the persistent, multiplexed connection is the default behavior. If you see `Keep-Alive` in a response, it is a hint that you are still on HTTP/1.1.
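
To see why the connection count matters more as pages grow, here is a rough sketch comparing the two protocols. It assumes browsers cap HTTP/1.1 at 6 connections per origin, treats requests as going out in lockstep "rounds" (real browsers start a new request as soon as any connection frees up), and ignores the server's concurrent-stream limit on HTTP/2. `planFetch` is a hypothetical helper:

```js
const MAX_H1_CONNECTIONS_PER_ORIGIN = 6;

// protocol: "http/1.1" or "h2" (the identifiers browsers report for each protocol).
function planFetch(protocol, resourceCount) {
  if (resourceCount === 0) return { connections: 0, rounds: 0 };
  if (protocol === "h2") {
    // Every resource is a stream on the same connection.
    return { connections: 1, rounds: 1 };
  }
  return {
    connections: Math.min(resourceCount, MAX_H1_CONNECTIONS_PER_ORIGIN),
    // Past the cap, requests queue up behind the ones in flight.
    rounds: Math.ceil(resourceCount / MAX_H1_CONNECTIONS_PER_ORIGIN),
  };
}

for (const count of [2, 30]) {
  console.log(count, planFetch("http/1.1", count), planFetch("h2", count));
}
```

With 2 resources the only difference is 2 connections versus 1; with 30, HTTP/1.1 needs 5 rounds over 6 connections while HTTP/2 still uses a single connection.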

That said, even on HTTP/2 there is a cost that multiplexing does not remove: **the initial handshake to the assets domain**. The browser does not find out it needs that domain until it parses the HTML and finds the images. Only then does it start resolving DNS, opening TCP and negotiating TLS. Those _round trips_ sneak onto the critical path, right before the downloads.

This is where `<link rel="preconnect">` comes in. It is a _resource hint_ that tells the browser: "you are going to need this origin, start opening the connection now".

```html
<!-- Plain <img> requests are not CORS, so this hint omits crossorigin -->
<link rel="preconnect" href="https://assets.example.com" />
```

Placed in the `<head>`, the hint lets the browser run the DNS lookup and the TCP/TLS handshake to `assets.example.com` **in parallel with downloading and parsing the HTML**. By the time it is ready to request the images, the connection is already open: the handshake disappears from the critical path and the downloads start right away.

A couple of nuances I checked in the audit:

- The `crossorigin` attribute matters. Use it when the resource is fetched in CORS mode (fonts, or images with `crossorigin`). If the `preconnect`'s `crossorigin` does not match how the resource is requested, the browser opens **two connections** and you waste the hint.
- `preconnect` costs resources (it keeps a socket open). Reserve it for the **2 or 3 truly critical origins**. For the rest, `dns-prefetch` is a cheaper alternative that only resolves DNS.
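
The matching rule in the first point can be sketched as a small helper that derives the hint from how the resource will be fetched. This is a simplification: it only distinguishes CORS from non-CORS requests and ignores the `use-credentials` variant of `crossorigin`. The function names and the `fonts.example.com` origin are hypothetical:

```js
// requestMode: "cors" for fonts or <img crossorigin>, "no-cors" for a plain <img>.
function preconnectTag(origin, requestMode) {
  const attribute = requestMode === "cors" ? " crossorigin" : "";
  return `<link rel="preconnect" href="${origin}"${attribute} />`;
}

// The preconnected socket can serve the request only when both sides agree on CORS.
function hintMatchesRequest(hintHasCrossorigin, requestMode) {
  return hintHasCrossorigin === (requestMode === "cors");
}

console.log(preconnectTag("https://assets.example.com", "no-cors"));
console.log(preconnectTag("https://fonts.example.com", "cors"));
console.log(hintMatchesRequest(true, "no-cors")); // false: a second connection is opened
```

Generating the tag from the request mode, instead of writing it by hand, keeps the hint and the actual request from drifting apart.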

### See it: the waterfall in four states

Switch between the four scenarios and compare how many TCP connections are opened, how many handshakes are on the critical path and the total time in each case. Pay special attention to where the red handshake bar appears, or disappears.

<figure>
  <iframe
    id="netwf-demo"
    src="/demos/network-waterfall-en.html"
    width="100%"
    height="720"
    style="border: none; border-radius: 8px; display: block;"
    title="Interactive network waterfall comparing HTTP/1.1 with Connection close, HTTP/1.1 with keep-alive, HTTP/2 and HTTP/2 with preconnect for loading two images of 608 KB and 403 KB"
    loading="lazy"
  ></iframe>
  <figcaption>Realistic model: document on the main origin and images on an assets domain. Times are didactic estimates with fixed blocks (handshake 100 ms, download proportional to weight: 608 KB ≈ 220 ms, 403 KB ≈ 150 ms) to isolate the effect of each optimization.</figcaption>
</figure>
<script>
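// Resize the iframe when the demo reports its content height via postMessage.
window.addEventListener('message', function(ev) {
  // The demo is served from this same origin; ignore messages from anywhere else.
  if (ev.origin !== window.location.origin) return;
  if (ev.data && typeof ev.data.netWaterfallHeight === 'number') {
    var f = document.getElementById('netwf-demo');
    // Add 24 px of extra room so the content is not clipped.
    if (f) f.style.height = (ev.data.netWaterfallHeight + 24) + 'px';
  }
});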
</script>

The interesting part is what does _not_ change. For two images, `close`, `keep-alive` and HTTP/2 take practically the same time on the first load (around **570 ms** in the model): the protocol change reduces **connections** (from 3 to 2) and **handshakes on the critical path** (from 2 to 1), but does not cut the total time. The only one that cuts it is `preconnect`, which moves the handshake forward and brings it down to **470 ms**. HTTP/2 multiplexing really shines when there are dozens of resources, where HTTP/1.1's 6-connection ceiling starts serializing requests. And note this is only the _first_ visit; with a good `Cache-Control`, the second one does not even touch the network for these images.
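
The totals above can be reproduced with simple arithmetic. The caption gives the handshake (100 ms) and the larger download (220 ms); the time spent fetching and parsing the document before the images are discovered is not stated, so the 250 ms below is inferred by subtraction from the 570 ms total and should be read as an assumption. The sketch also assumes both images download in parallel and that the preconnect handshake fits entirely inside the document phase. The scenario labels are hypothetical:

```js
const MODEL = {
  documentPhaseMs: 250, // assumption: 570 - 100 - 220, not stated in the demo
  handshakeMs: 100,
  largestDownloadMs: 220, // the 608 KB image; the 403 KB one finishes earlier
};

function firstLoadTotalMs(scenario) {
  // Only preconnect takes the assets handshake off the critical path.
  const handshakeMs = scenario === "h2+preconnect" ? 0 : MODEL.handshakeMs;
  return MODEL.documentPhaseMs + handshakeMs + MODEL.largestDownloadMs;
}

for (const scenario of ["h1-close", "h1-keep-alive", "h2", "h2+preconnect"]) {
  console.log(scenario, firstLoadTotalMs(scenario));
}
```

Under these assumptions the first three scenarios come out at 570 ms and the preconnect one at 470 ms: the saving is exactly one handshake.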

## Conclusion and best practices

Going back to the original finding, these were the fixes that applied to that real case:

- **Add `Cache-Control` to the images.** It is the highest-impact change and the one that was missing. For assets with a hash in the name: `cache-control: public, max-age=31536000, immutable`. This targets the layer `Keep-Alive` never touched: repeat visits.
- **Move the origin to HTTP/2** (or HTTP/3). Multiplexing removes application-level head-of-line blocking (HTTP/3 also removes it at the transport level) and makes `Keep-Alive` irrelevant. If you still see that header, you are still on HTTP/1.1.
- **Add `preconnect` to the assets domain**, with `crossorigin` set only if the resources are fetched in CORS mode, to get ahead of the TCP/TLS handshake. Limit it to the critical origins.
- **And the obvious one: optimize the images.** 608 KB and 403 KB are too much for two images. Modern formats<sup>1</sup> (AVIF, WebP) and correct sizing cut those bytes before the network even comes into play. The best request is the one you never make, and the best bytes are the ones you never send.

We will see `Keep-Alive` less and less: with HTTP/2 and HTTP/3 as the norm, the header does not even apply. But we still run into it in old applications and in infrastructure in countries with limited resources, so it is worth knowing how to read it when it shows up in an audit.

`Keep-Alive` and `Cache-Control` do not compete: they work on different layers. Once you understand that one manages the pipe and the other the data, you stop confusing a reused connection with a cached resource, and you start optimizing both at once.

## Notes

> 1. I call them "modern formats" for consistency with what Lighthouse reports in its audits, but **WebP** was released on [September 30, 2010](https://en.wikipedia.org/wiki/WebP) and **AVIF** on [February 19, 2019](https://en.wikipedia.org/wiki/AVIF). To me, a truly modern format is [JPEG XL](https://jpegxl.info/), which I plan to analyze in depth in a dedicated post.
