Keep-Alive is not Cache-Control: anatomy of a network waterfall

In an audit I ran into a pattern that seems harmless until you open the network waterfall: two heavy images, 608 KB and 403 KB, served over HTTP/1.1, with the Connection: Keep-Alive header present and no Cache-Control at all. At first glance someone might think Keep-Alive is already “caching” something. It caches nothing. They are two different layers that solve different problems, and mixing them up costs you on every visit.

Let’s look at what each header actually does, why Keep-Alive helps but is not enough on HTTP/1.1, how HTTP/2 changes things with multiplexing, and where <link rel="preconnect"> fits in to get ahead of the handshake. At the end there is an interactive demo where you can walk through the four scenarios and see what changes, and what does not, in each one.

The finding in the audit

Both responses shared the same headers; the one for the heavier image looked like this:

❌ Bad: real audited response
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Length: 622592
Connection: Keep-Alive

Two things stand out:

There is Connection: Keep-Alive, so the connection is reused. Useful, but that only concerns the connection; it does not store a single byte for the next visit.
There is no Cache-Control anywhere, and the response shown carries no validator (ETag or Last-Modified) either. The browser gets no explicit instruction to keep the images, so visitors can end up re-downloading about 1 MB on every visit, and even when moving between pages. A megabyte that could live in cache for months gets downloaded over and over.
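
The two observations above can be checked mechanically. The following is a minimal sketch, not a tool used in the audit: `audit_headers` is a hypothetical helper that receives the response headers as a plain dictionary and reports the same two findings.

```python
def audit_headers(headers: dict[str, str]) -> list[str]:
    """Return human-readable findings for one response's headers."""
    # HTTP header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    findings = []
    if normalized.get("connection", "").lower() == "keep-alive":
        findings.append("Connection is reused (keep-alive), but nothing is stored")
    if "cache-control" not in normalized:
        findings.append("No Cache-Control: the browser gets no explicit caching policy")
    return findings


# Headers copied from the audited response shown above.
audited = {
    "Content-Type": "image/jpeg",
    "Content-Length": "622592",
    "Connection": "Keep-Alive",
}
for finding in audit_headers(audited):
    print(finding)
```

Keeping the two checks separate mirrors the point of the finding: the presence of one header says nothing about the other.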

The header that really matters for repeat visits was nowhere to be found:

✅ Good: what the origin should serve
HTTP/2 200
content-type: image/jpeg
content-length: 622592
cache-control: public, max-age=31536000, immutable

The mental mistake behind this is usually thinking that Keep-Alive and Cache-Control are about the same thing. They are not. They operate on different layers.

Network layer vs storage layer

Here is the distinction:

Connection: Keep-Alive manages the pipe. It lives on the network layer. It decides whether the TCP connection we have already opened stays alive so we can reuse it for the next request, instead of closing it and doing the handshake again. Its effect lasts only while that connection stays open: it saves handshakes on later requests, never bytes.
Cache-Control manages the data. It lives on the client storage layer. It decides whether the response is stored on disk and for how long, so we do not have to request it again. Its effect is measured across loads, across pages and across visits.

An analogy: Keep-Alive is keeping the water pipe open so you do not have to reconnect the tap to the mains for every glass. Cache-Control is filling a jug and leaving it in the fridge so you do not even open the tap. They are complementary optimizations, not substitutes.

| | `Connection: Keep-Alive` | `Cache-Control` |
| --- | --- | --- |
| Layer | Network (TCP connection) | Storage (client cache) |
| What it reuses | The open pipe | The bytes already downloaded |
| Scope | While the connection stays open | Across loads and visits |
| If missing | Re-handshake per request | Full re-download of the resource |
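
To make the storage side concrete, here is a rough sketch of how a client could decide whether a stored response can be reused without touching the network. It is a simplification under stated assumptions: it only looks at the explicit `max-age`, `no-store` and `no-cache` directives, and ignores heuristic freshness, `Expires`, `Vary` and revalidation. The function names are hypothetical.

```python
def parse_cache_control(value: str) -> dict[str, str]:
    """Split a Cache-Control value into {directive: argument or ""}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip().lower()] = argument.strip().strip('"')
    return directives


def reusable_without_network(cache_control: str, age_seconds: int) -> bool:
    """True if a stored copy of this age may be served straight from cache."""
    directives = parse_cache_control(cache_control)
    # no-store forbids storing; no-cache requires revalidation before reuse.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age", "")
    return max_age.isdigit() and age_seconds < int(max_age)


thirty_days = 30 * 24 * 3600
print(reusable_without_network("public, max-age=31536000, immutable", thirty_days))
print(reusable_without_network("", thirty_days))  # no explicit policy at all
```

Nothing in this decision depends on whether a connection is open, which is the whole point of the right-hand column.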

In the audited case we had the first but not the second. We were reusing the pipe, but throwing the data away after every visit. The cheapest, highest-impact fix was to add Cache-Control with a long max-age and immutable for versioned assets.
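
As a sketch of that fix, the policy can be chosen from the file name. Everything here is an assumption for illustration: the naming convention (a hex content hash of at least 8 characters before the extension, such as `hero.3f9a1c2b.jpg`), the `cache_control_for` helper, and the choice of `no-cache` for unversioned files. Adapt the pattern to whatever your build tool emits.

```python
import re

ONE_YEAR_SECONDS = 31536000

# Assumed convention: "<name>.<hex hash, 8+ chars>.<ext>" or "<name>-<hash>.<ext>".
HASHED_NAME = re.compile(r"[.-][0-9a-f]{8,}\.[a-z0-9]+$")


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value for a static asset path."""
    if HASHED_NAME.search(path.lower()):
        # The URL changes whenever the content changes, so the response
        # can be cached for a year and never revalidated.
        return f"public, max-age={ONE_YEAR_SECONDS}, immutable"
    # Unversioned URL: keep a copy but revalidate it before every reuse.
    return "no-cache"


print(cache_control_for("/img/hero.3f9a1c2b.jpg"))
print(cache_control_for("/img/hero.jpg"))
```

The long `max-age` is only safe because the hash guarantees a new URL for new content; applying it to an unversioned path would pin stale files in visitors' caches.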

The impact of Keep-Alive on HTTP/1.1

Leaving caching aside, let’s look at what Keep-Alive does during that first load, because many people take something for granted here that is actually false.

The core problem with HTTP/1.1 is that a connection handles one request at a time: there is no multiplexing. To download the two images at once, the browser opens two connections in parallel to the assets domain (the limit is around 6 per origin). Each connection pays its own toll: a TCP handshake (SYN, SYN-ACK, ACK) and, over HTTPS, the TLS handshake too. That is two handshakes, in parallel, before it can download.

Here is the nuance: with two concurrent images, Connection: close and Connection: keep-alive take the same time on this first load. In both cases the browser opens two new connections and negotiates two handshakes; there is no previously open connection to assets to reuse. The difference is the lifecycle: with close the connections are closed when they finish; with keep-alive they stay open in the pool.

So where does keep-alive show its value? On the next request to that same origin: another image on scroll, a navigation, an API call. That request reuses an open connection and skips the handshake. It is exactly the same misunderstanding as with caching: just as Keep-Alive caches nothing, it does not speed up the first load either, only the ones after it.

Summing up HTTP/1.1 behavior:

No multiplexing: one connection, one request at a time. Concurrency capped at around 6 connections per origin.
Connection: close: the connection closes after each response. Zero reuse.
Connection: keep-alive: same cost on the first load, but the connection stays open to reuse later.
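
The behavior summarized above can be reproduced with a toy model. This is a didactic sketch, not a network simulator: it assumes a fixed 100 ms handshake per new connection, download times of 220 ms and 150 ms for the two images, no bandwidth contention, and fewer requests than the per-origin connection limit. `OriginConnections` is a hypothetical name.

```python
HANDSHAKE_MS = 100  # assumed fixed TCP + TLS setup cost per new connection


class OriginConnections:
    """Tracks idle HTTP/1.1 connections to a single origin."""

    def __init__(self, keep_alive: bool) -> None:
        self.keep_alive = keep_alive
        self.idle = 0  # open connections waiting in the pool

    def load(self, downloads_ms: list[int]) -> int:
        """Fetch the resources in parallel and return the elapsed time in ms."""
        costs = []
        for index, download in enumerate(downloads_ms):
            reused = index < self.idle  # an idle connection skips the handshake
            costs.append(download if reused else HANDSHAKE_MS + download)
        # keep-alive leaves the connections in the pool; close discards them.
        self.idle = max(self.idle, len(downloads_ms)) if self.keep_alive else 0
        return max(costs, default=0)


for keep_alive in (False, True):
    origin = OriginConnections(keep_alive)
    first_load = origin.load([220, 150])  # the two images, in parallel
    next_request = origin.load([150])     # e.g. another image on scroll
    print(f"keep_alive={keep_alive}: first={first_load} ms, next={next_request} ms")
```

In this model both modes take the same time on the first load; only the follow-up request differs, by exactly one handshake.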

The move to HTTP/2 and the superpower of preconnect

With HTTP/2 this changes. Over a single connection it introduces multiplexing: each resource travels as an independent stream, and many streams share the same connection at the same time. The two images no longer need two connections with two handshakes: they travel over the same one, in parallel, and the 6-connection-per-origin limit stops being the bottleneck because the browser no longer needs parallel connections to get concurrency.

This has a direct consequence: Connection: Keep-Alive stops making sense on HTTP/2. The header is not even valid in the protocol: HTTP/2 prohibits connection-specific header fields such as Connection and Keep-Alive, because the persistent, multiplexed connection is the default behavior. If you see Keep-Alive in a response, it is a hint that you are still on HTTP/1.1.

That said, even on HTTP/2 there is a cost that multiplexing does not remove: the initial handshake to the assets domain. The browser does not find out it needs that domain until it parses the HTML and finds the images. Only then does it start resolving DNS, opening TCP and negotiating TLS. Those round trips sneak onto the critical path, right before the downloads.

This is where <link rel="preconnect"> comes in. It is a resource hint that tells the browser: “you are going to need this origin, start opening the connection now”.

<link rel="preconnect" href="https://assets.example.com" crossorigin />

Placed in the <head>, the hint lets the browser run the DNS lookup and the TCP/TLS handshake to assets.example.com in parallel with downloading and parsing the HTML. By the time it is ready to request the images, the connection is already open: the handshake disappears from the critical path and the downloads can start right away.

A couple of nuances I checked in the audit:

The crossorigin attribute matters. Use it when the resource is fetched in CORS mode (fonts, or images with crossorigin). If the preconnect’s crossorigin does not match how the resource is requested, the preconnected socket goes unused, the browser opens a second connection for the actual request, and the hint is wasted.
preconnect costs resources (it keeps a socket open). Reserve it for the 2 or 3 truly critical origins. For the rest, dns-prefetch is a cheaper alternative that only resolves DNS.
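
Both nuances can be encoded in a small template helper. This is a sketch under assumptions: `resource_hints` is a hypothetical function, the caller passes origins ordered from most to least critical together with whether their resources are fetched in CORS mode, and the cutoff of 3 preconnects is just the upper end of the range suggested above.

```python
from html import escape

MAX_PRECONNECTS = 3  # assumed budget; the rest fall back to dns-prefetch


def resource_hints(origins: list[tuple[str, bool]]) -> list[str]:
    """Build <link> tags from (origin, fetched_in_cors_mode) pairs.

    Origins must be ordered from most to least critical.
    """
    tags = []
    for index, (origin, cors) in enumerate(origins):
        href = escape(origin, quote=True)
        if index < MAX_PRECONNECTS:
            # crossorigin must mirror how the real requests will be made.
            crossorigin = " crossorigin" if cors else ""
            tags.append(f'<link rel="preconnect" href="{href}"{crossorigin} />')
        else:
            tags.append(f'<link rel="dns-prefetch" href="{href}" />')
    return tags


for tag in resource_hints([
    ("https://assets.example.com", True),
    ("https://fonts.example.com", True),
    ("https://api.example.com", False),
    ("https://analytics.example.com", False),
]):
    print(tag)
```

All hosts except assets.example.com are placeholders invented for the example.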

See it: the waterfall in four states

Switch between the four scenarios and compare how many TCP connections are opened, how many handshakes are on the critical path and the total time in each case. Pay special attention to where the red handshake bar appears, or disappears.

Realistic model: document on the main origin and images on an assets domain. Times are didactic estimates with fixed blocks (handshake 100 ms, download proportional to weight: 608 KB ≈ 220 ms, 403 KB ≈ 150 ms) to isolate the effect of each optimization.

The interesting part is what does not change. For two images, close, keep-alive and HTTP/2 take practically the same time on the first load (around 570 ms in the model): the protocol change reduces connections (from 3 to 2) and handshakes on the critical path (from 2 to 1), but does not cut the total time. The only one that cuts it is preconnect, which moves the handshake forward and brings it down to 470 ms. HTTP/2 multiplexing really shines when there are dozens of resources, where HTTP/1.1’s 6-connection ceiling starts serializing requests. And note this is only the first visit; with a good Cache-Control, the second one does not even touch the network for these images.
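
The numbers in the model can be reproduced with a few lines of arithmetic. This is a reconstruction under assumptions, not the demo's code: the 250 ms "document phase" (everything before the browser discovers the images) is not stated anywhere and is inferred by subtraction (570 - 100 - 220), and the preconnect scenario is assumed to run on HTTP/2.

```python
from dataclasses import dataclass

HANDSHAKE_MS = 100        # fixed block from the model
IMAGES_MS = [220, 150]    # 608 KB and 403 KB downloads
DOCUMENT_PHASE_MS = 250   # inferred by subtraction, not given in the text


@dataclass(frozen=True)
class Scenario:
    name: str
    multiplexed: bool  # True for HTTP/2: one connection carries both images
    preconnect: bool   # True if the assets handshake starts at t=0


def simulate(scenario: Scenario) -> tuple[int, int]:
    """Return (connections opened, total first-visit time in ms)."""
    asset_connections = 1 if scenario.multiplexed else len(IMAGES_MS)
    # Without preconnect the handshake waits until the HTML reveals the images.
    handshake_start = 0 if scenario.preconnect else DOCUMENT_PHASE_MS
    downloads_start = max(DOCUMENT_PHASE_MS, handshake_start + HANDSHAKE_MS)
    # Parallel handshakes and parallel downloads: only the slowest one counts.
    return 1 + asset_connections, downloads_start + max(IMAGES_MS)


scenarios = [
    Scenario("HTTP/1.1 close", multiplexed=False, preconnect=False),
    Scenario("HTTP/1.1 keep-alive", multiplexed=False, preconnect=False),
    Scenario("HTTP/2", multiplexed=True, preconnect=False),
    Scenario("HTTP/2 + preconnect", multiplexed=True, preconnect=True),
]
for scenario in scenarios:
    connections, total_ms = simulate(scenario)
    print(f"{scenario.name}: {connections} connections, {total_ms} ms")
```

Note that close and keep-alive are deliberately identical inputs: on a first visit with two parallel images, the model has nothing to distinguish them.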

Conclusion and best practices

Going back to the original finding, these were the fixes that applied to that real case:

Add Cache-Control to the images. It is the highest-impact change and the one that was missing. For assets with a hash in the name: cache-control: public, max-age=31536000, immutable. This targets the layer Keep-Alive never touched: repeat visits.
Move the origin to HTTP/2 (or HTTP/3). Multiplexing removes application-level Head-of-Line blocking and makes Keep-Alive irrelevant. If you still see that header, you are still on HTTP/1.1.
Add preconnect to the assets domain, with the correct crossorigin, to get ahead of the TCP/TLS handshake. Limit it to the critical origins.
And the obvious one: optimize the images. 608 KB and 403 KB are too much for two images. Modern formats¹ (AVIF, WebP) and correct sizing cut those bytes before the network even comes into play. The best request is the one you never make, and the best bytes are the ones you never send.

We will see Keep-Alive less and less: with HTTP/2 and HTTP/3 as the norm, the header does not even apply. But we still run into it in old applications and in infrastructure in countries with limited resources, so it is worth knowing how to read it when it shows up in an audit.

Keep-Alive and Cache-Control do not compete: they work on different layers. Once you understand that one manages the pipe and the other the data, you stop confusing a reused connection with a cached resource, and you start optimizing both at once.

Notes

¹ I call them “modern formats” for consistency with what Lighthouse reports in its tests, but WebP was released on September 30, 2010 and AVIF on February 19, 2019. To me, a truly modern format is JPEG XL, which I plan to analyze in depth in a dedicated post.