In the previous post, we explored the usefulness of the Last-Modified response header and the If-Modified-Since request header. They work really well when dealing with an endpoint returning a file.
What about data retrieved from a database or assembled from different sources?
Response | Request | Value example |
---|---|---|
Last-Modified | If-Modified-Since | Wed, 15 Nov 2023 19:18:46 GMT |
ETag | If-None-Match | 75e7b6f64078bb53b7aaab5c457de56f |
Here too, we have a pair of headers. One is returned by the server (ETag), while the other is sent by the requester (If-None-Match). The value is a hash generated from the content of the response.
If you want to jump straight to the headers, skip to the Endpoint section. Otherwise, have a look at (but don't spend too much time on) the implementation.
Preparation
For simplicity, we use an in-memory DB. It is exposed via the endpoint /db. It contains a list of posts. Each post contains a title and a tag. Posts can be added via POST, and modified via PATCH.
Retrieval is via a GET function, which optionally filters by tag.
src/db.mjs
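import { getJSONBody } from "./utils.mjs";

// In-memory "database": the data is lost when the process restarts.
const POSTS = [
  { title: "Caching", tag: "code" },
  { title: "Headers", tag: "code" },
  { title: "Dogs", tag: "animals" },
];

// Returns all posts, or only those matching the given tag.
export function GET(tag) {
  let posts = POSTS;
  if (tag) posts = posts.filter((post) => post.tag === tag);
  return posts;
}

export default async function db(req, res) {
  switch (req.method) {
    case "POST": {
      const [body, err] = await getJSONBody(req);
      if (err) {
        res.writeHead(500).end("Something went wrong");
        return;
      }
      POSTS.push(body);
      res.writeHead(201).end();
      return;
    }
    case "PATCH": {
      const [body, err] = await getJSONBody(req);
      if (err) {
        res.writeHead(500).end("Something went wrong");
        return;
      }
      // Guard against an index that does not point to an existing post.
      const post = POSTS.at(body.index);
      if (!post) {
        res.writeHead(404).end("Post not found");
        return;
      }
      post.title = body.title;
      res.writeHead(200).end();
      return;
    }
    default:
      // Any other method is not supported by this endpoint.
      res.writeHead(405).end();
  }
}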
src/utils.mjs
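export function getURL(req) {
  // req.url only holds the path and query, so a base is needed to parse it.
  return new URL(req.url, `http://${req.headers.host}`);
}

// Resolves to a [body, err] pair instead of rejecting.
export function getJSONBody(req) {
  return new Promise((resolve) => {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("error", (err) => resolve([null, err]));
    req.on("end", () => {
      // JSON.parse throws on malformed input; report it as an error value.
      try {
        resolve([JSON.parse(body), null]);
      } catch (err) {
        resolve([null, err]);
      }
    });
  });
}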
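The endpoint shown later also imports a getView helper from this file, but the post never lists it. Below is a minimal sketch of what it might look like. It assumes that the templates live in src/views/ as .html files and that the helper returns a [html, err] pair like getJSONBody does. Both are guesses based on how the helper is called, not the author's actual code.

```javascript
// src/utils.mjs (hypothetical addition, not part of the original listing)
import { readFile } from "node:fs/promises";

// Assumption: templates are stored as src/views/<name>.html, next to this file.
const VIEWS_DIR = new URL("./views/", import.meta.url);

// Reads a template and resolves to a [html, err] pair instead of rejecting.
export async function getView(name) {
  try {
    const html = await readFile(new URL(`${name}.html`, VIEWS_DIR), "utf8");
    return [html, null];
  } catch (err) {
    return [null, err];
  }
}
```

With this shape, a missing template surfaces as an error value that the endpoint can turn into a 500 response, rather than as an unhandled rejection.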
Endpoint
By registering the db, we will be able to modify the content of the responses in real time, appreciating the usefulness of ETag.
Also, let's create and register the /only-etag endpoint.
// src/index.mjs
import { createServer } from "http";
import db from "./src/db.mjs";
import onlyETag from "./src/only-etag.mjs";
import { getURL } from "./src/utils.mjs";

createServer(async (req, res) => {
  switch (getURL(req).pathname) {
    case "/only-etag":
      return await onlyETag(req, res);
    case "/db":
      return await db(req, res);
    default:
      // Unknown routes (e.g. /favicon.ico) would otherwise never get a response.
      res.writeHead(404).end();
  }
}).listen(8000, "127.0.0.1", () =>
  console.info("Exposed on http://127.0.0.1:8000")
);
The onlyETag endpoint accepts an optional query parameter tag. If present, it is used to filter the retrieved posts.
Then, the template is loaded into memory.
src/views/posts.html
<html>
  <body>
    <h1>Tag: %TAG%</h1>
    <ul>%POSTS%</ul>
    <form method="GET">
      <input type="text" name="tag" id="tag" autofocus />
      <input type="submit" value="filter" />
    </form>
  </body>
</html>
When submitted, the form uses the current route (/only-etag) as its action, appending the input's name attribute as a query parameter. For example, typing code in the input and submitting the form results in GET /only-etag?tag=code. No JavaScript required!
And the posts are injected into the template.
// src/only-etag.mjs
import * as db from "./db.mjs";
import { getURL, getView, createETag } from "./utils.mjs";

export default async (req, res) => {
  res.setHeader("Content-Type", "text/html");

  // Retrieve the posts, filtered by tag when the query parameter is present.
  const tag = getURL(req).searchParams.get("tag");
  const posts = db.GET(tag);

  // Load the template.
  let [html, errView] = await getView("posts");
  if (errView) {
    res.writeHead(500).end("Internal Server Error");
    return;
  }

  // Fill the template placeholders.
  html = html.replace("%TAG%", tag ?? "all");
  html = html.replace(
    "%POSTS%",
    posts.map((post) => `<li>${post.title}</li>`).join("\n")
  );

  // Hash the final markup and expose it as the ETag.
  res.setHeader("ETag", createETag(html));
  res.writeHead(200).end(html);
};
As you can see, before the response is dispatched, the ETag is generated and set in the ETag response header.
// src/utils.mjs
import { createHash } from "crypto";

// The ETag is the MD5 hex digest of the response body.
export function createETag(resource) {
  return createHash("md5").update(resource).digest("hex");
}
Changing the content of the resource changes the Entity Tag.
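A quick way to convince yourself of this is to hash a few strings directly. The snippet below repeats createETag so that it runs on its own; the sample markup is made up for illustration.

```javascript
import { createHash } from "node:crypto";

// Same function as in src/utils.mjs, repeated to keep the snippet standalone.
function createETag(resource) {
  return createHash("md5").update(resource).digest("hex");
}

const before = createETag("<li>Caching</li>");
const same = createETag("<li>Caching</li>");
const after = createETag("<li>Amazing Caching</li>");

console.log(before === same); // true: identical content, identical tag
console.log(before === after); // false: the content changed, so did the tag
```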
When you perform the request from the browser, you can inspect the response headers in the Network tab of the Developer Tools.
HTTP/1.1 200 OK
Content-Type: text/html
ETag: 4775245bd90ebbda2a81ccdd84da72b3
If you refresh the page, you'll notice the browser adding the If-None-Match header to the request. The value is, of course, the one it received before.
GET /only-etag HTTP/1.1
If-None-Match: 4775245bd90ebbda2a81ccdd84da72b3
As we did in the previous post for Last-Modified and If-Modified-Since, let's instruct the endpoint to deal with If-None-Match.
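import * as db from "./db.mjs";
import { getURL, getView, createETag } from "./utils.mjs";

export default async (req, res) => {
  res.setHeader("Content-Type", "text/html");

  // Retrieve the (filtered) posts, as seen before.
  const tag = getURL(req).searchParams.get("tag");
  const posts = db.GET(tag);

  // Load the template, as seen before.
  let [html, errView] = await getView("posts");
  if (errView) {
    res.writeHead(500).end("Internal Server Error");
    return;
  }

  // Fill the template, as seen before.
  html = html.replace("%TAG%", tag ?? "all");
  html = html.replace(
    "%POSTS%",
    posts.map((post) => `<li>${post.title}</li>`).join("\n")
  );

  const etag = createETag(html);
  res.setHeader("ETag", etag);

  // If the client already holds this exact version, skip the body.
  const ifNoneMatch = new Headers(req.headers).get("If-None-Match");
  if (ifNoneMatch === etag) {
    res.writeHead(304).end();
    return;
  }

  res.writeHead(200).end(html);
};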
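The strict equality check above works because the server emits a bare hash and the browser echoes it back verbatim. The HTTP specification (RFC 9110) is a bit broader. It defines entity tags as quoted strings, optionally prefixed with W/ for weak validators. It also allows If-None-Match to carry a comma-separated list or the wildcard *. A more tolerant comparison could be sketched as follows. matchesETag and normalize are hypothetical names, and the naive split on commas assumes tags that contain no commas, which holds for hex digests.

```javascript
// Hypothetical helper: strips the weak prefix and the surrounding quotes, if any.
function normalize(tag) {
  return tag.trim().replace(/^W\//, "").replace(/^"(.*)"$/, "$1");
}

// Hypothetical helper: weak comparison of an If-None-Match value against the current ETag.
function matchesETag(ifNoneMatch, etag) {
  if (!ifNoneMatch) return false;
  // The wildcard matches any current representation.
  if (ifNoneMatch.trim() === "*") return true;
  const current = normalize(etag);
  // Assumption: tags contain no commas, so a plain split is enough.
  return ifNoneMatch
    .split(",")
    .some((candidate) => normalize(candidate) === current);
}

console.log(matchesETag('"abc"', "abc")); // true
console.log(matchesETag('W/"abc", "def"', '"def"')); // true
console.log(matchesETag('"xyz"', "abc")); // false
```

For the rest of this post, the plain equality check is all we need.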
Indeed, subsequent requests for the same resource return 304 Not Modified, instructing the browser to use the previously stored resource. Let's request:
- /only-etag three times in a row;
- /only-etag?tag=code twice;
- /only-etag?tag=animals twice;
- /only-etag, without tag, once again.
The query parameter changes the response, and therefore the ETag.
Notice the last request. It does not matter that there have been other requests in the meantime; the browser keeps a map of requests (including the query parameters) and ETags.
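To make that map concrete, here is a deliberately simplified model of what the client does. Real browser caches are far more involved. ValidatorCache and its method names are hypothetical and only illustrate the bookkeeping: one entry per full URL, query string included.

```javascript
// Simplified, hypothetical model of a client-side store of validators.
class ValidatorCache {
  #entries = new Map(); // full URL (query string included) -> { etag, body }

  // Headers to attach to the next request for this URL.
  requestHeaders(url) {
    const entry = this.#entries.get(url);
    return entry ? { "If-None-Match": entry.etag } : {};
  }

  // Decide which body to render, given the server's answer.
  resolve(url, { status, etag, body }) {
    if (status === 304) return this.#entries.get(url).body; // reuse the stored copy
    if (etag) this.#entries.set(url, { etag, body }); // remember the new version
    return body;
  }
}

const cache = new ValidatorCache();
const url = "/only-etag?tag=code";

// First visit: nothing is stored, so no validator is sent.
console.log(cache.requestHeaders(url));
cache.resolve(url, { status: 200, etag: "abc", body: "<ul></ul>" });

// Second visit: the stored ETag is sent, and a 304 is served from the stored body.
console.log(cache.requestHeaders(url));
console.log(cache.resolve(url, { status: 304 })); // <ul></ul>

// A different query string is a different entry, with no validator yet.
console.log(cache.requestHeaders("/only-etag?tag=animals"));
```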
Detect entity change
To further underscore the significance of this feature, let's add a new post to the DB from another process.
curl -X POST http://127.0.0.1:8000/db \
-d '{ "title": "ETag", "tag": "code" }'
And request /only-etag?tag=code again.
After the db has been updated, the same request generated a different ETag. Thus, the server sent the client a new version of the resource, with a newly generated ETag. Subsequent requests will fall back to the expected behavior.
The same happens if we modify an element of the response.
curl -X PATCH http://127.0.0.1:8000/db \
-d '{ "title": "Amazing Caching", "index": 0 }'
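The sequence described in this section can also be replayed without a browser or a running server. The sketch below is self-contained: respond is a hypothetical stand-in for the /only-etag handler that returns only the status and the ETag, and the posts array plays the role of the in-memory DB.

```javascript
import { createHash } from "node:crypto";

const posts = [{ title: "Caching", tag: "code" }];

const createETag = (resource) =>
  createHash("md5").update(resource).digest("hex");

// Hypothetical stand-in for the endpoint: renders, hashes, compares.
function respond(ifNoneMatch) {
  const html = posts.map((post) => `<li>${post.title}</li>`).join("\n");
  const etag = createETag(html);
  return { status: ifNoneMatch === etag ? 304 : 200, etag };
}

const first = respond(undefined); // nothing cached yet
const second = respond(first.etag); // content unchanged
posts.push({ title: "ETag", tag: "code" }); // what the curl POST does
const third = respond(first.etag); // the validator is now stale
const fourth = respond(third.etag); // the validator is fresh again

console.log(first.status, second.status, third.status, fourth.status);
// 200 304 200 304
```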
ETag is the more versatile solution: being content-based, it applies regardless of the data type. Keep in mind, however, that the server must still retrieve and assemble the response, hash it, and compare the result with the received value.
Thanks to another header, Cache-Control, it is possible to reduce the number of requests the server has to process.