Introduction
I continue the exploration of authentication methods, following the article about session-based authentication, by focusing on the role of tokens and highlighting their advantages over session IDs in certain scenarios. While the term "token" might sound familiar to you, let's look at how tokens evolved and which specific challenges they address where session IDs prove to be less suitable.
The topics I'll cover provide a basic foundation for authentication systems that use JWTs:
- what the benefits of such a method are compared to session IDs
- what the structure of a JWT is
- what is required to create a token
- how to send it to the client and how the client can store it
- how to pass the token within requests
- how the server verifies the token and sends the response
Evolution
I want to describe a scenario to illustrate where session cookies become harder to handle.
Let’s imagine an e-commerce application. In its early days everything ran on one monolithic server, where both frontend and backend were hosted. The server had a database to store user sessions and for every user a session cookie was created. So far, the cookie mechanism worked smoothly.
The application gained popularity and the traffic encountered huge spikes. To fix that, load balancers were introduced to distribute requests across multiple servers.
But what if a user’s subsequent requests landed on a different server that doesn’t have their session data? A workaround was developed: to have “sticky sessions”, where the load balancer routed a user’s requests to the same server, but that reduced scaling flexibility, which was a major drawback.
Another improvement mechanism was added: horizontal scaling with microservices. That implied decomposing parts of the application into smaller units running on their own servers. Below is a simplistic representation of microservices.
But how do these servers coordinate the user sessions? One option is to replicate the session data across all servers, which adds overhead, synchronization delays, and potential inconsistencies. Another is to introduce a single dedicated session store (like Redis), which creates a single point of failure and adds network latency every time a server fetches session data. Either way, the complexity grows considerably, and the system becomes harder to maintain.
What if we were able to decentralize the session data, so each server would have a mechanism to validate the authentication information, without the need of a central session store? This leads to the introduction of web tokens.
Concept
What’s the idea behind these tokens? Unlike session IDs, which are stored by the server along with the user data, the tokens themselves contain the user data (e.g. user ID, username, email, roles, etc.), are generated by the server, stored by each client, and passed within the requests.
Tokens are cryptographically signed, and every server in the system can verify them by using a shared key and then extract the user information (I’ll cover how this mechanism works later in the article).
Benefits
Let’s explore some general benefits this concept brings:
- Scaling is easier to achieve: each added server can validate the tokens independently.
- Seamless integration with RESTful APIs, due to their stateless nature. This also makes tokens a good fit for mobile applications.
- Tokens can be sent across different domains, which makes them a good fit for distributed systems.
I want to mention the main practical benefit in the context of the previous scenario with the e-commerce application: horizontal scaling that involves multiple servers or microservices can be done with minimal effort, because the newly added servers only need to know the shared key to validate the tokens.
Internals
On the surface, the web tokens look like a robust system with significant benefits. Ever wondered how they actually work? In the article about session-based authentication I talked about session IDs which utilize the browser cookies, but what are the web tokens more specifically, how are they passed with the request, and how do servers validate them?
Note: I’ve used the general term web token, but in the context of authentication the JSON Web Tokens (JWT) are the dominant and widely supported standard and I’ll refer exclusively to them throughout the entire article. More details about JWT can be found at https://jwt.io/introduction or https://datatracker.ietf.org/doc/html/rfc7519.
Like session IDs, JWTs are created by the server, stored on the client side, and sent with subsequent requests. The difference lies in their implementation, which I'll describe in the following sections.
The structure of a JWT
While session IDs are random characters, a JWT contains information about the user and there is no need on the server side for a storage mechanism of all generated tokens; each client stores their own JWT and servers will decode and extract the user information from it. This makes JWT systems stateless.
Basically, a JWT is a string composed of 3 Base64Url-encoded parts: Header (Metadata), Payload (Data), and Signature (Verification).
Example
Let's consider the JWT taken from https://jwt.io/:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
As you may notice, the three parts are separated by the dot (.) symbol.
By decoding the first two parts, we get JSON objects:
- Header:
{ "alg": "HS256", "typ": "JWT" }
- Payload:
{ "sub": "1234567890", "name": "John Doe", "iat": 1516239022 }
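The decoding step can be reproduced in a few lines of Node.js. This is only a sketch: decodeSegment is a hypothetical helper name, and it relies on the 'base64url' encoding that recent Node.js versions support in Buffer.

```javascript
// The sample token from jwt.io, split over three lines for readability.
const token =
  'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.' +
  'eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.' +
  'SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c';

// decodeSegment (hypothetical helper): Base64Url text -> parsed JSON object.
function decodeSegment(segment) {
  return JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));
}

// Only the first two segments are JSON; the third one is the raw signature.
const [headerPart, payloadPart] = token.split('.');
const header = decodeSegment(headerPart);
const payload = decodeSegment(payloadPart);

console.log(header);
console.log(payload);
```

Note that no key is involved at any point: decoding only reverses the encoding.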
Let's see why we need such a structure and which problems it addresses: data integrity, authentication of the issuer, statelessness and scalability, transmission.
Data Integrity
Problem: We need to ensure that the information in a token hasn't been altered by anyone else.
Solution: The signature part is a keyed cryptographic hash (for example, an HMAC) of the header and payload, created using a secret key. When a server that knows the secret key receives a token, it recomputes the hash from the header, the payload and the secret key, and expects the result to match the signature part of the token.
Authentication of the Issuer
Problem: How do we know that the token was indeed created by our authentication server?
Solution: A secret key, which is available to our servers only, is used to generate the signatures, so no one else would be able to generate valid signatures.
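To make the last two points more concrete, here is a minimal sketch that uses Node's built-in crypto module. The helper name signHs256 and the secret value are assumptions made for illustration. It shows that changing either the payload or the key produces a different signature.

```javascript
const crypto = require('node:crypto');

// signHs256 (hypothetical helper): HMAC-SHA256 over "header.payload", Base64Url-encoded.
function signHs256(signingInput, secret) {
  return crypto.createHmac('sha256', secret).update(signingInput).digest('base64url');
}

const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');

const secret = 'demo-secret-for-illustration-only'; // placeholder, not a real key
const header = encode({ alg: 'HS256', typ: 'JWT' });
const payload = encode({ sub: '1234567890', name: 'John Doe' });
const signature = signHs256(`${header}.${payload}`, secret);

// Data integrity: a modified payload no longer matches the original signature.
const forgedPayload = encode({ sub: '1234567890', name: 'Mallory' });
const signatureOverForged = signHs256(`${header}.${forgedPayload}`, secret);

// Issuer authentication: without the right key, the signature cannot be reproduced.
const signatureWrongKey = signHs256(`${header}.${payload}`, 'some-other-key');

console.log(signature === signatureOverForged); // false
console.log(signature === signatureWrongKey); // false
```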
Statelessness and Scalability
Problem: How do we handle scalability?
Solution: As mentioned above, the token itself contains user information in the payload part, and there is no need for servers to store any session data, because they can immediately extract the user information from the token. This way the servers can deal with a large number of users without using additional resources.
Transmission
Problem: How can we efficiently transmit tokens in HTTP headers or query parameters?
Solution: Using Base64Url encoding, the URL-safe variant of Base64 (it replaces + and / with - and _ and omits the = padding). More information about Base64 can be found at https://base64.guru/learn/what-is-base64.
Creation of tokens
As in the case of session IDs, the JWTs are created during the authentication process, after the server validates the user’s credentials. One important aspect I need to mention here, compared to the previous solution, is that the login endpoint could be located anywhere, no matter the domain.
Let's see what we need at minimum to create a valid JWT for authentication:
- Header
  - Algorithm (alg): the algorithm used to sign the token (e.g. HS256 or RS256)
  - Type (typ): usually set to "JWT"
- Payload
  - Subject (sub): used to identify the user, e.g. user ID or email
  - Expiration time (exp): the timestamp after which the token becomes invalid
  - Issued At Time (iat): the timestamp of when the token was issued
- Secret Key: a secret key known only to the server is used to sign the token. This is crucial for ensuring the token's authenticity and integrity.
Here is an example of a JWT generated in Node.js from such inputs:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VySWQiOjEyMywidXNlcm5hbWUiOiJqb2huZG9lIiwiZW1haWwiOiJqb2huLmRvZUBleGFtcGxlLmNvbSIsImlhdCI6MTYzMzMyOTU2MywiZXhwIjoxNjMzMzMzMTYzfQ.v6nGb_QkO_w1x_88r81176tXF699eQz5555555555555
If you go to https://jwt.io/ and decode this token, you'll see the payload with the user data, along with two other keys, iat and exp, which were automatically generated by the library. You may notice that I've added the username and the email to the payload, and kept it flat, not nested, which is the recommendation for such cases.
Note: The token can be decoded without the secret key. The key is used to verify the token, not to extract its content.
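As a rough illustration of what a JWT library does when it creates an HS256 token, the sketch below builds one with Node's built-in crypto module only. The createToken helper, the claim values and the secret are assumptions made for this example. In a real project a maintained library would normally be used instead.

```javascript
const crypto = require('node:crypto');

const base64url = (value) => Buffer.from(value).toString('base64url');

// createToken (hypothetical helper): adds iat/exp, then signs "header.payload" with HS256.
function createToken(claims, secret, expiresInSeconds) {
  const iat = Math.floor(Date.now() / 1000); // issued at, in seconds since the epoch
  const header = { alg: 'HS256', typ: 'JWT' };
  const payload = { ...claims, iat, exp: iat + expiresInSeconds };
  const signingInput =
    `${base64url(JSON.stringify(header))}.${base64url(JSON.stringify(payload))}`;
  const signature = crypto
    .createHmac('sha256', secret)
    .update(signingInput)
    .digest('base64url');
  return `${signingInput}.${signature}`;
}

// The secret is a placeholder; a real one would come from secure configuration.
const token = createToken(
  { sub: '123', username: 'johndoe', email: 'john.doe@example.com' },
  'demo-secret-for-illustration-only',
  3600 // valid for one hour
);
console.log(token);
```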
Transmission of the token between the server and the client
Now that the token is created after a successful authentication, the server needs to send it to the client. Later on, when the client needs to make an authenticated request, it will also send the token to that specific server.
There are two common options for sending the token from the server to the client: via cookies or in the response body. There is no one-size-fits-all solution here; the right method depends on the application's requirements and security considerations. Depending on the method, the client will then send the token with subsequent requests to servers in different ways, which we'll explore soon.
Transmitting the token as a cookie
This process is similar to the one described in the session-based authentication article:
Set-Cookie: access_token=<JWT_value>; HttpOnly; Secure; SameSite=Strict
Some benefits of this method include:
- automatic transmission of the cookie that contains the JWT in subsequent requests to the same domain;
- using the HttpOnly protection to mitigate Cross-Site Scripting (XSS) attacks.
If you have servers located on different domains, additional CORS configuration is needed on both the server and the client (that is, the browser).
On the server side, the following two response headers need to be defined:
- Access-Control-Allow-Origin: <our_application_domain> - this header tells the browser that the server accepts requests from our application domain (by default, browsers block cross-origin requests unless servers are configured properly). When credentials are involved, the value must be an explicit origin and not the * wildcard.
- Access-Control-Allow-Credentials: true - as another security measure, browsers don't include cookies in cross-origin requests made from scripts, and they hide the response of a credentialed request from the page unless the server opts in. When this header is set, the browser accepts requests that carry the cookies associated with that specific domain.
Here is a simple implementation of creating and sending the token as a cookie from the server to the client, considering our application makes cross-site requests:
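A minimal sketch of this step, using only Node's built-in http module, could look like the following. The origin, the cookie lifetime and the buildLoginHeaders helper are assumptions made for illustration, and credential validation and token creation are left out. Note one deliberate difference from the Set-Cookie line shown earlier: for requests that are truly cross-site, browsers only attach the cookie when it is marked SameSite=None together with Secure.

```javascript
const http = require('node:http');

// Hypothetical origin of the frontend application.
const APP_ORIGIN = 'https://shop.example.com';

// buildLoginHeaders (hypothetical helper): CORS headers plus the cookie carrying the JWT.
function buildLoginHeaders(token) {
  return {
    'Access-Control-Allow-Origin': APP_ORIGIN, // explicit origin, "*" is not allowed with credentials
    'Access-Control-Allow-Credentials': 'true',
    'Set-Cookie': `access_token=${token}; HttpOnly; Secure; SameSite=None; Path=/; Max-Age=3600`,
    'Content-Type': 'application/json',
  };
}

const server = http.createServer((req, res) => {
  if (req.method === 'POST' && req.url === '/login') {
    // Credential checks and token creation are omitted; this value is a placeholder.
    res.writeHead(200, buildLoginHeaders('signed.jwt.value'));
    res.end(JSON.stringify({ message: 'Logged in' }));
    return;
  }
  res.writeHead(404);
  res.end();
});

// server.listen(3000); // left commented out so the sketch has no side effects
```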
There is one additional setting to be done on the client side too: even if the server is configured to allow the browser to send cookies, we need to explicitly instruct our HTTP API (which could be the fetch method, the XMLHttpRequest object or HttpClient from Angular) to include the cookies when making cross-origin requests.
Here are some basic examples using each of these three APIs:
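As a sketch of the client-side setting, the function below uses fetch with credentials set to 'include'. The function name and the injectable fetchImpl parameter are assumptions that only keep the example easy to try outside a browser. The equivalent switches for the other two APIs are listed in the comments.

```javascript
// getProfile (hypothetical helper): a cross-origin GET that carries the cookies.
// fetchImpl is injectable for experimentation; browsers provide fetch globally.
function getProfile(url, fetchImpl = fetch) {
  return fetchImpl(url, {
    method: 'GET',
    credentials: 'include', // attach cookies even on cross-origin requests
  });
}

// The equivalent switches in the other two APIs:
//   XMLHttpRequest:      xhr.withCredentials = true;
//   Angular HttpClient:  http.get(url, { withCredentials: true });
```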
Now let's see how a server could define a protected endpoint and read the token from the cookies:
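One possible shape for such an endpoint, without any framework, is sketched below. The readCookie helper and the access_token cookie name are assumptions, and verifying the token itself is deliberately left out at this point.

```javascript
// readCookie (hypothetical helper): finds one cookie in the raw Cookie request header.
function readCookie(cookieHeader, name) {
  for (const pair of (cookieHeader || '').split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue;
    if (pair.slice(0, index).trim() === name) {
      return decodeURIComponent(pair.slice(index + 1).trim());
    }
  }
  return null;
}

// Request handler in the style of Node's http module.
function handleProtected(req, res) {
  const token = readCookie(req.headers.cookie, 'access_token');
  if (!token) {
    res.writeHead(401, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ message: 'Authentication required' }));
    return;
  }
  // Signature and claim checks would happen here before trusting the token.
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ message: 'Token received' }));
}
```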
Sending the token in the response body
This method is simple to implement, more flexible, independent of server domains, and suitable for SPAs or mobile apps. On the other hand, the client needs to explicitly extract and store the token. Popular applications like Twitter, Facebook, Netflix, Spotify and many others use this approach.
In this section we'll explore:
- how the token is generated and sent to the client
- how the client can store the token
- methods of sending the token in subsequent requests
A simple implementation of how the token is generated and sent to the client could look like this:
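A minimal, framework-independent sketch of that login response is shown below. The buildLoginResponse helper, the field names (accessToken, tokenType, expiresIn) and the issueToken callback are assumptions made for illustration, not a fixed contract.

```javascript
// buildLoginResponse (hypothetical helper): decides the status code and JSON body.
// issueToken stands in for the real JWT creation logic.
function buildLoginResponse(credentialsAreValid, issueToken) {
  if (!credentialsAreValid) {
    return { status: 401, body: { message: 'Invalid credentials' } };
  }
  return {
    status: 200,
    body: { accessToken: issueToken(), tokenType: 'Bearer', expiresIn: 3600 },
  };
}

// Sending it with Node's http module could then look like this:
//   const { status, body } = buildLoginResponse(isValid, () => createJwtSomehow());
//   res.writeHead(status, { 'Content-Type': 'application/json' });
//   res.end(JSON.stringify(body));
```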
Storing the token
Once the server responds with the token, it is the client's responsibility to store it. The fundamental principle of token-based (or session-based) authentication is that the token should be kept secret and not shared with anyone. If an attacker steals the token, the server cannot tell that the request wasn't made by the intended client, and this leads to unauthorized access to sensitive data.
With this in mind, let's explore some options and their trade-offs:
- Using JavaScript variables
  - Pros:
    - it's easy and straightforward to implement
  - Cons:
    - could be vulnerable to XSS attacks (malicious scripts could access the variable), even though Angular provides a significant degree of protection against such attacks
    - the token is lost when the page is refreshed or closed
- localStorage
  - Pros:
    - the token persists after the page is reloaded or closed
    - it can be easily accessed
  - Cons:
    - could be vulnerable to XSS attacks (malicious scripts could read the storage), even though Angular provides a significant degree of protection against such attacks - basically it's the same concern as with normal in-memory variables
- sessionStorage
  - the same as localStorage, except that the token is lost when the tab is closed (only when closed, not when reloaded)
Here is a simple implementation of an Angular login component, that stores the JWT in the localStorage:
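The storage part of such a component can be sketched independently of Angular, in plain JavaScript. The key name and the helper functions below are assumptions. The storage argument is any object that follows the Web Storage API, so localStorage and sessionStorage are interchangeable here.

```javascript
const TOKEN_KEY = 'access_token'; // hypothetical storage key

// Each helper takes the storage object explicitly (localStorage or sessionStorage).
function saveToken(storage, token) {
  storage.setItem(TOKEN_KEY, token);
}

function loadToken(storage) {
  return storage.getItem(TOKEN_KEY);
}

function clearToken(storage) {
  storage.removeItem(TOKEN_KEY);
}

// In the browser, after a successful login request (field name assumed):
//   const { accessToken } = await response.json();
//   saveToken(window.localStorage, accessToken);
// and on logout:
//   clearToken(window.localStorage);
```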
Sending the token to servers
So far we have the JWT stored on the client side, ready to be used for authenticated requests. But how can we send it in requests? While there are various options, the RFC 6750 standard specifies that the token should be sent in the request header, like this: Authorization: Bearer <token>. This is the most widely used and recommended method: it’s well supported by libraries and frameworks, and it works across different domains without relying on cookies. For cross-origin browser requests, the server's CORS configuration still has to allow the Authorization header.
A simple implementation of this method is illustrated below:
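In its simplest form, this comes down to adding one header to the request options. The sketch below assumes a fetch-style API. The helper name and the URL in the usage comment are made up for illustration.

```javascript
// authorizedRequestOptions (hypothetical helper): copies the given options
// and adds the Authorization header in the "Bearer <token>" format.
function authorizedRequestOptions(token, options = {}) {
  return {
    ...options,
    headers: { ...(options.headers || {}), Authorization: `Bearer ${token}` },
  };
}

// Usage in the browser (URL is a placeholder):
//   fetch('https://api.example.com/orders', authorizedRequestOptions(token));
```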
In a real application a more elegant solution is used, that is, to define an HTTP interceptor that automatically adds the authorization header, but only for the requests that need it:
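Angular has its own interceptor mechanism for this. The same idea can be sketched in plain JavaScript as a wrapper around fetch. Everything named here (createAuthFetch, getToken, apiBaseUrl, fetchImpl) is an assumption made for the example. The header is added only when the URL belongs to the API and a token is available.

```javascript
// createAuthFetch (hypothetical helper): returns a fetch-like function that
// attaches the Authorization header only to requests that need it.
function createAuthFetch({ getToken, apiBaseUrl, fetchImpl = fetch }) {
  return (url, options = {}) => {
    const token = getToken();
    const needsAuth = String(url).startsWith(apiBaseUrl) && Boolean(token);
    if (!needsAuth) {
      return fetchImpl(url, options); // e.g. public assets or third-party URLs
    }
    const headers = { ...(options.headers || {}), Authorization: `Bearer ${token}` };
    return fetchImpl(url, { ...options, headers });
  };
}

// Usage in the browser (names and URL are placeholders):
//   const authFetch = createAuthFetch({
//     getToken: () => localStorage.getItem('access_token'),
//     apiBaseUrl: 'https://api.example.com/',
//   });
//   authFetch('https://api.example.com/orders');
```

Keeping the check on the URL prefix avoids leaking the token to servers that have no use for it.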
Token extraction and validation
Now that I’ve discussed the token transmission to servers, let’s see how the servers handle the received JWT. There are several steps involved in this process:
- token extraction
- signature verification
- payload decoding
- expiration check
- authorization
- response
Token extraction
The server first extracts the JWT from the incoming request header (e.g. Authorization header) or from the cookie, depending on the initial implementation.
Here are some basic implementations:
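// Option 1: the token arrives in the Authorization header (assumes an Express app)
app.get('/protected', (req, res) => {
  // Get the Authorization header value (it may be missing)
  const authHeader = req.headers.authorization || '';
  // Split the header to get the token (format: "Bearer <token>")
  const [scheme, token] = authHeader.split(' ');
  if (scheme !== 'Bearer' || !token) {
    return res.status(401).json({ message: 'Missing or malformed token' });
  }
  //... the rest of the logic
});

// Option 2 (alternative to option 1): the token arrives in a cookie.
// Assumes the cookie-parser middleware is registered and the cookie name is 'token'.
app.get('/protected', (req, res) => {
  const token = req.cookies.token;
  if (!token) {
    return res.status(401).json({ message: 'Missing token' });
  }
  //... the rest of the logic
});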
Signature verification
Then the server needs to verify the signature of the token. But why is this step necessary? In the section about the JWT structure I briefly mentioned the main roles of the signature:
- it ensures authenticity, proving that the JWT was indeed issued by our servers and not forged by an attacker.
- the signature is generated using a secret key that only our servers know; without the signature verification, an attacker could create fake JWTs.
- it is a cryptographic hash of the token's header and payload and any modification to either the header or payload would result in a different signature.
- JWTs are often used in distributed systems where multiple services rely on the token to authenticate and authorize users; by verifying the signature, each service can independently trust that the token is authentic and hasn't been tampered with, even if it was issued by a different service.
Even though anyone can decode the token and read its payload, the signature verification process ensures its authenticity and integrity. This is because the header and payload parts of the token are simply encoded using Base64Url, not encrypted. Both encoding and encryption transform data from one form to another, and both can be reversed to recover the original data. The difference is that reversing encryption requires a secret key, so the data is unreadable without it, while decoding uses a publicly known algorithm and needs no key at all.
Let's see what steps are involved in the signature verification process:
- Extract Header and Payload
  - The JWT is a string with three parts separated by dots: header.payload.signature.
  - The verifier splits the string at the dots to separate the header, payload and signature segments.
- Base64Url Decode
  - Both the header and payload are Base64Url decoded. This converts them from their URL-safe representation back into their original JSON format.
  - The resulting JSON objects contain information about the algorithm (alg) and token type (typ) in the header, and the claims in the payload.
- Create Signing Input
  - The Base64Url encoded header and payload strings are concatenated with a dot (.) in between.
  - This concatenated string is the input that was used to create the signature during token generation.
- Signature Verification (Algorithm-Specific)
  - HMAC (Symmetric):
    - The verifier takes the signing input string and applies the HMAC algorithm specified in the header (alg) - more details about it at https://www.okta.com/identity-101/hmac/.
    - The shared secret key, which is stored on our servers, is used as the key of the HMAC function, and the signing input string is used as its message.
    - The output of the HMAC function is the calculated signature.
    - The verifier compares the calculated signature to the signature extracted from the JWT. If they match, the token is valid.
  - RSA (Asymmetric):
    - The verifier retrieves the public key corresponding to the private key used by the issuer to sign the token (read the articles in the links below to find more information about public/private keys and the RSA algorithm).
    - The verifier applies the public key to the signature extracted from the JWT (conceptually, it "decrypts" the signature).
    - This process recovers the original hash value that was created by the issuer when signing the token.
    - The verifier calculates the hash of the signing input string using the same algorithm (alg) as the issuer.
    - The calculated hash is compared to the recovered hash. If they match, the token is valid.
A more detailed explanation of the verification process can be found in the articles "Understanding JWT Validation: A Practical Guide with Code Examples" and "Validating RSA signature for a JWS".
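The steps above can be sketched with Node's built-in crypto module. This is an illustration under stated assumptions, not a replacement for a JWT library. The verifySignature helper is hypothetical, it covers only HS256 and RS256, and the expected algorithm is passed in by the server rather than trusted from the token header.

```javascript
const crypto = require('node:crypto');

// verifySignature (hypothetical helper): true only if the signature matches.
// expectedAlg comes from server configuration, never from the token itself.
function verifySignature(token, expectedAlg, key) {
  const parts = token.split('.');
  if (parts.length !== 3) return false;
  const [headerPart, payloadPart, signaturePart] = parts;

  let header;
  try {
    header = JSON.parse(Buffer.from(headerPart, 'base64url').toString('utf8'));
  } catch {
    return false; // the header is not valid JSON
  }
  if (!header || header.alg !== expectedAlg) return false;

  // The signing input is the still-encoded "header.payload" string.
  const signingInput = Buffer.from(`${headerPart}.${payloadPart}`);
  const signature = Buffer.from(signaturePart, 'base64url');

  if (expectedAlg === 'HS256') {
    const expected = crypto.createHmac('sha256', key).update(signingInput).digest();
    // timingSafeEqual requires equal lengths and compares in constant time.
    return expected.length === signature.length &&
      crypto.timingSafeEqual(expected, signature);
  }
  if (expectedAlg === 'RS256') {
    // key is the issuer's public key (PEM string or KeyObject).
    return crypto.verify('sha256', signingInput, key, signature);
  }
  return false; // algorithm not supported by this sketch
}

// Usage with an HS256 token built on the spot (the secret is a placeholder).
const secret = 'demo-secret-for-illustration-only';
const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const input = `${encode({ alg: 'HS256', typ: 'JWT' })}.${encode({ sub: '123' })}`;
const hsToken =
  `${input}.${crypto.createHmac('sha256', secret).update(input).digest('base64url')}`;

console.log(verifySignature(hsToken, 'HS256', secret)); // true
console.log(verifySignature(hsToken, 'HS256', 'wrong-secret')); // false
```

Passing the expected algorithm explicitly is a design choice: it keeps a token from selecting its own verification method.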
Protecting the keys is crucial. With HMAC, the same secret key both signs and verifies tokens, so every service that can verify tokens could also forge them if the key leaks. With RSA, only the private key can sign, while verifiers need just the public key. If the signing key is compromised, the entire system is at risk. The choice between HMAC and RSA depends on our specific security requirements: HMAC is simpler but requires sharing the secret key, while RSA is more complex but allows for better key management. Most JWT libraries handle the cryptographic details of signature verification, so we don't need to implement the algorithms ourselves.
Additional checks can be performed:
- Expiration (exp): The verifier checks the exp claim (expiration time) in the payload to ensure the token hasn't expired.
- Not Before (nbf): The verifier might check the nbf claim (not before time) to ensure the token is not being used before its intended start time.
- Issuer (iss) and Audience (aud): The verifier might check the iss (issuer) and aud (audience) claims to ensure the token was issued by a trusted party and is intended for the current recipient.
- Other Claims: Depending on your application's requirements, you might perform additional checks on other claims in the payload.
If all the verification steps pass, the JWT is considered valid. Otherwise, the token is rejected as invalid or compromised.
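These additional checks could be sketched as a small function that runs after the signature has been verified. The validateClaims helper, its option names and its error strings are assumptions made for illustration. All timestamps are in seconds, as in the exp, nbf and iat claims.

```javascript
// validateClaims (hypothetical helper): returns null when all checks pass,
// or a short reason string when one fails. Assumes a verified signature.
function validateClaims(payload, options = {}) {
  const { issuer, audience, now = Math.floor(Date.now() / 1000) } = options;

  // exp: the current time must be before the expiration time.
  if (typeof payload.exp === 'number' && now >= payload.exp) return 'token expired';
  // nbf: the token must not be accepted before this time.
  if (typeof payload.nbf === 'number' && now < payload.nbf) return 'token not yet valid';
  // iss: compared only when the caller states which issuer it trusts.
  if (issuer && payload.iss !== issuer) return 'unexpected issuer';
  // aud: may be a single string or an array of strings.
  if (audience) {
    const audiences = Array.isArray(payload.aud) ? payload.aud : [payload.aud];
    if (!audiences.includes(audience)) return 'unexpected audience';
  }
  return null;
}

// Example call with made-up values:
//   validateClaims(payload, { issuer: 'https://auth.example.com', audience: 'orders-api' });
```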
In practice we don't need to implement the verification algorithm manually; we simply use dedicated libraries. In Node.JS, the verification could look like this:
const jwt = require('jsonwebtoken');
// Throws if the signature is invalid or the token has expired; the expected algorithm is pinned
const decodedToken = jwt.verify(token, secretKey, { algorithms: ['HS256'] });
Payload decoding and response
Once the signature is confirmed valid, the Base64Url-encoded payload is decoded into its original JSON representation. This JSON object contains the claims (data) that were included in the token by the issuer. The claims typically include information about the user (e.g., user ID, username, email) and other relevant details (e.g., expiration time, issued at time, permissions). The server can use the claims in the token (e.g., user ID, permissions) to determine whether the user has access to the requested resource. If so, the server will process the request and will send the appropriate response; otherwise, it can respond with 401 or 403 status codes.
401 Unauthorized response
This status code is appropriate when the user's authentication credentials (the JWT in our case) are missing, invalid, or expired:
- The Authorization header is missing or empty.
- The JWT has an invalid signature (e.g., due to tampering or an incorrect secret key).
- The JWT has expired (the exp claim is in the past).
403 Forbidden response
This status code is appropriate when the user is authenticated (the JWT is valid) but does not have the necessary permissions or authorization to access the requested resource. It indicates that the server understood the request but refuses to fulfill it due to insufficient privileges.
Basic implementation
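A compact sketch that ties extraction, verification and the two status codes together is shown below. The authorize helper, the roles claim and the verify callback are assumptions: verify stands for whatever verification function is in use and is expected to return the payload or throw.

```javascript
// authorize (hypothetical helper): maps the outcome of the checks to a status code.
function authorize(authorizationHeader, verify, requiredRole) {
  // 401: no usable credentials were sent at all.
  const [scheme, token] = (authorizationHeader || '').split(' ');
  if (scheme !== 'Bearer' || !token) {
    return { status: 401, message: 'Missing or malformed token' };
  }

  // 401: the credentials were sent but are invalid or expired.
  let payload;
  try {
    payload = verify(token);
  } catch (err) {
    return { status: 401, message: 'Invalid or expired token' };
  }

  // 403: the user is authenticated but lacks the required permission.
  const roles = Array.isArray(payload.roles) ? payload.roles : [];
  if (requiredRole && !roles.includes(requiredRole)) {
    return { status: 403, message: 'Insufficient permissions' };
  }

  return { status: 200, user: payload };
}
```

A route handler would call it with the incoming Authorization header and send the resulting status and message back to the client.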
Recap
The topics I've covered in this article provide a basic foundation for authentication systems that use JWT:
- what the benefits of such a method are compared to session IDs
- what the structure of a JWT is
- what is required to create a token
- how to send it to the client and how the client can store it
- how to pass the token within requests
- how the server verifies the token and sends the response
What's next?
As I prefer to explain and build things gradually, in the next article I'll cover more advanced topics about JWTs that are required by a robust authentication system, so it can be integrated into a large scale application. Such topics will include:
- token revocation to prevent unauthorized access if a token is compromised
- rate limiting to prevent brute-force attacks where an attacker tries to guess valid tokens
- refresh tokens to issue new access tokens when they expire, improving user experience and security
- encrypting the token if the payload contains sensitive information
- frameworks, protocols, and systems that use JWT: Single-Sign-On, OAuth, Auth0.