Browsers (also called web browsers or Internet browsers) are software applications installed on our devices that allow us to access the World Wide Web. You are actually using one while reading this text.
There are many browsers in use today and, as of 2021, the most used ones were Google Chrome, Apple's Safari, Microsoft Edge and Firefox.
But how do they actually work and what happens from the moment we type a web address into the address bar until the page we are trying to access gets displayed on our screen?
An oversimplified version of this would be that:
when we request a web page from a particular website, the browser retrieves the necessary content from a web server and then displays the page on our device.
Pretty straightforward, right? Yes, but there's more involved in this seemingly super simple process. In this series we are going to talk about the navigation, fetching the data, parsing and rendering steps and hope to make these concepts clearer to you.
1. NAVIGATION
Navigation is the first step in loading a web page. It refers to the process that happens when the user requests a web page, whether by clicking on a link, writing a web address in the browser's address bar, submitting a form, etc.
DNS Lookup (resolving the web address)
The first step in navigating to a web page is finding where the assets for that page are located (HTML, CSS, JavaScript and other kinds of files). If we navigate to https://example.com, the HTML page is located on the server with the IP address 93.184.216.34 (for us, websites are domain names, but for computers they are IP addresses). If we've never visited this site before, a Domain Name System (DNS) lookup must happen.
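Before any lookup can happen, the browser has to pick the hostname out of the address we typed. Here is a minimal sketch of that idea using Python's standard library; the function name `host_to_resolve` is made up for illustration and is not how any particular browser does it.

```python
from urllib.parse import urlsplit


def host_to_resolve(url: str) -> str:
    """Return the hostname part of a URL, which is the name a DNS lookup needs."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no hostname found in {url!r}")
    # urlsplit lowercases the hostname and drops the scheme, port and path.
    return parts.hostname


print(host_to_resolve("https://example.com/index.html"))
```

Only the hostname is sent to DNS; the scheme (`https`) and the path are used later, once a connection to the server exists.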
DNS servers are computer servers that contain a database of public IP addresses and their associated hostnames (this is commonly compared to a phonebook, in which people's names are associated with particular phone numbers). In most cases these servers resolve, or translate, those names to IP addresses on request (at the top of the hierarchy sit 13 named DNS root servers, each operated as many instances distributed across the world).
So when we request a DNS lookup, what we actually do is query one of these servers and ask which IP address corresponds to the example.com name (the lookup uses only the hostname, not the full https://example.com URL). If a corresponding IP is found, it is returned. If something goes wrong and the lookup is not successful, we'll see some kind of error message in the browser.
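To make the lookup a bit more concrete, here is a small sketch that asks the operating system's resolver for the addresses behind a name. It uses `socket.getaddrinfo` from the standard library; the `resolve` helper and the way it reports failures are our own choices for illustration, not what a browser does internally.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Ask the operating system's resolver for the IP addresses of a hostname."""
    try:
        results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror as error:
        # The lookup failed (unknown name, no network, ...).
        # A browser would show an error page at this point.
        raise LookupError(f"could not resolve {hostname!r}: {error}") from error

    # Each result is (family, type, proto, canonname, sockaddr);
    # the first item of sockaddr is the IP address as a string.
    addresses: list[str] = []
    for *_, sockaddr in results:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

On a machine with network access, `resolve("example.com")` should return one or more IPv4 or IPv6 addresses. The exact values depend on the resolver and may change over time.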
After this initial lookup, the IP address will probably be cached for a while, so subsequent visits to the same website will be faster since there's no need for a new DNS lookup (remember, a full DNS lookup is only needed when there is no valid cached entry, typically the first time we visit a website or after the cached entry has expired).
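The caching idea can be sketched in a few lines. This is a toy model and not a real browser or OS cache: the `DnsCache` class, its `lookup` callback and the fixed `ttl_seconds` are all assumptions made for illustration (real caches take the lifetime from the DNS record itself).

```python
import time
from typing import Callable


class DnsCache:
    """Tiny in-memory cache that remembers lookups for a limited time."""

    def __init__(
        self,
        lookup: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup      # function that performs the real lookup
        self._ttl = ttl_seconds    # how long an answer stays valid
        self._clock = clock        # injectable so the cache is easy to test
        self._entries: dict[str, tuple[str, float]] = {}
        self.lookups_performed = 0

    def resolve(self, hostname: str) -> str:
        now = self._clock()
        cached = self._entries.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]       # still fresh: no lookup needed
        ip = self._lookup(hostname)  # first visit, or the entry expired
        self.lookups_performed += 1
        self._entries[hostname] = (ip, now + self._ttl)
        return ip
```

Asking for the same hostname twice in a row triggers only one real lookup; once the entry is older than `ttl_seconds`, the next request goes back to the resolver.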
TCP (Transmission Control Protocol) Handshake
Once the web browser knows the IP address of the website, it will try to set up a connection to the server holding the resources via a TCP three-way handshake (also called SYN-SYN-ACK or, more accurately, SYN, SYN-ACK, ACK, because three messages are transmitted by TCP to negotiate and start a TCP session between two computers).
TCP stands for Transmission Control Protocol, a communications standard that enables application programs and computing devices to exchange messages over a network. It is designed to send packets (of data) across the Internet and ensure the successful delivery of data and messages over networks.
The TCP Handshake is a mechanism designed so that two entities (in our case the browser and the server) that want to pass information back and forth to each other can negotiate the parameters of the connection before transmitting data.
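From a program's point of view the handshake is invisible: we ask the operating system to connect and it exchanges the messages for us. The sketch below opens a TCP connection to a listener on the same machine and sends one message through it; the function name `loopback_round_trip` is invented for this example.

```python
import socket


def loopback_round_trip(message: bytes) -> bytes:
    """Open a TCP connection to a local listener and echo one message through it."""
    # Listening socket on the loopback interface; port 0 lets the OS pick a free port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        address = listener.getsockname()

        # create_connection() returns once the OS has completed the handshake.
        with socket.create_connection(address, timeout=5) as client:
            server_side, _ = listener.accept()
            with server_side:
                client.sendall(message)
                received = server_side.recv(1024)
                server_side.sendall(received)  # echo the bytes back
                return client.recv(1024)


print(loopback_round_trip(b"hello"))
```

Everything stays on the local machine, so the example runs without Internet access. A browser does the same thing against the server's IP address, usually on port 443 for HTTPS.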
So, if the browser and the server were two people, the conversation between them would go something like this:
- The browser sends a SYN message to the server, asking for SYNchronization (that is, asking to open a connection).
- The server replies with a SYN-ACK message (SYNchronization and ACKnowledgement).
- In the last step, the browser replies with an ACK message.
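The three messages can be modelled as plain data. The sketch below is a simulation only and sends nothing over the network. It adds one detail the text above leaves out: each side picks an initial sequence number, and the other side acknowledges it by replying with that number plus one. The `Segment` class and the `three_way_handshake` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around


@dataclass(frozen=True)
class Segment:
    sender: str
    kind: str                  # "SYN", "SYN-ACK" or "ACK"
    seq: int                   # sequence number of this segment
    ack: Optional[int] = None  # next sequence number expected from the peer


def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the three segments of the handshake, in the order they are sent."""
    syn = Segment("browser", "SYN", seq=client_isn % SEQ_SPACE)
    syn_ack = Segment(
        "server", "SYN-ACK",
        seq=server_isn % SEQ_SPACE,
        ack=(syn.seq + 1) % SEQ_SPACE,      # acknowledges the browser's SYN
    )
    ack = Segment(
        "browser", "ACK",
        seq=(syn.seq + 1) % SEQ_SPACE,
        ack=(syn_ack.seq + 1) % SEQ_SPACE,  # acknowledges the server's SYN
    )
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

In a real connection the initial sequence numbers are chosen unpredictably by each side; the fixed values here just keep the output easy to follow.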
Now that the TCP connection (a two-way connection) has been established through the three-way handshake, the TLS negotiation can begin.
TLS negotiation
For secure connections established over HTTPS, another handshake is needed. This handshake (the TLS negotiation) determines which cipher will be used to encrypt the communication, verifies the server's identity and establishes that a secure connection is in place before the actual transfer of data begins.
Transport Layer Security (TLS), the successor of the now-deprecated Secure Sockets Layer (SSL), is a cryptographic protocol designed to provide communications security over a computer network. The protocol is widely used in applications such as email and instant messaging but its use in securing HTTPS remains the most publicly visible. Since applications can communicate either with or without TLS (or SSL), it is necessary for the client (browser) to request that the server sets up a TLS connection.
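Here is a rough sketch of how a client can request a TLS connection with Python's `ssl` module. The helper names `make_client_context` and `negotiated_parameters` are our own, and requiring at least TLS 1.2 is an assumption made for this example, not something the protocol mandates.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS configuration with certificate checks enabled."""
    # The default context verifies the server certificate and its hostname.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_parameters(hostname: str, port: int = 443) -> tuple:
    """Connect, run the TLS handshake and return (protocol version, cipher name)."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=5) as tcp:
        # wrap_socket() performs the TLS handshake on top of the TCP connection.
        with context.wrap_socket(tcp, server_hostname=hostname) as tls:
            return tls.version(), tls.cipher()[0]
```

With network access, calling `negotiated_parameters("example.com")` returns the TLS version and cipher suite that the two sides agreed on; the values depend on the server and on the local OpenSSL build.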
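During this step, some more messages are exchanged between the browser and the server. The steps below follow the RSA key exchange used in TLS 1.2 and earlier; TLS 1.3 uses a shorter handshake.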
- Client says hello. The browser sends the server a message that includes which TLS versions and cipher suites it supports and a string of random bytes known as the client random.
- Server hello message and certificate. The server sends a message back containing the server's SSL certificate, the server's chosen cipher suite and the server random, another random string of bytes that's generated by the server.
- Authentication. The browser verifies the server's SSL certificate against the certificate authority that issued it. This way the browser can be sure that the server is who it says it is.
- The premaster secret. The browser sends one more random string of bytes called the premaster secret, which is encrypted with a public key that the browser takes from the server's SSL certificate. The premaster secret can only be decrypted by the server, using its private key.
- Private key used. The server decrypts the premaster secret.
- Session keys created. The browser and server generate session keys from the client random, the server random and the premaster secret.
- Client finished. The browser sends a message to the server saying it has finished.
- Server finished. The server sends a message to the browser saying it has also finished.
- Secure symmetric encryption achieved. The handshake is completed and communication can continue using the session keys.
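The "session keys created" step is worth a closer look: both sides hold the same three values, so both can compute the same secret without ever sending it. The sketch below is modelled on the HMAC-based expansion that TLS 1.2 uses for this. It is a simplified illustration and must not be used as real cryptography: it skips the RSA encryption of the premaster secret and stops at the shared master secret instead of expanding it into separate encryption keys. The function names are ours.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Stretch `secret` and `seed` into `length` bytes using repeated HMAC-SHA256."""
    output = b""
    a = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def derive_master_secret(premaster: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Combine the three handshake values into a 48-byte shared secret."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(premaster, seed, 48)


# Fixed example values so the sketch is repeatable; real values are random.
client_random = bytes(range(32))
server_random = bytes(range(32, 64))
premaster = b"\x03\x03" + bytes(46)

browser_secret = derive_master_secret(premaster, client_random, server_random)
server_secret = derive_master_secret(premaster, client_random, server_random)
print(browser_secret == server_secret)
```

Because the browser and the server feed in identical inputs, they end up with identical secrets, and an eavesdropper who never saw the premaster secret cannot repeat the calculation.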
Now requesting and receiving data from the server can begin.