I hope you read "What browsers are doing when you make a request for a resource," which I wrote the other day.
Today, we are going to continue this series with more about browsers and how they know how to present websites.
How do browsers know what a web page should look like?
The browser knows what to display through its many components and engines, which interpret the resources it receives and render them for the user according to the HTML and CSS specifications.
Who regulates these specifications?
Browser developers have a lot of freedom to make a browser look, act, and display things however they want.
There is no "hard and fast set of rules" anywhere that dictates that a URL bar must be at the top of the screen or that an HTTP request must use port 80.
There are no rules that a browser must communicate with specific protocols in a specific way.
There is no single overseeing group that enforces the specifications on the web.
In fact, you don't need to use a browser to access the Internet at all. You could use the command-line interface to make the requests for information yourself and use another program to display the information however you like.
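To make that concrete, here is a minimal Python sketch of asking for a page without a browser. It assumes a plain (unencrypted) HTTP server. The host `example.com`, the helper names `build_get_request` and `fetch`, and the choice of HTTP/1.1 with `Connection: close` are all illustrative assumptions for this post.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",       # HTTP/1.1 requires a Host header
        "Connection: close",   # ask the server to close the connection when done
        "",                    # the blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    """Send the request over a TCP socket and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:       # an empty read means the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("example.com")` would hand back the status line, headers, and HTML as raw bytes, which you could then pass to any program you like. Note that `port` is just a parameter here: 80 is the default, not a requirement.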
It's all "best practices" and following what other companies are doing, because it's easier to communicate when everyone is speaking the same language.
That being said, there are "best practice" specifications maintained by the W3C (World Wide Web Consortium), which maintains standards for the web as they relate to HTML, CSS, and the DOM (Document Object Model). There are also "official" standards for how browsers communicate over networks, published as RFCs (Requests for Comments) by the IETF (Internet Engineering Task Force). The IETF is the principal technical development and standards-setting body for the Internet, and it publishes a standard when it believes the specified protocol or service provides significant benefit to the Internet community. (These specifications are very dry and not always easy to read, but they are full of good information.)
There are a few other authorities for specific languages, Unicode, and the "living standard" of the web from the WHATWG (Web Hypertext Application Technology Working Group), but the IETF (through its RFCs) and the W3C are the two that usually come up when people talk about the web.
So, all in all, even though there are specifications and standards for the Internet, they use phrases like "Typically, user agents" when describing what a browser usually does, which clearly leaves room for interpretation.
How are browsers built?
Because of this lack of regulation, a loose guide of "best practices" developed over the years as developers tried to keep experiences and accessibility roughly the same across all devices for each user. (Basically, they were all copying each other as they fought, and still fight, to be the #1 browser.) This also led to the Browser Wars, as each browser tried to be better than the others. Check out this amazing graphic on Wikipedia about the browsers that have been developed and what stage they are currently in. (6)
That being said, each browser adds its own flavor, but they all share a lot of the same basic components:
- User Interface
- Browser Engine
- Rendering Engine
- Networking
- UI Backend
- JavaScript Engine
- Data Storage
User Interface (UI):
The browser's User Interface is what we, as users, interact with, and though there are a lot of browsers on the market, they generally have a lot in common with each other.
Most browsers have:
- An address bar
- A menu bar
- Back/forward buttons
- Bookmarking options
- A status bar
- Refresh/stop buttons for refreshing or stopping the loading of current resources
- A home button that takes you to your chosen home page
- A display window where the web page is displayed
Browser Engine:
The Browser Engine works as a bridge between the UI and the next layer, the Rendering Engine.
The Browser Engine takes inputs from the UI and uses them to query and load HTML documents and other resources. It then drives the Rendering Engine to turn them into an interactive representation of the page (built on the DOM) in the browser window.
Rendering Engine:
The Rendering Engine is responsible for interpreting the HTML and CSS and displaying the generated content on the browser window.
However, since browsers can be built on different rendering engines (for example, Blink in Chrome, Gecko in Firefox, and WebKit in Safari), the way they render a website or web page can differ. That’s why you might sometimes run into incompatibility issues in some browsers.
Networking:
The Networking component handles all network calls, such as HTTP or FTP requests, and implements a cache of retrieved documents to minimize overall network traffic. It exposes a platform-independent interface, with a specific implementation underneath for each platform.
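The caching idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name `CachingFetcher` and the stand-in `fake_fetch` function are made up for this post, and a real browser cache also respects expiry and HTTP cache headers, which this sketch ignores.

```python
from typing import Callable, Dict


class CachingFetcher:
    """Wrap a fetch function with an in-memory cache keyed by URL."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}
        self.network_calls = 0  # counts real fetches, for illustration only

    def get(self, url: str) -> bytes:
        if url in self._cache:
            return self._cache[url]   # cache hit: no network traffic
        self.network_calls += 1
        body = self._fetch(url)       # cache miss: go to the network
        self._cache[url] = body
        return body


def fake_fetch(url: str) -> bytes:
    """Hypothetical stand-in for a real network call."""
    return f"<html>{url}</html>".encode("utf-8")


fetcher = CachingFetcher(fake_fetch)
fetcher.get("http://example.com/")
fetcher.get("http://example.com/")    # second request is served from the cache
print(fetcher.network_calls)          # prints 1
```

Passing the fetch function in as a parameter loosely mirrors the "generic interface, platform-specific implementation" split: the cache does not care how the bytes are actually retrieved.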
UI Backend:
The UI Backend is used for drawing basic widgets, such as select boxes, input boxes, check boxes, and windows.
This backend exposes a generic interface that is not platform specific, and it uses the operating system's user interface methods underneath.
JavaScript Engine:
The JavaScript Engine parses and executes JavaScript code and hands the result to the Rendering Engine.
There are a few options for this, and each of the four main browsers has had its own: Firefox uses SpiderMonkey, Chrome uses V8, Safari uses JavaScriptCore (also marketed as Nitro), and Internet Explorer and the original Edge used Chakra (the current Chromium-based Edge uses V8).
Data Storage:
Data Storage is the layer where the browser stores data locally, such as the cache, cookies, bookmarks, and preferences, using its supported storage mechanisms (essentially small databases) such as localStorage, IndexedDB, WebSQL, and FileSystem.
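As a rough picture of what a "small database" behind something like localStorage could look like, here is a Python sketch using the built-in `sqlite3` module. The class `OriginStorage`, its method names, and the table layout are hypothetical choices for this post, not how any particular browser implements its storage. The one real-world detail it borrows is that localStorage data is kept separate per origin.

```python
import sqlite3
from typing import Optional


class OriginStorage:
    """String key/value storage kept separate per origin (a simplification)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "origin TEXT, key TEXT, value TEXT, PRIMARY KEY (origin, key))"
        )

    def set_item(self, origin: str, key: str, value: str) -> None:
        # INSERT OR REPLACE overwrites any existing value for (origin, key)
        self._db.execute(
            "INSERT OR REPLACE INTO items VALUES (?, ?, ?)", (origin, key, value)
        )
        self._db.commit()

    def get_item(self, origin: str, key: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT value FROM items WHERE origin = ? AND key = ?", (origin, key)
        ).fetchone()
        return row[0] if row else None


store = OriginStorage()
store.set_item("https://a.example", "theme", "dark")
print(store.get_item("https://a.example", "theme"))  # prints dark
print(store.get_item("https://b.example", "theme"))  # prints None
```

The second lookup returns nothing because a different origin cannot see the first origin's data.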
How do the components work together?
This is where the magic happens.
Once the user requests a website from the browser, the Networking component starts sending requests for the resources/documents and passes the responses to the Rendering Engine.
The Rendering Engine then parses the HTML document in chunks, converting the elements into DOM nodes, and parses the style data from CSS files and "style" elements. It uses the DOM nodes to construct a tree called the “content tree” or the “DOM tree”.
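Here is a small Python sketch of that tree-building step, using the standard library's `html.parser`. The `Node`, `TreeBuilder`, and `outline` names are made up for this example, and it assumes well-formed HTML where every tag is closed. Real browser parsers also recover from broken markup, which this sketch does not attempt.

```python
from html.parser import HTMLParser
from typing import List, Optional


class Node:
    """A very small stand-in for a DOM node."""

    def __init__(self, tag: str, parent: Optional["Node"] = None) -> None:
        self.tag = tag
        self.parent = parent
        self.children: List["Node"] = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Turn a stream of start/end tags into a tree of Node objects."""

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#document")
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self._current)
        self._current.children.append(node)
        self._current = node                      # descend into the new element

    def handle_endtag(self, tag):
        if self._current.parent is not None:
            self._current = self._current.parent  # climb back up to the parent

    def handle_data(self, data):
        self._current.text += data.strip()


def outline(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented tag names, one entry per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
# The document arrives in two chunks, as it might from the network.
builder.feed("<html><body><h1>Hi</h1>")
builder.feed("<p>Hello</p></body></html>")
print("\n".join(outline(builder.root)))
```

The printed outline shows `#document` at the top, with `html`, `body`, and then `h1` and `p` nested underneath it.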
Once the DOM tree is constructed, another tree is created, called the "render tree", which is a visual representation of the document in the order that it will be displayed. ("The purpose of this tree is to enable painting the contents in their correct order. Firefox calls the elements in the render tree “frames”. WebKit uses the term renderer or render object."(5))
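One way to picture the render tree is as the DOM tree with everything that never shows up on screen filtered out. The Python sketch below does just that. The `DomNode` and `RenderObject` classes and the `NON_VISUAL_TAGS` set are illustrative assumptions, and a real engine resolves full CSS rather than checking a single `display` value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomNode:
    tag: str
    style: Dict[str, str] = field(default_factory=dict)
    children: List["DomNode"] = field(default_factory=list)


@dataclass
class RenderObject:
    tag: str
    children: List["RenderObject"] = field(default_factory=list)


# Tags assumed to produce nothing visible (an illustrative, incomplete list).
NON_VISUAL_TAGS = {"head", "script", "style", "meta", "title"}


def build_render_tree(node: DomNode) -> Optional[RenderObject]:
    """Return the render object for `node`, or None if nothing is drawn."""
    if node.tag in NON_VISUAL_TAGS or node.style.get("display") == "none":
        return None                       # the whole subtree is skipped
    render = RenderObject(node.tag)
    for child in node.children:           # document order is preserved
        child_render = build_render_tree(child)
        if child_render is not None:
            render.children.append(child_render)
    return render


dom = DomNode("html", children=[
    DomNode("head", children=[DomNode("title")]),
    DomNode("body", children=[
        DomNode("h1"),
        DomNode("div", style={"display": "none"}, children=[DomNode("p")]),
        DomNode("p"),
    ]),
])
tree = build_render_tree(dom)
print([child.tag for child in tree.children[0].children])  # prints ['h1', 'p']
```

The `head` element and the hidden `div` (along with the `p` inside it) exist in the DOM but never make it into the render tree.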
Now that the render tree is constructed, it goes through a "layout process". This process gives each element its position coordinates on the screen, and the output is a "box model".
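A toy version of layout, sketched in Python, might look like the following. It assumes every element is a block that fills the available width and stacks vertically, and that each leaf's content height is already known. The `Box` class and its fields are invented for this example; real layout also deals with margins, padding, inline text, floats, and much more.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box:
    name: str
    content_height: int = 0   # height of the box's own content, assumed known
    children: List["Box"] = field(default_factory=list)
    x: int = 0                # filled in by layout()
    y: int = 0
    width: int = 0
    height: int = 0


def layout(box: Box, x: int, y: int, width: int) -> None:
    """Give `box` and its descendants coordinates, stacking children vertically."""
    box.x, box.y, box.width = x, y, width
    cursor = y + box.content_height       # children start below own content
    for child in box.children:
        layout(child, x, cursor, width)
        cursor += child.height            # the next sibling goes underneath
    box.height = cursor - y               # a parent is as tall as its contents


body = Box("body", children=[
    Box("h1", content_height=40),
    Box("p", content_height=20),
    Box("p", content_height=20),
])
layout(body, x=0, y=0, width=800)
print([(b.name, b.y, b.height) for b in body.children])
# prints [('h1', 0, 40), ('p', 40, 20), ('p', 60, 20)]
```

After the call, `body.height` is 80, the sum of its children's heights.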
The final step is for the render tree to be traversed and each element to be filled in, or "painted", using the UI Backend component to display the content in the browser window.
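The paint step can be sketched as a walk over the laid-out tree that records drawing commands. In this Python sketch, the `LaidOutBox` class, the `fill_rect` command name, and the list of commands are all hypothetical. A real engine hands commands like these to the UI Backend or graphics layer and also paints borders, text, and images.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LaidOutBox:
    name: str
    x: int
    y: int
    width: int
    height: int
    background: str = "transparent"
    children: List["LaidOutBox"] = field(default_factory=list)


# (command, x, y, width, height, color)
PaintCommand = Tuple[str, int, int, int, int, str]


def paint(box: LaidOutBox, commands: List[PaintCommand]) -> None:
    """Record draw commands back to front: the parent first, then its children."""
    commands.append(
        ("fill_rect", box.x, box.y, box.width, box.height, box.background)
    )
    for child in box.children:
        paint(child, commands)


page = LaidOutBox("body", 0, 0, 800, 60, "white", [
    LaidOutBox("h1", 0, 0, 800, 40, "navy"),
    LaidOutBox("p", 0, 40, 800, 20),
])
display_list: List[PaintCommand] = []
paint(page, display_list)
print([cmd[5] for cmd in display_list])  # prints ['white', 'navy', 'transparent']
```

Painting the parent before its children is what lets the children appear on top of the parent's background.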
Wrap up
As we dive deeper, it's clear that browsers have much more going on than we realize. With that in mind, I think it's good practice for developers and security engineers alike to understand the flow of data in the browser. It helps you present things to the user faster, interact with different parts of an application in the browser, and store information securely in the browser.
Resources:
- https://en.wikipedia.org/wiki/Web_standards
- https://medium.com/@ramsunvtech/behind-browser-basics-part-1-b733e9f3c0e6
- https://developers.google.com/web/updates/2018/09/inside-browser-part1
- https://dzone.com/articles/how-browsers-work-behind
- https://medium.com/@monica1109/how-does-web-browsers-work-c95ad628a509
- https://en.wikipedia.org/wiki/Browser_wars
- https://www.html5rocks.com/en/tutorials/internals/howbrowserswork