Mainak Chattopadhyay for IEEE Computer Society, VIT Chennai

Posted on Jan 1, 2023 • Edited on Feb 3, 2023

Security Vulnerabilities and Prevention in HTML5

#webdev #beginners #javascript #html

HTML is the foundation of web development and provides a wide range of features for marking up our web pages.

HTML5 has introduced some new features which make web pages richer. These include new semantic elements like 'header' and 'footer', new input types for form elements like date, time, and range, new graphic elements like SVG and canvas, and new multimedia elements like audio and video.

With this increased functionality, the flow of data has also increased, which opens the door to data theft by attackers.

For example, an attacker can steal data by inserting malicious code through HTML forms, which then gets stored in the database. Security flaws are possible if proper security measures are not taken when using HTML5 features like communication APIs, storage APIs, geolocation, sandboxed frames, offline applications, etc.

Let us explore HTML Security

As HTML applications are web-based applications, developers should take proper measures to safeguard the stored data and communications.

The following is a list of a few vulnerabilities that are possible in HTML:

HTML Injection
Clickjacking
HTML5 attributes and events vulnerabilities
Web Storage Vulnerability
Reverse Tabnabbing

HTML Injection

As the name suggests, the attacker injects a malicious piece of code into the page in order to siphon off data.

There are two types of HTML Injection:

Stored HTML Injection

The malicious code injected by an attacker is stored in the backend and is executed whenever a user loads the functionality that displays it.

Reflected HTML Injection

The malicious code is not stored on the web server. Instead, it is sent as part of a request (for example, in a crafted link) and is executed each time a user triggers that request.

Best practices to prevent HTML injection:

Use safe JavaScript properties like innerText in place of innerHTML.
Code Sanitization: HTML code sanitization refers to removing illegal characters from input and output.
Output Encoding: Converting untrusted data into a safe form so that it is rendered to the user as text instead of being executed. It converts special characters in input and output to their entity form so that they cannot be executed. For example, < will be converted to &lt; and > to &gt;.
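The output encoding step described above can be sketched in a few lines of JavaScript. This is only a minimal illustration, not a replacement for a well-tested library; the function name escapeHtml and the exact set of five characters it encodes are assumptions made for this example.

```javascript
// Hypothetical helper: encode the characters that have special meaning in HTML
// so that untrusted text is displayed instead of being interpreted as markup.
function escapeHtml(untrusted) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  // Replace every special character with its entity form.
  return String(untrusted).replace(/[&<>"']/g, (ch) => entities[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

When the text only needs to be shown on the page, assigning it to innerText avoids manual encoding altogether, because the browser treats the value as plain text.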

Clickjacking

It is an attack in which an attacker uses iframes with low opacity, or transparent layers, to trick users into clicking on something different from what they actually see on the page.

Thus the attacker hijacks clicks, which then execute some malicious code, and hence the name 'Clickjacking'.
It is also known as UI redressing or iframe overlay.

For example, on a social networking website, a clickjacking attack can lead to an attacker spamming your entire network of friends with false messages.

There are two ways to prevent Clickjacking:

Client-side methods: The most common method is to prevent the web page from being displayed within a frame, a technique known as frame-busting or frame-killing.
Though this method is effective in a few cases, it is not considered a best practice as it can be easily bypassed.
Server-side methods: Security experts recommend server-side methods as the most effective defense against clickjacking. Below are the two response headers to deal with this.

Using X-Frame-Options response header.
Using Content Security Policy(CSP) response header.
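As a rough sketch of the two headers listed above, the snippet below builds them as a plain object. The helper name antiFramingHeaders and its allowSameOrigin parameter are made up for this example; the header names and values (DENY, SAMEORIGIN, and the CSP frame-ancestors directive) are the standard ones.

```javascript
// Hypothetical helper: return response headers that control whether a page
// may be embedded in a frame.
function antiFramingHeaders(allowSameOrigin = false) {
  return {
    // Older header, still widely supported.
    "X-Frame-Options": allowSameOrigin ? "SAMEORIGIN" : "DENY",
    // CSP equivalent: frame-ancestors lists who may embed the page.
    "Content-Security-Policy": allowSameOrigin
      ? "frame-ancestors 'self'"
      : "frame-ancestors 'none'",
  };
}

// Possible use with Node's built-in http module (port 3000 is arbitrary):
// const http = require("http");
// http.createServer((req, res) => {
//   for (const [name, value] of Object.entries(antiFramingHeaders())) {
//     res.setHeader(name, value);
//   }
//   res.end("This page cannot be framed.");
// }).listen(3000);
```

Sending both headers is a common choice, since browsers that understand frame-ancestors use it and older ones can fall back to X-Frame-Options.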

Note - We will talk about response headers in detail in later blogs.

HTML5 Attributes & Events Vulnerabilities

HTML5 has a few tags, attributes, and events that are prone to different attacks as they can execute JavaScript code. These can be exploited for XSS (Cross-Site Scripting) and CSRF (Cross-Site Request Forgery) attacks.

Examples:

1. Malicious script injection via the formaction attribute



<form id="form1"></form>
<button form="form1" formaction="javascript:alert(1)">Submit</button>

In the above code snippet, a malicious script is injected through the formaction attribute. To prevent this, markup supplied by users should not be allowed to carry form and formaction attributes, or these should be transformed into non-working attributes.
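One way to approach such a check, shown here only as a sketch, is to accept a URL-valued attribute like formaction only when it resolves to an http or https URL. The function name isSafeActionUrl and the base URL https://example.com/ are assumptions for this example; the URL constructor itself is a standard API in browsers and Node.js.

```javascript
// Hypothetical helper: accept only http(s) URLs for attributes such as
// formaction, action, or href that come from untrusted markup.
function isSafeActionUrl(value, base = "https://example.com/") {
  let url;
  try {
    // Relative values such as "/submit" are resolved against the base.
    url = new URL(value, base);
  } catch (err) {
    return false; // Values that cannot be parsed are rejected.
  }
  return url.protocol === "https:" || url.protocol === "http:";
}

console.log(isSafeActionUrl("javascript:alert(1)")); // false
console.log(isSafeActionUrl("/submit"));             // true
```

Checking the parsed protocol against an allowlist is generally more robust than searching the raw string for "javascript:", because the parser normalizes the value before the scheme is compared.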

2. Malicious script injection via an onfocus event



<input type="text" autofocus onfocus="alert('hacked')"/>

This element automatically gets focus and then executes the injected script. To prevent this, markup supplied by users should not contain autofocus attributes.

3. Malicious script injection via an onerror event in the video tag



<video src="/apis/authContent/content-store/Infosys/Infosys_Ltd/Public/lex_auth_012782317766025216289/web-hosted/assets/temp.mp3" onerror="alert('hacked')"></video>

This code will run the injected script if the given source file is not available. So, we should not allow event handlers in audio and video tags, as these are prone to attacks.

Let us take a look at

HTML Sanitization

HTML sanitization provides protection from a few vulnerabilities like XSS (Cross-Site Scripting) by removing unsafe HTML tags or replacing them with safe tags or HTML entities.

Basic tags such as <b>, <i>, <u>, <em>, and <strong>, which are used for changing fonts, are often allowed. The sanitization process removes advanced tags like <script>, <embed>, <object>, and <link>.

This process also removes potentially dangerous attributes, such as the 'onclick' attribute, in order to prevent malicious code injection into the application.

Entity names for some HTML characters

When a web browser finds these entities, it does not execute them. Instead, it converts them back to the corresponding characters and displays them as plain text.

Example:

Consider the scenario that an attacker injects the below HTML code into a web page.



<a href="#" onmouseover="alert('hacked')">Avengers</a>

On using HTML sanitization, the response will be as below.



&lt;a href="#" onmouseover="alert('hacked')"&gt; Avengers &lt;/a&gt;

This code will not be executed; instead, it is stored as plain text in the response.

There are many sanitizer libraries available to do this job. Some of the commonly used libraries are DOMPurify, XSS, and XSS-filters.

Local Storage Vulnerabilities

In our web applications, we often store some data in the browser. As the data is stored on the client side, there is a chance of it being stolen through injected malicious code if proper care is not taken. Let us now see how to store the data properly to prevent such attacks.

HTML5 has introduced Web storage, or offline storage, which deals with storing data locally in the browser. Data can be stored using two types of objects in HTML5: local storage and session storage. These storages hold data in the form of key-value pairs.

Local storage holds the data in the browser with no expiry date; it stays there until the user or the application deletes it. The setItem() method is used to assign data to local storage.

The code below creates three items named bgcolor, textcolor, and fontsize and assigns values to them.



localStorage.setItem("bgcolor", document.getElementById("bgcolor").value);
localStorage.setItem("textcolor", document.getElementById("textcolor").value);
localStorage.setItem("fontsize", document.getElementById("fontsize").value);

Users can view the stored data in the browser's developer tools, which can be opened by pressing F12.

Similarly, session storage holds the data until the session ends, that is, until the browser or tab is closed.

An attacker who manages to inject malicious code into the page can read and steal the data stored here. So we should always ensure that sensitive information is not stored on the client side.

Preventive measure:

Keep sensitive data such as session identifiers in cookies set with the 'HttpOnly' flag, which scripts running in the page cannot read, instead of in Web storage.
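To illustrate the preventive measure above, here is a small server-side sketch that builds a Set-Cookie header value carrying the HttpOnly flag. The helper name buildSessionCookie and the cookie name sessionId are hypothetical, and the extra Secure, SameSite, and Path attributes are a suggestion for this example rather than something the measure itself requires.

```javascript
// Hypothetical helper: build a Set-Cookie header value for a session id.
function buildSessionCookie(name, value) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    "HttpOnly",        // not readable through document.cookie
    "Secure",          // only sent over HTTPS
    "SameSite=Strict", // not sent along with cross-site requests
    "Path=/",
  ].join("; ");
}

console.log(buildSessionCookie("sessionId", "abc123"));
// sessionId=abc123; HttpOnly; Secure; SameSite=Strict; Path=/
```

The server would send this string in a Set-Cookie response header. Unlike a value placed in localStorage, a cookie set this way is not visible to scripts injected into the page.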

Let us get an overview of another type of possible attack.

Reverse Tabnabbing

We will try to understand this with the help of an example:

Consider a message forum or a blog where an attacker can post a link to their own website. Any user who visits that link is shown some information, but in the background the malicious website redirects the parent page (the forum tab that opened the link) to a fake page that looks similar to the original login page.

When the user comes back to the message forum, they appear to be logged out. Without thinking, they enter their credentials to log in, as the page looks similar to the original one. The attacker now gets hold of that authentication data. The user is then redirected to the message forum page automatically so that they do not suspect that they have entered their credentials on a fake login page.
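The redirect in this scenario works because a page opened in a new tab can reach the tab that opened it through window.opener. A commonly recommended defense, which the article does not cover and is added here only as a sketch, is to put rel="noopener noreferrer" on links that open in a new tab. The helper name hardenRel is made up for this example.

```javascript
// Hypothetical helper: return a rel value that includes noopener and
// noreferrer, keeping any tokens the link already had.
function hardenRel(existingRel = "") {
  const tokens = new Set(existingRel.split(/\s+/).filter(Boolean));
  tokens.add("noopener");   // the new page gets no window.opener reference
  tokens.add("noreferrer"); // also withholds the Referer header
  return Array.from(tokens).join(" ");
}

// In a browser, apply it to every link that opens a new tab.
if (typeof document !== "undefined") {
  for (const link of document.querySelectorAll('a[target="_blank"]')) {
    link.setAttribute("rel", hardenRel(link.getAttribute("rel") || ""));
  }
}

console.log(hardenRel("nofollow")); // nofollow noopener noreferrer
```

Many current browsers already treat target="_blank" links as noopener by default, but setting the attribute explicitly also covers older browsers.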
