When developing web applications that accept user input, we need to ensure that users cannot submit malicious strings. Every input field needs proper validation. Otherwise, the application can fall prey to attacks such as SQL injection and cross-site scripting (XSS).
Developers often use regular expressions to strip HTML from user input. Although this approach is common and can work for simple cases, regular expressions cannot reliably parse HTML, so you might still end up with some markup or other executable code in the result.
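To make the risk concrete, here is a minimal sketch of a regex-based filter and two inputs that get past it. The function name stripScriptsNaively and the payloads are made up for illustration. They are not taken from any particular library.

```javascript
// Naive approach: remove <script>...</script> blocks with a regular expression.
function stripScriptsNaively(input) {
  return input.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, "");
}

// Nested payload: deleting the inner tags stitches the outer pieces together.
const nested = "<scr<script></script>ipt>alert(1)</scr<script></script>ipt>";
console.log(stripScriptsNaively(nested)); // <script>alert(1)</script>

// Event-handler payload: there is no <script> tag at all, so nothing is removed.
const handler = "<img src=x onerror=alert(1)>";
console.log(stripScriptsNaively(handler)); // <img src=x onerror=alert(1)>
```

Both results would still execute script if assigned to innerHTML, which is why a parser-based sanitizer is the safer tool.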
Browsers provide many web APIs for different purposes. Among them, the HTML Sanitizer API is designed for sanitizing untrusted HTML strings. Let’s briefly look at it.
The API helps developers reduce the risk of DOM-based cross-site scripting attacks. It does this by providing methods for handling user-controlled HTML that strip out content capable of executing script, so that injected markup does not run when it is inserted into the DOM. It is an experimental API that still lacks broad support across major web browsers.
The HTML Sanitizer API also lets developers override the default lists of allowed elements and attributes, and it ensures that the sanitized output is safe to use within the current user agent.
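As a rough sketch of overriding the defaults, the helper below builds a sanitizer from a custom configuration. The helper name createCustomSanitizer is hypothetical, and the configuration keys (allowElements, allowAttributes) follow earlier drafts of the specification, which is an assumption here. Key names have changed between drafts, so check the documentation for the browser you target.

```javascript
// Hypothetical helper: create a Sanitizer with a custom configuration.
// The constructor is injectable so the sketch degrades gracefully
// in environments where the experimental API does not exist.
function createCustomSanitizer(config, SanitizerCtor = globalThis.Sanitizer) {
  if (typeof SanitizerCtor !== "function") {
    return null; // API unavailable: caller must fall back to another strategy
  }
  return new SanitizerCtor(config);
}

// Assumed earlier-draft key names: only basic formatting and links allowed.
const customConfig = {
  allowElements: ["b", "i", "em", "strong", "a"],
  allowAttributes: { href: ["a"] },
};

const customSanitizer = createCustomSanitizer(customConfig);
console.log(customSanitizer === null ? "Sanitizer API not available" : "Sanitizer created");
```

Returning null rather than throwing keeps the feature check in one place, so calling code can decide how to handle unsupported browsers.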
The API exposes three methods to developers. Let’s quickly look at each of them.
You can use this method to sanitize an HTML string and then insert it into the DOM as a child of the current element. When adding untrusted data, use the setHTML method instead of innerHTML.
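A minimal sketch of using setHTML with feature detection might look like the following. The helper name insertUntrusted and its return values are hypothetical. The plain-text fallback is one conservative choice among several, not something the API prescribes.

```javascript
// Insert untrusted HTML into `element`, preferring setHTML() when it exists.
function insertUntrusted(element, untrustedHtml) {
  if (typeof element.setHTML === "function") {
    // The browser parses the string, sanitizes it, and inserts the result.
    element.setHTML(untrustedHtml);
    return "sanitized";
  }
  // Fallback: treat the input as plain text so no markup is parsed at all.
  element.textContent = untrustedHtml;
  return "text";
}

// Browser usage (assumes an element with id "comments" exists on the page):
// insertUntrusted(document.getElementById("comments"), userInput);
```

The fallback loses formatting but never executes injected markup, which is usually the right trade-off when the sanitizer is unavailable.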
This method, named sanitizeFor in earlier drafts of the specification, parses and sanitizes an HTML string by removing unsafe markup and executable scripts, and returns the result for later insertion into the DOM.
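The sketch below shows the sanitize-now, insert-later pattern. It assumes the earlier-draft method sanitizeFor(tagName, html) on a Sanitizer instance, which may not exist in current browsers. The wrapper name sanitizeForLater is hypothetical.

```javascript
// Sanitize a string now and get back an element to insert later.
// Assumption: the earlier-draft method sanitizer.sanitizeFor(tagName, html).
function sanitizeForLater(sanitizer, tagName, untrustedHtml) {
  if (!sanitizer || typeof sanitizer.sanitizeFor !== "function") {
    return null; // the API, or this particular method, is unavailable
  }
  return sanitizer.sanitizeFor(tagName, untrustedHtml);
}

// Browser usage (only where the method exists; `container` is an assumed element):
// const el = sanitizeForLater(new Sanitizer(), "div", userInput);
// if (el) container.replaceChildren(...el.childNodes);
```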
This method, named sanitize in earlier drafts of the specification, sanitizes data that is already held in a Document or DocumentFragment.
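For data that is already parsed, a sketch along these lines may help. It assumes the earlier-draft method sanitize(input) on a Sanitizer instance. The wrapper name sanitizeFragment is hypothetical.

```javascript
// Sanitize an existing Document or DocumentFragment.
// Assumption: the earlier-draft method sanitizer.sanitize(input).
function sanitizeFragment(sanitizer, fragment) {
  if (!sanitizer || typeof sanitizer.sanitize !== "function") {
    return null; // the API, or this particular method, is unavailable
  }
  return sanitizer.sanitize(fragment);
}

// Browser usage (only where the method exists; `container` is an assumed element):
// const template = document.createElement("template");
// template.innerHTML = userInput; // template content is inert, scripts do not run
// const clean = sanitizeFragment(new Sanitizer(), template.content);
// if (clean) container.replaceChildren(clean);
```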