Precise XSS detection and mitigation with Client-side Templates

05/15/2020 ∙ by Jose Carlos Pazos, et al. ∙ 0

We present XSnare, a fully client-side XSS solution, implemented as a Firefox extension. Our approach takes advantage of available previous knowledge of a web application's HTML template content, as well as the rich context available in the DOM to block XSS attacks. XSnare prevents XSS exploits by using a database of exploit descriptions, which are written with the help of previously recorded CVEs. CVEs for XSS are widely available and are one of the main ways to tackle zero-day exploits. XSnare effectively singles out potential injection points for exploits in the HTML and sanitizes content to prevent malicious payloads from appearing in the DOM. XSnare can protect application users before application developers release patches and before server operators apply them. We evaluated XSnare on 81 recent CVEs related to XSS attacks, and found that it defends against 94.2% of them. XSnare is the first protection mechanism for XSS that is application-specific, and based on publicly available CVE information. We show that XSnare's specificity protects users against exploits which evade other, more generic, anti-XSS approaches. Our performance evaluation shows that our extension's overhead on web page loading time is less than 10%




1 Introduction

XSS is still one of the most dominant web vulnerabilities. A 2017 report showed that 50% of websites contained at least one XSS vulnerability [Acunetix]. Countermeasures exist, but many of them lack widespread deployment, and so web users are still mostly unprotected.

Informally, the cause of XSS is a lack of input sanitization: user-chosen data “escapes” into a page’s template and makes its way into the JavaScript engine, or modifies the DOM. Consequently, many of the XSS defenses published so far propose to fix the problem at the source, by properly separating the template from the user data on the server, or by modifying browsers [Jim:2007:DSI:1242572.1242654, Nadji:2009, Wurzinger:2009:SMX:1656360.1656379, Sundareswaran:2012:XHS:2352970.2352994]. There are also similar solutions that can be implemented in the front-end code of an application [10.1007/978-3-319-66399-9_7]. In all cases, these technologies must be adopted by the application software developers, otherwise users are left unprotected.

One barrier to adoption of existing XSS defenses is that developers may not have the necessary expertise, or sufficient resources, to use the approach. Luckily, users wishing to gain reassurance over the safety of the sites they visit can install browser extensions to filter malicious scripts and content. Unfortunately, these extensions achieve most of their security by disabling functionality in the applications, such as JavaScript, which impairs the usability of the sites [Noscript, Snyder:2017:MWD:3133956.3133966]. For example, most sites rely on JavaScript being enabled (as early as 2012, it was used by almost 100% of the Alexa top 500 sites [Stock:2017:WTI:3241189.3241265]).

When an XSS vulnerability is disclosed, some software vendors respond with patches. If the affected software is released in the form of packages, frameworks, or libraries, and used by several web applications, there is a delay before users can benefit from the patch. Most importantly, the patched software must be re-deployed by site administrators.

Unfortunately, website administrators will not, and often cannot, apply software updates immediately: one study found that 61% of WordPress websites were running a version with known security vulnerabilities [Sucuri]. In another report, we learn that 30.95% of Alexa’s top 1 Million sites run a vulnerable version of WordPress [wpwhitesecurity].

Users are in effect at the mercy of developers and administrators if they need to browse safe, up-to-date, applications. Our solution, XSnare, helps with this problem – based on information from past disclosures, XSnare patches known page vulnerabilities directly in the browser.

Figure 1: Different web security solutions with XSnare on the client-side.

Each layer of the web application stack (Figure 1) opens different defence options against XSS:

  1. The application logic is the first line of defence. Code safety can be enhanced with third-party vulnerability scanning solutions and a thorough code-review process. Taint analysis and static code analysis tools can detect unsanitized inputs.

  2. In the hosting environment, network firewalls, specifically Web Application Firewalls (WAFs), can defend against attacks such as Distributed Denial-of-Service (DDoS), Structured Query Language (SQL) injections and XSS.

  3. In the client’s environment (residential or commercial), users may install network firewalls, network content filters, and web proxies.

  4. The last line of defence is the browser. Browsers have built-in defences, such as Chrome’s XSS Auditor [xssauditor]. Users can also install third-party extensions to block malicious requests and responses, such as NoScript [Noscript], and XSnare.

We make two observations about existing solutions: (a) server-side solutions have to be applied independently on each server, and (b) solutions on the client are typically written as generic filters which attempt to catch everything, and consequently do not take full advantage of the specificity of the application or the vulnerability.

For example, a WAF can effectively protect the deployment placed behind it, but users cannot realistically expect that every site they visit be protected by one. At the opposite end, in the client’s environment, a user might configure a network proxy for all website traffic, with generic rules achieving maximum coverage, but this will often lead to an elevated rate of false positives (FPs).

Similarly, browser built-in defences are coarse-grained, and work on just a subset of exploits. Chrome’s XSS Auditor, for example, only attempts to defend against reflected XSS. Google recently announced its intention to deprecate XSS Auditor, for reasons including “Bypasses abound”, “It prevents some legit sites from working”, and “Once detected, there’s nothing good to do” [deprecatexssauditor]. Stock et al. [precise_dom_xss] propose enhancements to XSS Auditor and cover a wider range of exploits than the auditor, but are limited to DOM-based XSS. By contrast, our work covers all types of XSS.

Implementing adequate server-side protections [Xu:2006:TPE:1267336.1267345, DBLP:conf/sec/Nguyen-TuongGGSE05, Pietraszek:2005:DAI:2146257.2146267, Bisht:2008:XPD:1428322.1428325] throughout a codebase could arguably qualify as a colossal task, considering the high turnaround times for resolving simpler bugs. A 2018 study found that the average time to patch a Common Vulnerability and Exposures (CVE), all severities combined, is 38 days, increasing to as much as 54 days for low severity CVEs, and the oldest unpatched CVE was 340 days old [Rapid7].

Server-side defences also do not protect against client-only forms of XSS, e.g., reflected XSS, or persistent client-side XSS, which use a browser’s local storage or cookies as an attack vector. Steffens et al. [DBLP:conf/ndss/SteffensRJS19] present a study of persistent client-side XSS across popular websites and find that as many as 21% of the most frequented web sites are vulnerable to these attacks. To provide users with the means to protect themselves in the absence of control over servers, we strongly believe that a novel client-side solution is necessary.

A number of existing solutions in this area also suffer from high rates of false-positives and false-negatives. For example, NoScript [Noscript] works via domain white-listing, thus by default, JavaScript scripts and other code will not execute. However, not all scripts outside of the whitelist should be assumed to be malicious. Browser-level filters like XSS Auditor work based on general policies and can therefore incorrectly sanitize non-malicious content.

We posit that the DOM is the right place to mitigate XSS attacks as it provides a full picture of the web application. While most of the functionality we provide could be done by a network filter in front of the browser, we take advantage of additional context provided by the browser. In particular, when an exploit occurs as a result of user interactions, such as in response to a click, we benefit from knowledge of the initiating tab to filter the response. Previous client-side solutions have opted for detectors that were generic and site-agnostic [Kirda:2009:CCS:2639535.2639808, Jim:2007:DSI:1242572.1242654, Hallaraker:2005:DMJ:1078029.1078861]. Our work goes in the opposite direction, and tries instead to prevent precisely-defined exploits in specific applications.

If a patch for a server-side vulnerability can be “translated” into an equivalent set of operations to apply on the fully formed HTML document in the browser, then we can seize the opportunity to defend early against exploits of that vulnerability. Our extension, which has access to the user’s browsing context, can identify vulnerable pages based on a database of signatures for previous disclosures. This way, XSnare can protect users as soon as a patch is implemented and added to its database. The client-side patch will remain beneficial until all server operators running that software have had a chance to upgrade their deployments.

A similar philosophy is adopted by the client-side firewall-based network proxy Noxes [Kirda:2009:CCS:2639535.2639808]. Unlike XSnare, however, Noxes only applies generic policies based on information available at the network layer. Namely, it does not protect against attacks invisible to the network, e.g., deleting local files. We believe additional contextual knowledge also offers more accurate vulnerability detection.

We evaluate XSnare by testing it on 81 recent XSS CVEs. We also report XSnare’s performance overhead on page load times across a wide range of sites and show that it does not significantly impact browsing experience.

To summarize, our contributions include:

  • XSnare: a novel client-side framework that protects users against XSS vulnerabilities with a database of signatures for these vulnerabilities, written in a declarative language.

  • A mechanism to correctly isolate a vulnerable injection point in a web page and to apply the intended server-side patch on the client-side.

  • A collection of signatures to protect users against real XSS CVEs (Section 5), demonstrating the practicality of XSnare; and the evaluation of its impact on browsing (Section 6).

2 XSnare Design

Figure 2: XSnare’s approach to protect against XSS.

We now present the design of XSnare and its components. We begin with a high-level view of its operation (see Figure 2): A user requests a page on a browser with the XSnare extension installed. The response may or may not contain malicious XSS payloads. Before the browser renders the document, XSnare analyzes the potentially malicious document. The extension loads signatures from its local database into its detector. The detector analyzes the HTML string arriving from the network, and identifies the signatures which apply to the document. These signatures specify one or more “injection spots” in the document, which correspond, roughly speaking, to regions of the DOM where improperly sanitized content could be injected. The extension’s sanitizer eliminates any malicious content and outputs a clean HTML document to the browser for rendering (Algorithm 1).

2.1 An example application of XSnare

To further explain our approach, we present a small example of how DOM context can be used to defend against XSS, taken from CVE 2018-10309 [examplecve]. This is reproducible in an off-the-shelf WordPress installation running the Responsive Cookie Consent plugin, v1.7. This is a stored XSS vulnerability, and as such is not caught by some generic client-side XSS filters, including Chrome’s XSS auditor.

Consider a website running PHP on the backend which stores user input from one user, and displays it later to another user, inside an input element.

The PHP code defines the static HTML template, as well as the dynamic input (the <?php ... ?> block):

<input id="rcc_settings[border-size]"
type="text" value="<?php rcc_value('border-size'); ?>"/>
<label class="description"

Normally, the input might have a value of "0":

<input id="rcc_settings[border-size]"
 type="text" value="0">
<label class="description"

However, the PHP code is vulnerable to an injection attack, e.g.:

border-size = ""><script>alert('XSS')</script>

The browser will render this, executing the injected script:

<input id="rcc_settings[border-size]"
type="text" value=""><script>alert('XSS')</script>
<label class="description"

Note that the resulting HTML is well-formed, so a mere syntactic check will not detect the malicious injection. Let us assume a security analyst knows the original template, i.e., without injected content. If the analyst were given a filled-in document, they could (in most cases) separate the injected content from the server-side template, and get rid of the malicious script entirely, using proper sanitization.

The injected script is bounded by template elements with identifiable attributes. Assuming (for now) that there is only one such vulnerable injection point, we can search for the input element from the top of the document, and the label from the bottom to ensnare the injection points in the HTML.
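The search described above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not XSnare's actual code: the function name `isolateInjection`, its return shape, and the exact marker strings are hypothetical, and the page is assumed to contain a single injection point.

```javascript
// Hypothetical helper: isolate the region between a start marker (searched
// from the top of the document) and an end marker (searched from the bottom).
function isolateInjection(html, startMarker, endMarker) {
  const startIdx = html.indexOf(startMarker);   // top-down scan
  const endIdx = html.lastIndexOf(endMarker);   // bottom-up scan
  if (startIdx === -1 || endIdx === -1) return null;
  const from = startIdx + startMarker.length;
  if (endIdx < from) return null;               // markers out of order
  return { from, to: endIdx, content: html.slice(from, endIdx) };
}

// The exploited page from the running example, flattened for brevity.
const page =
  '<input id="rcc_settings[border-size]" type="text" value=""><script>alert(\'XSS\')</script>\n' +
  '<label class="description">';
const startMarker = '<input id="rcc_settings[border-size]" type="text" value="';
const endMarker = '<label class="description"';

const region = isolateInjection(page, startMarker, endMarker);
console.log(region.content); // everything between the two template markers
```

Everything in `region.content` is treated as untrusted: in the benign case it would be little more than the value `0` and the closing of the attribute, whereas here it also carries the injected script.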

This shares goals with the client/server hybrid approach of Nadji et al. [Nadji:2009]. They automatically tag injected DOM elements on the server side using taint tracking, so that the client (a modified browser) can reliably separate template from injected content. We do not require any server-side modifications, but rather opt for a client-side tagging solution based on exploit definitions.

The injected content, once identified, must be sanitized appropriately. The appropriate action will depend on the application setting, but assuming a patch has been written, it suffices to translate the intention of the server-side patch to the client side. This can be straightforward, once the fix is understood.

The developer incorrectly claimed the bug had been fixed in version 1.8 of the plugin. Other similar vulnerabilities had indeed been fixed, but not this one [rccpatch]. The built-in WordPress function sanitize_text_field needed to be applied.

XSnare does not automatically determine the relevant actions to implement from a patch. We assign this task to a security analyst, who will act as the signature developer for a given exploit. The system will however automate the signature matching and sanitization.

2.2 XSnare Signatures

Our signature definitions make two assumptions: first, an injection must have a start point and end point, that is, an element can only be injected between a specific HTML node and its immediate sibling in the DOM tree; second, in a well-formed DOM, the dynamic content will not be able to rearrange its location in the document without JavaScript execution (e.g., removing and adding elements), allowing us to isolate it from the template.

Pages commonly contain more than one vulnerable injection point. We discuss the difficulty of supporting these pages in Section 2.5.

We believe CVEs are an ideal, growing source of signature definitions. Since previous client-side work does not focus on application-specific protection, these tools often use less accurate heuristics for detecting exploits. Furthermore, once new vulnerabilities are found, these systems often lack the maintainability obtained by leveraging active CVE development.

We are conscious that XSnare signatures will not write themselves, and that this task represents a new step in the workflow. Luckily, converting the CVE information into a signature does not require active participation from the application developers: security enthusiasts and web developers have the skills to fulfill this role satisfactorily.

In general, we do not require a publicly disclosed CVE to write a signature for an exploit; what is useful to our approach is the process of developing one (documenting an exploit and its cause). As described in Section 5, CVEs are a convenient way for us to test our system against real-world vulnerabilities. However, a knowledgeable analyst can write a signature without having publicly disclosed a CVE. In fact, for security reasons, many CVEs are not made public until the application developer has patched the software. Our system can help defend against zero-day attacks: once a vulnerability is known, an analyst can write a signature for it as soon as they learn of the issue.

Long term, we imagine that volunteers (or entrepreneurs) would cultivate and maintain the signature database. New signatures could be contributed by a community of amateur or professional security analysts, in a manner not so different from how antispam or antivirus software is managed.

The challenge of automatically deriving signatures from detailed CVEs is an interesting one, albeit outside the scope of this paper.

2.3 Firewall Signature Language

Our signature language needs to be such that it has enough power of expression for the signature writer to be precise, both for determining the correct web application and to identify the affected areas in the HTML. For injection point isolation, a language based on regular expressions suffices to express precise sections of the HTML. The following is the signature that defends against the motivating example of Section 2.1:

url: 'wp-admin/options-general.php?page=rcc-settings',
software: 'WordPress',
softwareDetails: 'responsive-cookie-consent',
version: '1.5',
type: 'string',
typeDet: 'single-unique',
sanitizer: 'regex',
config: '/^[0-9](\.[0-9]+)?$/',
endPoints: ['<input id="rcc_settings[border-size]" name="rcc_settings[border-size]" type="text"
'<label class="description"
Listing 1: An XSnare signature

A description of the development process for this signature is given in Section 4.1. In summary, a signature will have the necessary information to determine whether a loaded page has a vulnerability, and specify appropriate actions for eliminating any malicious payloads.

Analysts configure their signatures with one function chosen from the static set of sanitization functions offered by XSnare. These functions inoculate potentially malicious injections based on the DOM context surrounding the injection. The goal of signatures is to provide such sanitization, ideally without “breaking” the user experience of the page. The default function preset is DOMPurify’s [10.1007/978-3-319-66399-9_7] default configuration, which takes care of common sanitization needs [safecontent]. However, DOMPurify’s defaults can be unnecessarily restrictive, in which cases the other sanitization methods are more desirable.

We considered allowing arbitrary sanitization code in signatures. While it would open complex sanitization possibilities, we have decided against it, principally for security reasons. The minimal set of functions we settled on also sufficed to express all of the signatures defined for this paper. See Appendix B for more details.

2.4 Browser Extension

Our system’s main component is a browser extension which rewrites potentially infected HTML into a clean document. The extension detects exploits in the HTML by using signature definitions and maintains a local database of signatures. We leave the design of an update mechanism to future work, but in its current form, the database is bundled with each new installation of the extension.

The extension translates signature definitions into patches that rewrite incoming HTML on a per-URL basis, according to the top-down, bottom-up scan described in Section 2.1.

The extension’s detector acts as an in-network filter. We initially considered other designs but quickly found out that applying the patch at the network level was necessary for sanitization correctness: even before any code runs, parsing the HTML into a DOM tree might cause elements to be re-arranged into an unexpected order, making our extension sanitize the wrong spot. Consider the following example, where an element inside a <tr> tag is rearranged after parsing the string:

<table class="wp-list-table">
       <img src="1" onerror="alert(1)">
         <form method="GET" action=""> ...

In this HTML, the signature developer might identify the exploit as occurring inside the given table. However, if we wait until the string has been parsed into a DOM tree to sanitize, the elements are rearranged due to <tr> not allowing an <img> as its child:

<img src="1" onerror="alert(1)">
<table class="wp-list-table">
       <form method="GET" action=""> ...

Note that the injected <img> tag is now outside of the table, simply by virtue of the DOM parsing. Now, the extension will search past the injection, as it occurs before the table element, creating a false negative (FN). Similarly, elements rearranged inside an injection point can create false positives. This behavior would open up a class of circumvention techniques against our detector, so we cannot wait until the page has been parsed and rendered to analyze the response. Filtering the raw response string at the network level ensures that a knowledgeable attacker cannot take advantage of this behavior.
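One way a Firefox extension can rewrite a response before the HTML parser sees it is the `webRequest.filterResponseData` API. The sketch below illustrates that general technique; whether XSnare uses exactly this API is not stated here. The function `rewriteHtml` is a hypothetical placeholder for the detector and sanitizer, the response is assumed to be UTF-8, and the extension is assumed to hold the `webRequest` and `webRequestBlocking` permissions.

```javascript
// Hypothetical placeholder for the detector + sanitizer pipeline.
function rewriteHtml(html, url) {
  return html; // a real implementation would apply matching signatures here
}

// Buffer the whole response body, rewrite it as a string, and only then
// pass it on, so the parser never sees the unsanitized document.
function filterListener(details) {
  const filter = browser.webRequest.filterResponseData(details.requestId);
  const decoder = new TextDecoder('utf-8'); // assumption: UTF-8 responses
  const encoder = new TextEncoder();
  let body = '';

  filter.ondata = (event) => {
    body += decoder.decode(event.data, { stream: true });
  };
  filter.onstop = () => {
    body += decoder.decode(); // flush any buffered bytes
    filter.write(encoder.encode(rewriteHtml(body, details.url)));
    filter.close();
  };
  return {};
}

// Only register when running inside an extension background script.
if (typeof browser !== 'undefined' && browser.webRequest) {
  browser.webRequest.onBeforeRequest.addListener(
    filterListener,
    { urls: ['<all_urls>'], types: ['main_frame'] },
    ['blocking']
  );
}
```

Buffering the full body delays rendering until the response is complete, which is one plausible source of the page-load overhead that the evaluation measures.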

2.5 Handling multiple injections in one page

In Listing 1, the endPoints were listed as two strings in the incoming network response. However, there are cases where arbitrarily many injection points can be generated by the application code, such as a for loop generating table rows. For these, it is hard to correctly isolate each endPoint pair, as an attacker could easily inject fake endPoints in between the original ones.

Figure 3: Example attacker injection when multiple injection points exist in the page. a) a basic injection pattern. b) an attempt to fool the detector.

In Figure 3a, the brackets indicate a template. The content in between is an injection point (the star), where dynamic content is injected into the template. In the case of a vulnerability, the injected content can expand to any arbitrary string. The signature separates the injection from the rest by matching for the start and end points (the endPoints), represented by the brackets. This HTML originally has two pairs of endPoint patterns.

In Figure 3b, the attacker knows these are being used as injection end points and decides to inject a fake ending point and a fake starting point (the dotted brackets), with some additional malicious content in between. If just looking for multiple pairs of end points, the detector cannot tell the difference between the solid and dotted patterns, and will not get rid of the content injected in the star. Therefore, we have to use the first starting point and the last ending point before a starting one (when searching from the bottom-up) and sanitize everything in between. This might get rid of a substantial amount of valid HTML, so we defer to the signature developer’s judgment of what behavior the detector should follow. We expand upon this further in Section 4.1.

Figure 4: Example attacker injection when multiple distinct injection points exist in the page.

Figure 4 illustrates a case when there are several injection points in one page, but each of them is distinct. Now, the filter is only looking for one pair of brackets, so the attacker can’t fool the extension into leaving part of the injection unsanitized. However, they could, for example, inject an extra ending bracket after the opening parenthesis (or an extra starting brace). The extension will be tricked into sanitizing non-malicious content, the black pluses (+). This behavior can be detected by noting that we know the order in which the endPoints should appear, and so if the filter sees a closing endPoint before the next expected starting endPoint, or similarly, a starting endPoint before the next expected closing endPoint, this attack can be identified. In the diagram, the order of the solid elements characterizes the possible malformations in the end points. As with the previous scenario, we have to sanitize the outermost end points, potentially deleting non-malicious content. The signature developer specifies the sanitization behavior.
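The ordering check for distinct end points can be sketched as follows. This is a simplified illustration, assuming every endPoint string is unique in the clean template; the function name `endPointsInOrder` and the single-character markers (standing in for the brackets of Figure 4) are hypothetical.

```javascript
// Hypothetical check: each marker must occur exactly once, in the listed
// order. A duplicated or misplaced marker signals a forged end point.
function endPointsInOrder(html, orderedMarkers) {
  let cursor = 0;
  for (const marker of orderedMarkers) {
    const first = html.indexOf(marker);
    // Missing, or present more than once (possibly injected by an attacker).
    if (first === -1 || first !== html.lastIndexOf(marker)) return false;
    // Appears before the previous marker has ended.
    if (first < cursor) return false;
    cursor = first + marker.length;
  }
  return true;
}

// Two distinct injection points: [ ... ] and ( ... ).
const markers = ['[', ']', '(', ')'];
const cleanPage = '[a]+(b)';
const forgedPage = '[a]+(b]c)'; // an extra "]" injected inside the second point

console.log(endPointsInOrder(cleanPage, markers));
console.log(endPointsInOrder(forgedPage, markers));
```

When the check fails, the caller falls back to the behavior chosen by the signature developer: sanitize between the outermost end points, or block the page.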

Note that these complex cases do not make our approach inapplicable. The process of writing the signature might become more complicated, but the extension provides the option of blocking the page entirely if the signature writer believes a given case is too complex for our signature language.

2.6 Dynamic injections

The top-level documents of web pages fetch additional dynamic content via fetch or AJAX APIs. Content fetched in this way is also vulnerable to XSS, and must be filtered. An example vulnerability is CVE-2018-7747 (WordPress Caldera Forms), which allows malicious content retrieved from the plugin’s database to be injected in response to a click.

XSnare allows XHR requests to be filtered with xhr-type signatures. To reduce the number of signatures that need to be considered when a browser issues a request, we require that signatures for XHR be nested inside a signature for a top-level document. If a page’s main content matches an existing top-level signature description, XSnare will then enable all nested XHR listeners.

Signatures for dynamic requests are specified in the listenerData key, which includes a listener type and method. The idea is extensible to scripts and other objects loaded separately from the main document (e.g., images, stylesheets, etc.).

listenerData: [{
  listenerType: 'xhr', listenerMethod: 'POST',
  sanitizer: 'escape', type: 'string',
  listenerUrl: 'wp-admin/admin-ajax.php',
  typeDet: 'single-unique',
  endPoints: ['<p><strong>', '[AltBody]']
}]
Listing 2: An example dynamic request signature. This patches CVE-2018-7747.
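The nesting rule can be illustrated with a small sketch: nested XHR signatures are only consulted for tabs whose top-level document already matched the enclosing signature. The per-tab map and the function names `enableNestedListeners` and `matchXhr` are hypothetical, and matching the URL by substring is an assumption made for brevity.

```javascript
// Hypothetical per-tab registry of enabled XHR listener signatures.
const activeXhrListeners = new Map(); // tabId -> array of listener signatures

// Called once a top-level signature has matched the tab's main document.
function enableNestedListeners(tabId, topLevelSignature) {
  const nested = (topLevelSignature.listenerData || [])
    .filter((listener) => listener.listenerType === 'xhr');
  activeXhrListeners.set(tabId, nested);
  return nested.length;
}

// Called for each XHR response; returns the nested signatures to apply.
function matchXhr(tabId, method, url) {
  const listeners = activeXhrListeners.get(tabId) || [];
  return listeners.filter(
    (l) => l.listenerMethod === method && url.includes(l.listenerUrl)
  );
}

const topLevel = {
  software: 'WordPress',
  listenerData: [{
    listenerType: 'xhr', listenerMethod: 'POST',
    sanitizer: 'escape', type: 'string',
    listenerUrl: 'wp-admin/admin-ajax.php',
    typeDet: 'single-unique',
    endPoints: ['<p><strong>', '[AltBody]'],
  }],
};

enableNestedListeners(7, topLevel);
console.log(matchXhr(7, 'POST', 'https://example.test/wp-admin/admin-ajax.php').length);
```

Requests from tabs with no matched top-level signature return an empty list, so no XHR signature is evaluated for them.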

3 Implementation

We implemented our system as an extension in Firefox 69.0. Our signatures are stored in a local JavaScript file in the extension package. We decided on an extension implementation for several reasons. (1) Privileged execution environment. The extension’s logic lies in a separate environment from the web application code. This guarantees that malicious code in the application cannot affect the extension. (2) Web application context. Our solution requires knowledge of the application’s context. The extension naturally retains this context. (3) Interposition abilities. As it lies within the browser, the extension can run both at the network level, e.g., rewrite an incoming response; and at the web application level, e.g., interpose on the application’s JavaScript execution.

1 // global DBSignatures
2 procedure verifyResponse(responseString, url)
3       loadedProbes ← runProbes(responseString, url)
4       signaturesToCheck ← []
5       for probe in loadedProbes do
6             signaturesToCheck.append(DBSignatures[probe])
7       end for
8       filteredSignatures ← []
9       for signature in signaturesToCheck do
10            if responseString and url match signature then
11                  filteredSignatures.push(signature)
12      end for
13      versionInfo ← loadVersions(url, loadedProbes)
14      endPoints ← []
15      for signature in filteredSignatures do
16            if (signature, signature.version) ∈ versionInfo then
17                  endPoints.push(signature.endPointPairs)
18      end for
19      indices ← []
20      for endPointPair in endPoints do
21            indices.push(findIndices(responseString, endPointPair))
22      end for
23      if discrepancies exist in indices then
24            Block page load and return
25      for endPointPair in endPoints do
26            sanitize(responseString, indices)
27      end for
28 end
Algorithm 1 Network filter algorithm

3.1 Filtering process

Algorithm 1 describes our network filtering process: once a request’s response comes in through the network, we process it and sanitize it if necessary.

Loading signatures

Our detector loads signatures and finds injection points in the document. However, not all signatures need to be loaded for a specific website, since not all sites run the same frameworks. When loading signatures, we proceed in a manner similar to a decision tree. The detector first probes the page (line 3) to identify the underlying framework (the software in our signature language). We currently provide a number of static probes. However, as more applications need to be covered, we believe it would be better to handle this task in the signature definitions. The widely popular network mapping tool Nmap [nMap] uses probes in a similar manner, kept in a modifiable file. As mentioned in Section 5, we currently only have signatures for CMS applications. Our probes use specific identifiers related to the application, as well as the particular site that is affected by the exploit. WordPress pages, for example, have several elements in the page that identify it as a WordPress page. While this might seem easier for CMS-style pages, and we acknowledge that application fingerprinting is a hard problem in general, we believe other web apps will also have similar identifying information, like headers, element IDs, script/CSS sources, etc.

After running these probes, the detector loads the corresponding frameworks’ signatures and checks whether the information in each loaded signature matches the page, filtering out those that do not (lines 5-12).
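The probe-then-filter steps might look like the sketch below. Everything in it is illustrative: the probe heuristic (looking for `/wp-content/` or `/wp-includes/` paths), the shape of the database, and the rule that a signature matches when the page mentions its `softwareDetails` string are assumptions, not XSnare's actual matching logic.

```javascript
// Hypothetical probes: each returns true if the page looks like that framework.
const probes = {
  WordPress: (html) =>
    html.includes('/wp-content/') || html.includes('/wp-includes/'),
};

// Hypothetical signature database, keyed by the software a probe identifies.
const dbSignatures = {
  WordPress: [
    {
      url: 'wp-admin/options-general.php?page=rcc-settings',
      software: 'WordPress',
      softwareDetails: 'responsive-cookie-consent',
    },
  ],
};

// Load only the signatures of detected frameworks, then keep those whose
// URL and details match the current page.
function candidateSignatures(html, url) {
  const detected = Object.keys(probes).filter((name) => probes[name](html));
  return detected
    .flatMap((name) => dbSignatures[name] || [])
    .filter((sig) => url.includes(sig.url) && html.includes(sig.softwareDetails));
}

const html =
  '<link href="/wp-content/plugins/responsive-cookie-consent/rcc.css">';
const url = 'http://example.test/wp-admin/options-general.php?page=rcc-settings';
console.log(candidateSignatures(html, url).length);
```

Pages that fail every probe never touch the signature list, which keeps the per-request cost low for sites that run none of the covered frameworks.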

Version identification

We then apply version identification (lines 13-16). Our objective for versioning is that our signatures do not trigger false positives on websites running patched software. We found this to be one of the harder aspects of signature loading. In many Content Management Systems, for example, file names are not updated with the latest versions, or do not include them at all, so this information is often hard to come by from the client side. It is often more readily available to the admins of a site. While admins might not be the bulk of users, exploits that affect them make up the bulk of disclosed CVEs, as described in Section 5.

Furthermore, we believe that even if we load a signature when the application has already been fixed on the server side, sanitization will often preserve the page’s functionality, as many of the CVEs are a result of unsanitized input. Motivated by this observed behavior, our mechanism follows a series of increasingly accurate but less applicable version identifiers: first, we apply framework-specific version probes. If these are not successful, the signature language provides functionality for version identification in the HTML through regex. If the information is unavailable through the HTML, the version in the signature is left blank and the patch is applied regardless of version, as we cannot be sure the page is running patched software. Our tool takes advantage of having perfect knowledge of an exploit’s conditions, which reduces the rate of false positives compared to a software-agnostic approach.
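The fallback chain can be sketched as below. The fields `versionRegex` and `affectedVersions`, the probe callback, and the function name `shouldApply` are hypothetical and chosen for illustration; the signature language in Listing 1 only shows a single `version` key.

```javascript
// Decide whether a signature applies, falling back from a framework probe,
// to a regex over the HTML, to "apply regardless" when the version is unknown.
function shouldApply(signature, html, frameworkProbe) {
  // 1. Framework-specific probe (hypothetical callback: version string or null).
  let detected = frameworkProbe ? frameworkProbe(html) : null;

  // 2. Regex from the signature (hypothetical field; first capture group).
  if (detected === null && signature.versionRegex) {
    const match = signature.versionRegex.exec(html);
    detected = match ? match[1] : null;
  }

  // 3. Version unknown: apply the patch, since the page may be vulnerable.
  if (detected === null) return true;

  return signature.affectedVersions.includes(detected);
}

const signature = {
  versionRegex: /responsive-cookie-consent\/[^"']*\?ver=([0-9.]+)/,
  affectedVersions: ['1.5', '1.6', '1.7'],
};

const patchedPage =
  '<link href="/wp-content/plugins/responsive-cookie-consent/rcc.css?ver=1.8">';
const unknownPage = '<link href="/wp-content/plugins/responsive-cookie-consent/rcc.css">';

console.log(shouldApply(signature, patchedPage, null));
console.log(shouldApply(signature, unknownPage, null));
```

The last branch encodes the conservative choice described above: a missing version is treated as potentially vulnerable rather than as patched.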

Injection point search and sanitization

Once we have the correct signatures, we find the indices of the endpoints using our top-down, bottom-up scan, and check for potential malformations in the injection points, as described in Section 2.5 (lines 19-24). If a malformation is found, the page load is blocked and a message is returned to the user or, if the signature developer so specifies, sanitization proceeds on the new endpoints. Finally, if all endPoint pairs are in the expected order, we sanitize each injection point (lines 25-27).

3.2 Sanitization methods

We provide different types of sanitization: "DOMPurify", "escape", and "regex". Regex pattern matching can be particularly effective when the expected value has a simple representation (e.g., a field for only numbers). For each of these approaches, the signature can specify a corresponding config value. DOMPurify provides a rich API for additional configuration. When escaping, defining specific characters to escape via regex can be useful. For pattern matching via regex, config specifies the pattern the injected content should match.
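A minimal sketch of the "escape" and "regex" methods is given below. It omits the DOMPurify path, which needs the external library, and makes two assumptions that the text does not confirm: content failing the regex is dropped entirely, and `config` is given as a RegExp object rather than the string form shown in Listing 1. The names `escapeHtml` and `sanitizeRegion` are hypothetical.

```javascript
// Replace HTML-significant characters with their entity equivalents.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;',
  };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Dispatch on the signature's sanitizer type.
function sanitizeRegion(content, signature) {
  switch (signature.sanitizer) {
    case 'escape':
      return escapeHtml(content);
    case 'regex':
      // Assumption: content that fails the expected pattern is removed.
      return signature.config.test(content) ? content : '';
    default:
      throw new Error('sanitizer not covered by this sketch: ' + signature.sanitizer);
  }
}

const borderSize = { sanitizer: 'regex', config: /^[0-9](\.[0-9]+)?$/ };
console.log(sanitizeRegion('0', borderSize));
console.log(sanitizeRegion('"><script>alert(1)</script>', borderSize));
console.log(sanitizeRegion('<b>hi</b>', { sanitizer: 'escape' }));
```

For a numeric field such as the border size, the regex method is stricter than escaping: any value that is not a number is rejected outright instead of being rendered as harmless text.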

4 Writing Signatures

We expect a signature developer to have a solid understanding of the principles behind XSS, as well as web applications, HTML, CSS and JavaScript, so they can identify precise injection points. In this section, we aim to show that minor effort is required from a knowledgeable analyst when writing a signature.

4.1 Case Study: CVE-2018-10309

Going back to our example in Section 2.1, we describe the process for writing a signature using one of our studied CVEs.

Identifying the exploit. An entry in Exploit Database [studyCVE] describes a persistent XSS vulnerability in the WordPress plugin Responsive Cookie Consent for versions 1.7/1.6/1.5. This entry (as most do) comes with a proof of concept (PoC) for the exploit, which describes the Cookie Bar Border Bottom Size parameter as vulnerable. We run a local WordPress installation with this plugin. In general, the system does not rely on the existence of a PoC; we relied on one because we did not discover the CVEs and did not have the full context of the exploit.

Establishing the separation between dynamic and static content. We insert the string "<script>alert(’XSS’)</script>" in the Cookie Bar Border Bottom Size (rcc_settings[border-size] in the HTML) input field. This results in an alert box popping up in the page.

In general, the analyst is able to find the vulnerable HTML from the server-side code without having to reproduce the exploit. Since we did not write the CVE, we had to do this extra step.

In the example, the input element is the injection starting point, and the label tag is the end point, since it is the element immediately after the input. Identification of correct endpoints is extremely important, and in particular, when a page has multiple injection points, the signature developer must ensure the chosen elements do not overlap with other innocuous ones. In some cases, the developer might think it best to completely stop the page from loading. While one of our main goals is to maintain the page’s usability, there are cases where large portions of the document would be affected by the sanitization. We believe compromising usability for security is preferable in this case. Furthermore, the developer has to identify if the exploit comes from an external source (such as an Ajax request), as this changes the signature.

Collecting other required page information and writing the signature. The next step is to gather the remaining information to determine whether the signature applies to the page loaded. The full signature for this example was previously shown in 1. The URL is acquired by noting that this exploit occurs on the plugin’s settings page. The software running is WordPress in this case. The settings page’s HTML includes a link to a stylesheet with href "http://localhost:8080/wp-content/plugins/responsive-cookie-consent…". In particular, "wp-content/plugins/plugin-name" is the standard way of identifying that a WordPress page is running a certain plugin, in this case "responsive-cookie-consent", which is set as softwareDetails. We apply the signature for all versions less than or equal to 1.7. Since the exploit only occurs in this specific spot in the HTML, the typeDet is listed as "single-unique". Since the vulnerable parameter is a border-size, the sanitizer applied is "regex", further restricting the pattern to only numbers in config. We list the endPoints as taken from the HTML.
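To make the field choices above concrete, the following is an illustrative reconstruction of such a signature as a TypeScript object. It is not the signature from the paper: the field names follow Appendix B, but the url placeholder, the config pattern, and the endPoints strings are assumptions made for this sketch.

```typescript
// Illustrative only: values marked "assumed" or "placeholder" are not taken from the paper.
const rccSignature = {
  url: "wp-admin/<plugin settings page>",       // placeholder: the exact path is not given here
  software: "WordPress",
  softwareDetails: "responsive-cookie-consent", // plugin name as it appears in wp-content/plugins/
  version: "1.7",                               // applies to versions <= 1.7
  type: "string",
  typeDet: "single-unique",
  sanitizer: "regex",
  config: "^[0-9]+$",                           // assumed pattern for "only numbers"
  endPoints: [
    // assumed regex strings: the vulnerable input element, then the label that follows it
    ["<input[^>]*rcc_settings\\[border-size\\]", "<label"],
  ] as [string, string][],
};

// The standard WordPress marker for a page that runs a given plugin.
function pageRunsPlugin(html: string, pluginName: string): boolean {
  return html.includes(`wp-content/plugins/${pluginName}`);
}
```

A value such as 12 satisfies the assumed config pattern, whereas the injected script string from the case study does not.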

Testing the signature. Finally, we load up our extension and reload the web page. We expect to not have an alert box pop up, and we manually look at the HTML to verify correct sanitization. In practice, there might be small discrepancies between server-side and client-side representations of the HTML string, leading to bugs in the signature if the developer used the parsed HTML as a reference. If the exploit is not properly sanitized, the developer is able to use the debugging tools provided by the browser to check the incoming network response information seen by the extension’s background page and make sure it matches the signature values.

5 Approach evaluation

To verify the applicability of our detector and signature language, we tested the system by looking at several recent CVEs related to XSS. We have three objectives: to verify that our signature language provides the necessary functionality to express an exploit and its patch, to test our detector against existing exploits, and to show that composing signatures takes a reasonable amount of time.

5.1 Methodology

We study recent CVEs related to WordPress plugins. We focus on WordPress for two reasons:

  1. WordPress powers 34.7% of all websites according to a recent survey [w3stats] [DBLP:journals/corr/abs-1801-01203]. The same study states that 30.3% of the Alexa top 1000 sites use WordPress. Thus, we can be confident that our study results will hold true for the average user.

  2. WordPress plugins are popular among developers (there are currently more than 55,000 plugins [wpplugins]). Due to its user popularity, WordPress is also heavily analyzed by security experts. A search for WordPress CVEs on the Mitre CVE database [cvemitre] gives 2310 results. Plugins, specifically, are an important part of this issue: 52% of the vulnerabilities reported by WPScan are caused by WordPress plugins [wpscan].

We used a CVE database, CVE Details [cvedetails], to find the 100 most recent WordPress XSS CVEs, as of October 2018. For each CVE, we set up a Docker container with a clean installation of WordPress 5.2 and installed the vulnerable plugin’s version. For CVEs that depended on a particular WordPress version, we installed the appropriate version. Of the CVEs we looked at, only one occurred in WordPress core. We believe it would be harder to precisely sanitize injection points in WordPress core, as many of the plugins have particular settings pages where the exploits occur, and the HTML is more identifiable. WordPress core, on the other hand, can be heavily altered by the use of themes and the user’s own changes. However, as evidenced by our investigation, the vast majority of exploits occur in plugins.

Next, we reproduced the exploit in the CVE and we analyzed the vulnerable page and wrote a signature to patch the exploit.

5.2 Results

Plugin          Installations
WooCommerce     5+ million
Duplicator      1+ million
Loginizer       900,000+
WP Statistics   500,000+
Caldera Forms   200,000+
Table 1: Most popular studied WordPress plugins

Of the initial 100 CVEs, we were able to analyze 81 across 44 affected pages. We dropped 24 CVEs due to reproducibility issues: some of the descriptions did not include a PoC, making it difficult for us to reproduce; or, the plugin code was no longer available. In some cases, it had been removed from the WordPress repository due to "security issues", which emphasizes the importance of being able to defend against these attacks.

The resulting plugins we studied averaged 489,927 installations: Table 1 shows the number of installations for the 5 most popular plugins we studied. For the vulnerabilities, 27 (35.5%) could be exploited by an unauthenticated user; 56 (73.7%) targeted a high-privilege user as the victim, 7 (9.2%) had a low-privilege user as the victim, the rest affected users of all types.

Many of the studied CVEs included attacks for which there are known and widely deployed defenses. For example, many were cases of Reflected XSS, where the URL revealed the existence of an attack, e.g.,: http://<target>&page-uri=<script>alert("

6 Load time performance on top websites

XSnare’s performance goal is to provide its security guarantees without impacting the user’s browsing experience. We now briefly report XSnare’s impact on top website load times, representing the expected behaviour of a user’s average web browsing experience. For more performance evaluation results, please see Appendix A.

For these tests we used the top 500 websites as reported by [top500]. For each website, we loaded it 25 times (with a 25 second timeout) and recorded the following values: requestStart, responseEnd, domComplete, and decodedBodySize. From the initial set of 500, we only report values for 441: the other 59 had consistent issues with timeouts, insecure certificates, and network errors. In our setup, we used a headless version of Firefox 69.0, and Selenium WebDriver for NodeJS, with GeckoDriver. All experiments were run on one machine with an Intel Xeon CPU E5-2407 2.40GHz processor, 32 GB DRAM, and our university’s 1GiB connection.

We ran four test suites:

  • No extension cold cache: Firefox is loaded without the extension installed and the web driver is re-instantiated for every page load.

  • Extension cold cache: As before, but Firefox is loaded with the extension installed.

  • No extension warm cache: Firefox is loaded without the extension installed and the same web driver is used for the page’s 25 loads.

  • Extension warm cache: As before, but Firefox is loaded with the extension installed.

For each set of tests, we reduced the recorded values to two comparisons: network filter (responseEnd - requestStart), and page ready (domComplete - responseStart). The first analyzes the time spent by the network filter, while the second determines the time spent until the whole document has loaded. We calculate the medians for each website for each of these measures as well as the decodedBodySize.
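The reduction of repeated loads to per-site medians can be sketched as below. The LoadSample interface and the summarize function are hypothetical names; the field names mirror the attributes of a browser's PerformanceNavigationTiming entry, which is where such values would typically be read from.

```typescript
// One page load; fields mirror PerformanceNavigationTiming attribute names.
interface LoadSample {
  requestStart: number;
  responseStart: number;
  responseEnd: number;
  domComplete: number;
  decodedBodySize: number;
}

function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of an empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Reduce all loads of one website to the medians of the two measures and the body size.
function summarize(samples: LoadSample[]) {
  return {
    networkFilter: median(samples.map((s) => s.responseEnd - s.requestStart)),
    pageReady: median(samples.map((s) => s.domComplete - s.responseStart)),
    decodedBodySize: median(samples.map((s) => s.decodedBodySize)),
  };
}
```

Using the median rather than the mean keeps a single timed-out or unusually slow load from dominating a site's summary.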

Figure 5: Cumulative distribution of relative percentage slowdown with extension installed for top sites.

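We compare the load times with/without the extension by calculating the relative slowdown with the extension installed according to the following formula:

$$\text{slowdown} = \frac{m_{\text{ext}} - m_{\text{no-ext}}}{m_{\text{no-ext}}} \times 100\%$$

where $m_{\text{ext}}$ and $m_{\text{no-ext}}$ are the medians with and without the extension running, respectively.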

Figure 5 plots the results. The graph shows a slowdown of less than 10% for 72.6% of sites, and less than 50% for 82% of sites when the extension is running. Note that these values are recorded as percentages, and while some are as high as 50%, the absolute values are in 77% of cases less than a second. This overhead should not alter the user’s experience significantly.
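As a small worked sketch, assuming the usual definition of relative slowdown (the difference between the median with and without the extension, divided by the median without it, times 100), the per-site values and the fractions read off a cumulative distribution can be computed as follows. Both function names are hypothetical.

```typescript
// Relative slowdown in percent; negative values mean the page loaded faster with the extension.
function relativeSlowdown(medianWithExt: number, medianWithoutExt: number): number {
  if (medianWithoutExt <= 0) throw new Error("baseline median must be positive");
  return ((medianWithExt - medianWithoutExt) / medianWithoutExt) * 100;
}

// Fraction of sites whose slowdown is below a threshold: one point on the cumulative distribution.
function fractionBelow(slowdowns: number[], thresholdPercent: number): number {
  if (slowdowns.length === 0) return 0;
  return slowdowns.filter((s) => s < thresholdPercent).length / slowdowns.length;
}
```

For instance, a site with medians of 150ms with the extension and 100ms without it has a 50% relative slowdown but an absolute difference of only 50ms, which is why the percentages are read together with the absolute values.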

The slowdown increases by at most 5% when we take caching into account. This is likely because the network filter causes the browser to use less caching, especially for the DOM component, as it might have to process it from scratch every time. While it may seem counter-intuitive that some pages have shorter loading times with the extension, there are several variables at play that can affect these measurements (local network, server-side load, internal scheduling, etc.). We manually checked the websites whose slowdown exceeded 40% in absolute value and verified that our extension did not change the page’s contents, which would have been a possible cause of faster load times. We also checked the timings for the page as reported by the browser and noted a high variance even within small time windows. The time spent by our verification function was less than 10ms for 87.6% of sites (Figure 7). This corroborates our findings that the slowdown is mostly negligible.

7 Limitations and Future Work

Generalizability and scope of study. As discussed in Section 5.1, while many websites share similar structures to the ones we covered, our study only considered 4 other sites apart from those running on WordPress, and we only considered sites using a CMS. Not all websites might be identified as easily. Furthermore, we only studied 81 CVEs. In the future we intend to study a more diverse set of CVEs and go beyond CMS-based sites.

False positives and false negatives. Due to the nature of our approach, it is extremely hard to completely get rid of FPs: If the applied sanitization targets JavaScript code, for example, a FP will likely be triggered. Furthermore, since we rely on handwritten signatures to defend against attacks, vulnerable sites for which no signature has been written will be subject to FNs. In the future, we intend to study the rate of FPs and FNs in our approach and compare it to previous work.

Usability. A main aspect of our work is its increased potential for usability and adoption, both from the perspective of a user who installs the extension to defend themselves against XSS, and from that of a signature developer who has to write the database descriptions according to a known CVE. Future work could focus on usability studies related to both of these aspects.

Protection against CSRF. We believe that we can adapt our work to defend against Cross-Site Request Forgery (CSRF) exploits, as well. Using a similar signature language as the one for XSS, a signature developer could specify pages with potential vulnerabilities to only allow network requests that cannot exploit such vulnerabilities.

Dealing with an increasing number of signatures. As the number of framework probes increases, and more types of sites are covered, the performance impact will increase. Using more efficient approaches to searching and filtering, and using better data structures in the signature database, could lower this overhead.

Design considerations. Currently, each browser user has to install our extension. However, the same functionality could be offloaded to a single processing unit similar to a proxy, which can handle the filtering for all machines in a network. This deployment model might be more appropriate in certain environments, such as in enterprises.

8 Related Work

We classify existing work into several categories: client-side, server-side, browser built-in, and hybrid: a combination of these.

Server-side techniques. In addition to existing parameter sanitization techniques, taint-tracking has been proposed as a means to consolidate sanitization of vulnerable parameters [Xu:2006:TPE:1267336.1267345, DBLP:conf/sec/Nguyen-TuongGGSE05, Pietraszek:2005:DAI:2146257.2146267, Bisht:2008:XPD:1428322.1428325]. These techniques are complementary to ours, and provide an additional line of defence against XSS. However, many of them rely on the client-side rendering to maintain the server-side properties, which will not always be the case.

Client-side techniques. DOMPurify [10.1007/978-3-319-66399-9_7] presents a robust XSS client-side filter. The authors argue that the DOM is the ideal place for sanitization to occur. While we agree with this view, their work relies on application developers to adopt their filter and modify their code to use it. Thus, we have partly automated this step by including it as our default sanitization function.

Jim et al. [Jim:2007:DSI:1242572.1242654] present a method to defend against injection attacks through Browser-Enforced Embedded Policies. This approach is similar to ours, as the policies specify prohibited script execution points. However, this again relies on application developers knowing where their code might be vulnerable. Furthermore, browser modifications are required to benefit from it. Similarly, Hallaraker and Vigna [Hallaraker:2005:DMJ:1078029.1078861] use a policy language to detect malicious code on the client-side. Like XSnare, they make use of signatures to protect against known types of exploits. However, unlike our approach, their signatures are not application-specific, and there is no model for signature maintenance. Furthermore, there is no evaluation on the efficacy of their signatures.

Snyder et al. [Snyder:2017:MWD:3133956.3133966] report a study in which they disable several JavaScript APIs and test the number of websites that do not work without the full functionality of the APIs. This approach increases security, since vulnerabilities are present in several JavaScript APIs; however, we believe disabling API functionality should only be used as a last resort.

Similarly to server-side defences, taint-tracking has been applied at the client-side: DexterJS provides a robust, browser-independent platform for auto-patching DOM-based XSS [10.1145/2786805.2786821, 10.1145/2786805.2803191]. While this approach effectively defends against a large number of attacks automatically, it only covers a subset of possible XSS attacks. This applies to any client-side defence that is unaware of an application’s server-side code.

Browser built-in defences. Browsers are equipped with several built-in defences. We previously described XSS Auditor in Section 1; another important one is the Content Security Policy (CSP) [CSP]. It has been widely adopted and in many cases provides developers with a reliable way to protect against XSS and CSRF attacks. However, CSP requires the developer to identify which scripts might be malicious.

Client and server hybrids. XSS-Dec [Sundareswaran:2012:XHS:2352970.2352994] uses a proxy which keeps track of an encrypted version of the server’s source files, and applies this information to derive exploits in a page visited by the user. This approach is similar to ours, since we assume previous knowledge of the clean HTML document. Furthermore, they use anomaly-based and signature-based detection to prevent attacks. However, there is no mention of signature maintenance. In a way, our system offloads all this functionality to the client-side, without the need for any server-side information.

9 Conclusion

Users cannot depend on administrators to patch vulnerable server-side software or on developers to adopt best practices to mitigate XSS vulnerabilities. Instead, users should protect themselves with a client-side solution. In this paper we described the design, implementation, and evaluation of XSnare, one such client-side approach. XSnare prevents XSS exploits by using a database of exploit signatures and by using a novel mechanism to detect XSS exploits in a browser extension. We evaluated XSnare through a study of 81 CVEs in which we showed that it defends against 94.2% of the exploits.


Appendix A Performance evaluation

In this section we report additional performance measurements for XSnare.

Methodology. We recorded timestamps while our code is executing using the Performance Web API. Note that while this API normally reports values as doubles, due to recent security threats, such as Spectre [DBLP:journals/corr/abs-1801-01203], several browser developers have implemented countermeasures by reducing the precision of the DOMHighResTimeStamp resolution [reducetimeprecision, resolutionconsiderations]. In particular, Firefox reports these values as integer milliseconds. For our tests, we re-enabled higher precision values.

While our extension’s functionality only applies at the network level, there is potential slowdown at the DOM processing level due to the optimization techniques the browser applies throughout several levels of the web page load pipeline. Figure 6 shows the different timestamps provided by the Navigation Timing API [navigationtiming], as well as a high-level description of the browser processing model. Since our filter listens on the onBeforeRequest event, none of the previous steps before Request are affected. In this section, we refer to the difference in time between responseEnd and requestStart as the "network filter time".

Figure 6: The Navigation Timing API’s timestamps (image taken from the W3C spec).

A.1 Top websites load times, continued

Figure 7: Scatter plot of network filter time as a function of character length for top sites.

Figure 7 shows the time spent by the call to our string verification function in the network filter as a function of the length of the string to be verified. The blue dots are the pages for which our framework probes tested negative, and the green triangles are the pages for which the probes tested positive: 55 in total. We applied least squares regression to calculate the shown trend lines. The Spearman’s rank correlation values for no probe, probe, and overall are 0.91, 0.91, and 0.72 respectively, demonstrating positive correlation. (The Spearman’s correlation coefficient measures the strength and direction of association between two ranked variables.) Since both our probes and signatures use regex matching, we expect both trend lines to be linear, as seen in the graph. Recall that once a probe for a certain software passes, we perform a linear scan over the signatures for that specific software and check whether it applies to the given HTML string or not. Thus, we expect the slope of the line to be higher when a probe passes. Around 37.4% of all web sites use frameworks covered by our probes [w3stats]; thus, we expect the impact of our network filter to be closer to the non-probe values, as corroborated by our overall trend line.
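For readers unfamiliar with the statistic, the following sketch shows one standard way to compute Spearman's rank correlation: rank both variables (ties receive their average rank) and take the Pearson correlation of the ranks. This is a generic textbook formulation with hypothetical function names, not the analysis script used for the figures.

```typescript
// Assign 1-based ranks; tied values receive the average of the ranks they span.
function ranks(values: number[]): number[] {
  const order = values.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const result = new Array<number>(values.length);
  let k = 0;
  while (k < order.length) {
    let j = k;
    while (j + 1 < order.length && order[j + 1].v === order[k].v) j++;
    const avgRank = (k + j) / 2 + 1;
    for (let t = k; t <= j; t++) result[order[t].i] = avgRank;
    k = j + 1;
  }
  return result;
}

function pearson(x: number[], y: number[]): number {
  const n = x.length;
  const mx = x.reduce((a, b) => a + b, 0) / n;
  const my = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - mx) * (y[i] - my);
    sxx += (x[i] - mx) ** 2;
    syy += (y[i] - my) ** 2;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Spearman's rho: Pearson correlation of the two rank vectors.
function spearman(x: number[], y: number[]): number {
  if (x.length !== y.length || x.length < 2) throw new Error("need two equal-length samples");
  return pearson(ranks(x), ranks(y));
}
```

Here x would hold the string lengths and y the verification times; a value close to 1 indicates that longer strings consistently take longer to verify, without assuming the relationship is exactly linear.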

False positives on the Web. Additionally, for each website, we recorded the number of loaded signatures. We report a 0% FP rate for loaded signatures. Thus, we can infer with confidence that the rate of false positives for loaded signatures during an average user’s web browsing is similarly low. This rate could possibly go up as the number of signatures and covered frameworks increases. It is likely that these websites are free of vulnerabilities covered by our signatures, as many of these websites are not running WordPress to begin with, and being the most popular, they would likely be updated quickly if a vulnerability is found; thus, the rate of false negatives is likely extremely low as well.

A.2 WordPress websites load times

We ran similar experiments as in Section 6.1, but with the WordPress sites described in Section 4.1. Thus, all of these have either one or multiple injection points in their HTML, and the network filter will spend an additional amount of time sanitizing these as defined by the signatures. Note that the data set is smaller here, and some of the trends might be harder to infer.

Figure 8 shows the results for slowdown with the extension running for these sites. Recall that the only difference between a page which passes the WordPress probe and one that matches a signature is that the latter has to replace a portion of the original string by its sanitized version. In this case we see a slowdown of less than 10% for 60% of sites, and less than 40% for 96.25% of them. The warm network filter curve suffers from a particularly high slowdown. We believe this to be the case because the locally hosted pages decrease the network component time, causing any overhead to be seen as relatively high. However, as 48% of the original values were below 60ms, we conclude that the impact on user experience is small here as well.

Figure 8: Cumulative distribution of relative percentage slowdown with extension installed for WordPress sites.

Finally, we report the string verification time as a function of its length, for the WordPress sites, shown in Figure 9. The Spearman’s rank correlation for this set of data is 0.630.

Figure 9: Scatter plot of network filter time as a function of character length for WordPress sites.

Appendix B Signature Language Specification

We provide a description of our signature language, in particular in the context of WordPress:

  • url: If the exploit occurs in a specific URL or subdomain, this is defined as a string, e.g. /wp-admin/options-general.php?page=relevanssi%2Frelevanssi.php, otherwise null.

  • software: The software framework the page is running if any, e.g. WordPress. A hand-crafted page might not have any identifiable software.

  • softwareDetails: If running any software, this provides further information about when to load a signature. For WordPress, these are plugin names as depicted in the HTML of a page running such plugin.

  • version: The version number of the software/plugin/page. This is used for versioning of the software run by the page, as described in Section 3.1.

  • type: A string describing the signature type. A value of "string" describes a basic signature. A value of ’listener’ describes a signature which requires an additional listener in the background page for network requests.

  • sanitizer: A string with one of the following values: "DOMPurify", "escape", and "regex". This item is optional, the default is DOMPurify.

  • config: The config parameters to go along with the chosen sanitizer, if necessary. For "DOMPurify", the accepted values are as defined by the DOMPurify API (i.e., DOMPurify.sanitize(dirty, config)). For "escape", an additional escaping pattern can be provided. For "regex", this should be the pattern to match with the injection point content.

  • typeDet: A string with the following pattern: ’occurrence-uniqueness’. The ’occurrence’ part has values single/multiple, which describes the existence of one or multiple independent injection points; the ’uniqueness’ part has values unique/several, specifying whether an injection point occurs once or several times throughout the document, as described in Section 2.4.

  • endPoints: An array of startpoint and endpoint tuples, specified as strings for regex matching.

  • endPointsPositions: An array of integer tuples. These are optional but useful when the HTML of one of the endPoints is used throughout the whole page and appears a fixed number of times. For example, if an injection ending point occurs on an element <h3 class=’my-header’>, this element might appear 10 times throughout the page, with only the 4th being an injection ending point. The signature would specify the second element of the tuple to be 7, as it would be the 7th such item in a regex match array (using 1-based indexing), counting from the bottom up. For ending points, we have to count from the bottom up because the attacker can inject arbitrarily many of these elements before it, and vice versa for starting points.
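The bottom-up counting used by endPointsPositions can be illustrated with a short sketch. The helper names are hypothetical; the example mirrors the <h3 class='my-header'> case above, where the 4th of 10 occurrences is the 7th when counted from the bottom.

```typescript
// Collect the start index of every match of `pattern` in `html`.
function matchIndices(html: string, pattern: string): number[] {
  const re = new RegExp(pattern, "g");
  const indices: number[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    indices.push(m.index);
    if (m[0].length === 0) re.lastIndex++; // guard against empty matches looping forever
  }
  return indices;
}

// Index of the ending point that is `positionFromBottom`-th (1-based) counting from the last match,
// or -1 if there are fewer matches than that.
function endPointFromBottom(html: string, pattern: string, positionFromBottom: number): number {
  const indices = matchIndices(html, pattern);
  if (positionFromBottom < 1 || positionFromBottom > indices.length) return -1;
  return indices[indices.length - positionFromBottom];
}
```

Because the count starts at the bottom, extra copies of the element injected before the real ending point do not change which element is selected.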

Additionally, if the value of type is ‘listener’, the signature will have an additional field called listenerData. Similarly to a regular signature, this consists of the following pieces of information:

  • listenerType: The type of network listener as defined by the WebRequest API (e.g. ‘script’, ‘XHR’, etc.)

  • listenerMethod: The request’s HTTP method, for example "GET" or "POST".

  • url: the URL of the request target.

If a listener is present, the signature’s fields can be used to specify the listener’s request injection points.
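The fields above could be captured in a type definition along the following lines. This is a sketch derived from the descriptions in this appendix, not the extension's actual schema: the interface names, the nullable fields, and the validation helper are assumptions.

```typescript
type Occurrence = "single" | "multiple";
type Uniqueness = "unique" | "several";

interface ListenerData {
  listenerType: string;   // e.g. "script" or "XHR", as defined by the WebRequest API
  listenerMethod: string; // e.g. "GET" or "POST"
  url: string;            // URL of the request target
}

interface Signature {
  url: string | null;
  software: string | null;
  softwareDetails: string | null;
  version: string | null;                        // blank when the version cannot be determined
  type: "string" | "listener";
  sanitizer?: "DOMPurify" | "escape" | "regex";  // optional; the default is DOMPurify
  config?: unknown;
  typeDet: `${Occurrence}-${Uniqueness}`;
  endPoints: [string, string][];                 // (start, end) regex strings
  endPointsPositions?: [number, number][];
  listenerData?: ListenerData;                   // expected when type is "listener"
}

function effectiveSanitizer(sig: Signature): "DOMPurify" | "escape" | "regex" {
  return sig.sanitizer ?? "DOMPurify";
}

// Hypothetical consistency checks a signature loader might perform.
function validateSignature(sig: Signature): string[] {
  const errors: string[] = [];
  if (sig.type === "listener" && sig.listenerData === undefined) {
    errors.push("listener signatures need listenerData");
  }
  if (sig.endPoints.length === 0) {
    errors.push("at least one endPoints tuple is required");
  }
  return errors;
}
```

Encoding typeDet as a template literal type lets the compiler reject misspelled values such as "single-uniqe" when signatures are written by hand.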