How ad blockers can be used for browser fingerprinting
In this article, we show how signals generated by the use of an ad blocker can improve browser fingerprinting accuracy. This novel browser fingerprinting method, while oft-discussed as a theoretical source of entropy, has only just been added to FingerprintJS as of April 2021, and has never been fully described until now. Ad blockers are an incredibly pervasive and useful piece of technology. Around 26% of Americans use an ad blocker today. If you are reading this article on ad blocker technology, you almost undoubtedly have one installed.
While ad blockers make the internet a more pleasant experience for many people, whether or not they protect your privacy in any meaningful way is up for debate. As ad blockers have access to the content of all pages that a browser loads and can reliably perform cross-site tracking, they are able to collect more information on a user’s browsing activity than most marketing trackers they block.
Perhaps more insidiously, the fact that a user is attempting to avoid being tracked online with an ad blocker can be used to identify them. Consider the example of tracking an individual in the woods by their shoe print. You may find success if you know their shoe’s size and ridge pattern, but it may be just as easy if you know that person habitually covers their tracks by raking a branch over their path. Whether you are looking for a shoe print or the absence of one, a signature pattern can be found.
Ad blockers leave a trace that can be harnessed by the websites you visit to identify you. By testing whether certain page elements are blocked, a site can find discrepancies in the filters used by your specific ad blocker(s). These discrepancies provide a source of entropy that, when combined with other unique signals, can identify a specific user over multiple visits. This combining of browser signals to create a unique identifier is known as browser fingerprinting.
While browser fingerprinting is a proven method of visitor identification (you can read more about how it works in our beginner’s guide), how ad blockers can be used for fingerprinting is rarely discussed. As the developers of the largest open source browser fingerprinting library, we only started including ad blocker signals in April 2021, so this work is hot off the press from our team. We hope shining a light on this cutting-edge technique will be useful to the open source community at large.
An ad blocker is a browser extension that prevents browsers from loading and displaying advertisements (including video ads), pop-ups, tracking pixels and other third-party scripts.
Ad blockers not only improve the online experience by hiding ads, but also protect browsing activity from being tracked by third-party scripts. All major online ad platforms (like Google and Facebook), as well as other marketing and product testing tools (like Crazy Egg and Hotjar), use tracking scripts to monitor and monetize user activity online. Privacy-conscious users often turn to ad blockers to stop their browsing history from being shared with these platforms.
However, ad blockers have access to the content of all pages that a browser loads. They have a lot more information about browsing activity than trackers, because trackers can’t do reliable cross-site tracking. Therefore, it is possible for ad blockers to violate user privacy.
Safari is an exception which we’ll discuss below.
In this section we’ll go fairly deep into the internals of ad blockers as it will help us build a better understanding of how ad blocking mechanics make it possible to reliably identify visitors.
Ad blockers typically run as extensions built on top of browser APIs:
- Google Chrome and other Chromium-based browsers: Extensions are JavaScript applications that run in a sandboxed environment with additional browser APIs available only to browser extensions. There are two ways ad blockers can block content: element hiding and resource blocking.
  - Element hiding is done either by injecting CSS code, or by using DOM APIs such as querySelectorAll or removeChild.
  - Resource blocking employs a different technique. Instead of rendering elements on a page and then hiding them, extensions block the resources on the browser networking level. To plug into browser networking, ad blockers either intercept requests as they happen or use declarative blocking rules defined beforehand. Request interception uses the webRequest API and is the most privacy-invasive technique. It works by reading every request that a browser makes and deciding on the fly whether it represents an ad and should be blocked. The declarative approach uses the declarativeNetRequest API to tell the browser in advance what needs to be blocked. This happens without reading actual requests, thus providing more privacy.
- Firefox: The extension API is almost the same as in Google Chrome. The only notable difference is the lack of the declarativeNetRequest API.
- Safari: Unlike Chrome or Firefox, Safari extensions are native applications. Safari provides a declarative API for ad blockers. Ad blockers create static lists of rules that describe what to block, and pass them to Safari. A list contains rules that tell Safari which network requests, HTML elements or cookies to block. The list content may also depend on user settings. Ad blockers have no way of accessing browsing history in Safari. You can watch a video by Apple with a detailed explanation.
Android browsers are a special case, in that they generally lack extension APIs. However, Google Play allows you to install ad-blocking apps that work in all browsers. These apps create a VPN on the system level and pass all the device traffic through it. The VPN connection acts as an ad blocker by adding JavaScript code or CSS styles to pages that hide unwanted content, or by blocking HTTP requests entirely.
Ad blockers prevent ads from being shown by looking for specific elements to block within the site’s contents. To identify these advertising elements, ad blockers use collections of rules called "filters" to decide what to block.
Usually these filters are maintained by the open source community. Like any other project, filters are created by different people for different needs. For example, French websites often use local ad systems that are not known worldwide and are not blocked by general ad filters, so developers in France will want to create a filter to block ads on French websites. Some filter maintainers can have privacy concerns and thus create filters that block trackers.
A filter is usually a text file that follows a common standard called "AdBlock Plus syntax". Each line of text contains a blocking rule, which tells an ad blocker which URLs or CSS selectors must be blocked. Each blocking rule can have additional parameters such as the domain name or the resource type.
[Figure: an example of a blocking rule]
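As a rough illustration of the syntax, the sketch below classifies a few lines written in AdBlock Plus syntax. The rules and the `classifyRule` helper are made up for this example, and only the most common rule shapes are handled; real filters use many more options.

```javascript
// Hypothetical helper: classifies one line of an AdBlock Plus style filter.
// It is an illustration only and does not cover the full syntax.
function classifyRule(line) {
  const rule = line.trim()
  // Comments start with "!", the optional file header looks like "[Adblock Plus 2.0]"
  if (rule === '' || rule.startsWith('!') || rule.startsWith('[')) {
    return { type: 'comment' }
  }
  // Exception and extended hiding rules ("#@#", "#?#") are out of scope here
  if (/#[@?]#/.test(rule)) {
    return { type: 'unsupported' }
  }
  // Element hiding rule: "<domains>##<CSS selector>"
  const hidingIndex = rule.indexOf('##')
  if (hidingIndex !== -1) {
    const domains = rule.slice(0, hidingIndex)
    return {
      type: 'elementHiding',
      domains: domains === '' ? [] : domains.split(','),
      selector: rule.slice(hidingIndex + 2),
    }
  }
  // Network rule: "<URL pattern>$<options>"; a leading "@@" marks an exception
  const isException = rule.startsWith('@@')
  const body = isException ? rule.slice(2) : rule
  const optionsIndex = body.lastIndexOf('$')
  return {
    type: isException ? 'networkException' : 'networkBlock',
    pattern: optionsIndex === -1 ? body : body.slice(0, optionsIndex),
    options: optionsIndex === -1 ? [] : body.slice(optionsIndex + 1).split(','),
  }
}

// Made-up rules: a comment, a URL rule with a resource type and a domain,
// a selector hidden on all websites, and a selector hidden on one website
const exampleRules = [
  '! Example filter',
  '||ads.example.com^$script,domain=example.org',
  '##.advertisement',
  'example.org##img[alt="Promo"]',
]
for (const rule of exampleRules) {
  console.log(rule, '->', classifyRule(rule))
}
```

The distinction that matters for the rest of this article is between element hiding rules, which carry a CSS selector, and network rules, which carry a URL pattern.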
The most common sets of filters used by AdBlock, AdGuard and other ad blockers include:
- EasyList: includes EasyList, EasyPrivacy, EasyList Cookie List, EasyList Germany and many others.
- AdGuard: includes a base filter, a mobile ads filter, a tracking protection filter, a social media filter and many others.
- Fanboy: includes Enhanced Trackers List, Anti-Facebook Filters, Annoyance List and several others.
Our goal is to get as much information from ad blockers as possible to generate a fingerprint.
A JS script running on a page can't tell directly if the browser has an ad blocker and, if it does, what is blocked by it. Instead, the script can try adding something to the page to see if it gets blocked. The addition can be an HTML element that matches a blocked CSS selector or an external resource such as a script or an image.
We recommend using CSS selectors over resources to detect ad blockers, as resource detection has two significant downsides. Firstly, detecting whether a resource is blocked requires trying to download the resource by making an HTTPS request and watching its state. This process slows down the web page by occupying the network bandwidth and CPU. Secondly, the HTTPS requests will appear in the browser developer tools, which may look suspicious to an observant site visitor. For these reasons, we will focus on using CSS selectors to collect data in this article.
We will now run through how to generate two related data sources using ad blocker signals: the list of blocked CSS selectors, and the list of filters. Finding the list of filters will result in a significantly more stable fingerprint, but requires additional work to identify unique CSS selectors to distinguish each filter from one another.
The process of detecting whether a CSS selector is blocked consists of the following steps:
- Parse the selector, i.e. get the tag name, CSS classes, id and attributes from it;
- Create an empty HTML element that matches that selector and insert the element into the document;
- Wait for the element to be hidden by an ad blocker, if one is installed;
- Check whether it's hidden. One way to do it is checking the element's offsetParent property (it's null when the element is hidden).
If you do the above steps for each selector, you'll face a performance issue, because there will be a lot of selectors to check. To avoid slowing down your web page, you should create all the HTML elements first and then check them to determine if they are hidden.
This approach can generate false positives when there are a lot of HTML elements added to the page. It happens because some CSS selectors apply only when an element has certain siblings. Such selectors contain a general sibling combinator (~) or an adjacent sibling combinator (+). They can lead to false element hiding and therefore false blocked selector detection results. This problem can be mitigated by inserting every element into an individual <div> container so that each element has no siblings. This solution may still fail occasionally, but it reduces the false positives significantly.
Here is example code that checks which selectors are blocked:
```javascript
async function getBlockedSelectors(allSelectors) {
  // A storage for the test elements
  const elements = new Array(allSelectors.length)
  const blockedSelectors = []

  try {
    // First create all elements that can be blocked
    for (let i = 0; i < allSelectors.length; ++i) {
      const container = document.createElement('div')
      const element = selectorToElement(allSelectors[i])
      elements[i] = element
      container.appendChild(element)
      document.body.appendChild(container)
    }

    // Then wait for the ad blocker to hide the element
    await new Promise(resolve => setTimeout(resolve, 10))

    // Then check which of the elements are blocked
    for (let i = 0; i < allSelectors.length; ++i) {
      if (!elements[i].offsetParent) {
        blockedSelectors.push(allSelectors[i])
      }
    }
  } finally {
    // Then remove the elements
    for (const element of elements) {
      if (element) {
        element.parentNode.remove()
      }
    }
  }

  return blockedSelectors
}

// Creates a DOM element that matches the given selector
function selectorToElement(selector) {
  // See the implementation at https://bit.ly/3yg1zhX
}

getBlockedSelectors(['.advertisement', 'img[alt="Promo"]'])
  .then(blockedSelectors => {
    console.log(blockedSelectors)
  })
```
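The `selectorToElement` helper is left to the linked implementation. To make the parsing step more concrete, here is a minimal sketch of how a simple selector could be split into a tag name, id, classes and attributes. `parseSimpleSelector` and `simpleSelectorToElement` are hypothetical names rather than the library's code, and the sketch assumes selectors without combinators or pseudo-classes.

```javascript
// Hypothetical helper: splits a simple CSS selector into its parts.
// Throws on anything it doesn't understand (combinators, pseudo-classes, etc.).
function parseSimpleSelector(selector) {
  const parsed = { tag: 'div', id: null, classes: [], attributes: {} }
  // Alternatives: tag name, #id, .class, [attr] or [attr<operator>"value"].
  // The sticky flag (y) forces each match to start exactly at lastIndex.
  const part = /([a-zA-Z][\w-]*)|#([\w-]+)|\.([\w-]+)|\[([\w-]+)(?:[~|^$*]?=(?:"([^"]*)"|'([^']*)'|([^\]]*)))?\]/y
  let position = 0
  while (position < selector.length) {
    part.lastIndex = position
    const match = part.exec(selector)
    if (!match) {
      throw new Error(`Unsupported selector: ${selector}`)
    }
    const [, tag, id, className, attrName, doubleQuoted, singleQuoted, bare] = match
    if (tag !== undefined) {
      // A tag name is only valid at the very beginning of the selector
      if (position !== 0) {
        throw new Error(`Unsupported selector: ${selector}`)
      }
      parsed.tag = tag
    } else if (id !== undefined) {
      parsed.id = id
    } else if (className !== undefined) {
      parsed.classes.push(className)
    } else {
      // An attribute equal to the value satisfies =, ~=, |=, ^=, $= and *= alike
      parsed.attributes[attrName] = doubleQuoted ?? singleQuoted ?? bare ?? ''
    }
    position = part.lastIndex
  }
  return parsed
}

// Builds a DOM element from the parsed parts (browser only, not called here)
function simpleSelectorToElement(selector) {
  const { tag, id, classes, attributes } = parseSimpleSelector(selector)
  const element = document.createElement(tag)
  if (id) element.id = id
  if (classes.length > 0) element.className = classes.join(' ')
  for (const [name, value] of Object.entries(attributes)) {
    element.setAttribute(name, value)
  }
  return element
}
```

Throwing on unsupported input is a deliberate choice in this sketch: silently building a wrong element would report a selector as not blocked when it was never actually tested.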
To determine which CSS selectors to check, you can download some of the most popular filters and extract the CSS selectors that are blocked on all websites. The rules for such selectors start with ##.
Your chosen selectors should contain no <embed>, no fixed positioning, no pseudo classes and no combinators. The offsetParent check will not work with either <embed> or fixed positioning. Selectors with combinators require a sophisticated script for building test HTML elements, and since there are only a few selectors with combinators, it isn't worth writing such a script. Finally, you should test only unique selectors across all the filters to avoid duplicate work. You can see a script that we use to parse the unique selectors from the filters here.
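The selection rules above can be sketched as a small script. This is not the script we use; `extractTestableSelectors` and `isTestable` are hypothetical names, the filter contents are made up, and treating any selector that mentions `position: fixed` as fixed positioning is an assumption of this sketch.

```javascript
// Hypothetical helper: collects the selectors blocked on all websites
// (rules starting with "##") from several filter texts, without duplicates.
function extractTestableSelectors(filterTexts) {
  const selectors = new Set() // a Set drops duplicates across filters
  for (const text of filterTexts) {
    for (const line of text.split(/\r?\n/)) {
      const rule = line.trim()
      // Rules with a domain before "##" apply to specific websites only
      if (!rule.startsWith('##')) continue
      const selector = rule.slice(2)
      if (isTestable(selector)) selectors.add(selector)
    }
  }
  return [...selectors]
}

function isTestable(selector) {
  // Assumption: a selector mentioning "position: fixed" targets fixed positioning
  if (/position\s*:\s*fixed/i.test(selector)) return false
  // Drop quoted strings, then attribute brackets, so that their content
  // isn't mistaken for combinators or pseudo-classes
  const bare = selector
    .replace(/"[^"]*"|'[^']*'/g, '')
    .replace(/\[[^\]]*\]/g, '[]')
  if (/^embed\b/i.test(bare)) return false // <embed> elements
  if (bare.includes(':')) return false // pseudo-classes and pseudo-elements
  if (/[\s>+~,]/.test(bare)) return false // combinators and selector lists
  return bare.length > 0
}

// Made-up filter contents for illustration
const filterA = [
  '! Example filter A',
  '##.advertisement',
  'example.org##.local-ad',
  '##.ad > .inner',
  '##embed[src*="ad"]',
  '##.promo:not(.x)',
].join('\n')
const filterB = [
  '##.advertisement',
  '##img[alt="Promo"]',
  '##[style*="position: fixed"]',
].join('\n')

console.log(extractTestableSelectors([filterA, filterB]))
```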
You can see some of the selectors blocked by your browser in the interactive demo on our blog.
A better way to get identification entropy from ad blockers is detecting which filters an ad blocker uses. This is done by identifying unique CSS selectors for each filter, so that if a unique selector is blocked, you can be sure a visitor is using that filter.
The process consists of the following steps:
- Identify which selectors are blocked by each filter. This step will be done once as a preparation step.
- Get unique selectors by filter. This step will also be done once as a preparation step.
- Check whether each unique selector is blocked. This step will run in the browser every time you need to identify a visitor.
These three steps are explained in more detail below.
To get the selectors blocked by a filter we can’t just read them from the filter file. This approach will not work in practice because ad blockers can hide elements differently from filter rules. So, to get a true list of CSS selectors blocked by a filter, we need to use a real ad blocker.
The process of detecting which selectors a filter really blocks is described next:
- Make an HTML page that checks every selector from the filters you want to detect. The page should use the process described in the previous section (detecting the list of blocked CSS selectors). You can use a Node.js script that makes such an HTML page. This step will be done once as a preparation step.
- Go to the ad blocker settings and enable only the filter we’re testing;
- Go to the HTML page and reload it;
- Save the list of blocked selectors to a new file.
Repeat the steps for each of the filters. You will get a collection of files (one for each filter).
Some filters contain no selectors, so we won’t be able to detect them.
Now that you have the selectors that are really blocked by each of the filters, you can narrow them down to the unique ones. A unique selector is a selector that is blocked by only one filter. We created a script that extracts unique selectors. The script output is a JSON file that contains unique blocked selectors for each of the filters.
Unfortunately, some of the filters have no unique blocked selectors because they are fully included in other filters. That is, all their rules are also present in other filters, so none of these rules is unique.
You can see how we handle such filters in our GitHub repository.
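To illustrate the definition of a unique selector, here is a minimal sketch of the extraction step. It is not our actual script; `getUniqueSelectors` is a hypothetical name, and the input is assumed to be an object that maps each filter name to the selectors that were blocked when only that filter was enabled.

```javascript
// Hypothetical helper: keeps, for every filter, only the selectors
// that no other filter blocks.
function getUniqueSelectors(blockedByFilter) {
  // Count how many filters block each selector
  const filterCount = new Map()
  for (const selectors of Object.values(blockedByFilter)) {
    // new Set(...) guards against a selector listed twice in one filter
    for (const selector of new Set(selectors)) {
      filterCount.set(selector, (filterCount.get(selector) || 0) + 1)
    }
  }

  const unique = {}
  for (const [filterName, selectors] of Object.entries(blockedByFilter)) {
    unique[filterName] = [...new Set(selectors)]
      .filter(selector => filterCount.get(selector) === 1)
  }
  return unique
}

// Made-up data: filterC is fully included in the other two filters
const exampleBlockedByFilter = {
  filterA: ['.a-only', '.shared'],
  filterB: ['.shared', '.b-only'],
  filterC: ['.shared'],
}
console.log(getUniqueSelectors(exampleBlockedByFilter))
```

In this made-up data, `filterC` ends up with an empty list, which is the situation described above: a filter whose rules all appear in other filters cannot be told apart by unique selectors alone.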
This part will run in the browser. In a perfect world we would only need to check whether a single selector from each of the filters is blocked. When a unique selector is blocked, you can be sure that the person uses the filter. Likewise, if a unique selector isn't blocked, you can be sure the person doesn't use the filter.
```javascript
const uniqueSelectorsOfFilters = {
  easyList: '[lazy-ad="leftthin_banner"]',
  fanboyAnnoyances: '#feedback-tab'
}

async function getActiveFilters(uniqueSelectors) {
  const selectorArray = Object.values(uniqueSelectors)

  // See the snippet above
  const blockedSelectors = new Set(
    await getBlockedSelectors(selectorArray)
  )

  return Object.keys(uniqueSelectors)
    .filter(filterName => {
      const selector = uniqueSelectors[filterName]
      return blockedSelectors.has(selector)
    })
}

getActiveFilters(uniqueSelectorsOfFilters)
  .then(activeFilters => {
    console.log(activeFilters)
  })
```
In practice, the result may sometimes be incorrect because of wrong detection of blocked selectors. It can happen for several reasons: ad blockers can update their filters, they can experience glitches, or page CSS can interfere with the process.
To mitigate the impact of unexpected behavior, we can use fuzzy logic. For example, if more than 50% of the unique selectors associated with one filter are blocked, we will assume the filter is enabled. Here is example code that checks which of the given filters are enabled using this fuzzy logic:
```javascript
const uniqueSelectorsOfFilters = {
  easyList: ['[lazy-ad="leftthin_banner"]', '#ad_300x250_2'],
  fanboyAnnoyances: ['#feedback-tab', '#taboola-below-article']
}

async function getActiveFilters(uniqueSelectors) {
  // Collect all the selectors into a plain array
  const allSelectors = [].concat(
    ...Object.values(uniqueSelectors)
  )

  const blockedSelectors = new Set(
    await getBlockedSelectors(allSelectors)
  )

  return Object.keys(uniqueSelectors)
    .filter(filterName => {
      const selectors = uniqueSelectors[filterName]
      let blockedSelectorCount = 0
      for (const selector of selectors) {
        if (blockedSelectors.has(selector)) {
          ++blockedSelectorCount
        }
      }
      return blockedSelectorCount > selectors.length * 0.5
    })
}

getActiveFilters(uniqueSelectorsOfFilters)
  .then(activeFilters => {
    console.log(activeFilters)
  })
```
Once you collect enough data, you can generate a visitor fingerprint.
There are dozens of ad blockers available, for example AdBlock, uBlock Origin, AdGuard and 1Blocker X. These ad blockers use different sets of filters by default. Users can also customize ad blocking extensions by removing default filters and adding custom filters. This diversity provides entropy that can be used to generate fingerprints and identify visitors.
[Figure: an example of ad blocker customization]
A good browser fingerprint should stay the same when a user goes from regular to incognito (private) mode of the browser. Thus, ad blockers can provide a useful source of entropy only for browsers and operating systems where ad blockers are enabled by default in incognito mode:
- Safari on macOS, iOS, iPadOS: browser extensions (including ad blockers) are enabled in both regular and incognito mode.
- All browsers on Android: ad blockers work on the system level, so they affect all browser modes.
In desktop Chrome and Firefox, extensions are disabled by default in incognito mode. Users can, however, manually choose to keep extensions enabled in incognito mode, but few people do so. Since we cannot know if a user has an ad blocker enabled in incognito mode, it makes sense to identify visitors by their ad blockers only in Safari and on Android.
You can make a fingerprint solely from the information that we’ve gotten from the visitor's ad blocker either by using the list of blocked selectors, or the list of filters from the sections above.
To make a fingerprint using selectors only, we take a list of selectors, check which of them are blocked and hash the result:
```javascript
// See the snippet above
getBlockedSelectors(...)
  .then(blockedSelectors => {
    // See the murmurHash3 implementation at
    // https://github.com/karanlyons/murmurHash3.js
    const fingerprint = murmurHash3.x86.hash128(
      JSON.stringify(blockedSelectors)
    )

    console.log(fingerprint)
  })
```
This fingerprint is very sensitive but not stable. The CSS code of the page can accidentally hide a test HTML element and thus change the result. Also, as the community updates the filters quite often, every small update can add or remove a CSS selector rule, which will change the whole fingerprint. So, a fingerprint based on selectors alone can only be used for short-term identification.
To mitigate the instability of CSS selectors alone, you can use the list of filters instead to generate a fingerprint. The list of filters that a person uses is only likely to change if they switch ad blockers, or if their installed ad blocker undergoes a significant update. To make a fingerprint, get the list of enabled filters and hash it:
```javascript
// See the snippet above
getActiveFilters(...).then(activeFilters => {
  // See the murmurHash3 implementation at
  // https://github.com/karanlyons/murmurHash3.js
  const fingerprint = murmurHash3.x86.hash128(
    JSON.stringify(activeFilters)
  )

  console.log(fingerprint)
})
```
As we mentioned above, the filter lists themselves are updated frequently, and these updates can change the fingerprint. The fuzzy algorithm mitigates this problem, but the underlying selectors will need to be updated eventually. So, you will need to repeat the process of collecting unique selectors from time to time to refresh the data and keep the fingerprinting accuracy high.
The main thread is where the browser processes user events and paints. By default, browsers use a single thread to run all the JavaScript in the page, and to perform layout, reflows, and garbage collection. This means that long-running JavaScript can block the thread, leading to an unresponsive page and a bad user experience.
The process of checking CSS selectors runs on the main thread. The algorithm uses many DOM operations, such as createElement and offsetParent. These operations can run only on the main thread and can't be moved to a worker. So, it's important for the algorithm to run fast.
We've measured the time it takes several old devices to check different numbers of CSS selectors per filter. We tested only in the browsers where it makes sense to identify visitors by ad blockers. The tests were conducted in cold browsers on a complex page (about 500 KB of uncompressed CSS code). The results:
Selectors checked | MacBook Pro 2015 (Core i7), macOS 11, Safari 14 | iPhone SE1, iOS 13, Safari 13 | Pixel 2, Android 9, Chrome 89 |
---|---|---|---|
1 selector per filter (45 in total) | 3.1ms | 10ms | 5.7ms |
At most 5 selectors per filter (210 in total) | 9ms | 27ms | 17ms |
At most 10 selectors per filter (401 in total) | 20ms | 20ms | 36ms |
All selectors (23029 in total) | ≈7000ms | ≈19000ms | ≈2600ms |
The more CSS selectors the algorithm checks, the more accurate the result will be. But a large number of CSS selectors increases the execution time and the code size. We have chosen to check 5 selectors per filter as a good balance between performance, stability and the code size.
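Capping the number of selectors per filter is a small step. The sketch below assumes the unique selectors are stored as an object of arrays, as in the fuzzy logic snippet; `limitSelectorsPerFilter` is a hypothetical name, and keeping the first selectors of each list is an arbitrary choice made for this example.

```javascript
// Hypothetical helper: keeps at most maxPerFilter selectors for every filter.
// Filters with fewer selectors are left as they are.
function limitSelectorsPerFilter(uniqueSelectors, maxPerFilter = 5) {
  const limited = {}
  for (const [filterName, selectors] of Object.entries(uniqueSelectors)) {
    // slice() copies, so the input object is not modified
    limited[filterName] = selectors.slice(0, maxPerFilter)
  }
  return limited
}

// Made-up data for illustration
const manySelectors = {
  filterA: ['.a1', '.a2', '.a3', '.a4', '.a5', '.a6', '.a7'],
  filterB: ['.b1', '.b2'],
}
console.log(limitSelectorsPerFilter(manySelectors))
```

With 5 selectors per filter and the "more than 50%" rule, a filter is reported as enabled when at least 3 of its selectors are blocked.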
You can see a complete implementation of the described algorithm in our GitHub repository.
Brave is a browser based on Chromium. It disables extensions in incognito mode by default. Thus, we don't perform ad blocker fingerprinting in Brave.
Desktop Tor has no separate incognito mode, so every extension works in all Tor tabs. Ad blockers can therefore be used to identify Tor users. However, the Tor authors strongly recommend against installing any custom extensions, and it's not easy to do so. Very few people will install ad blockers in Tor, so the effectiveness of ad blocker fingerprinting there is low.
Ad blocker fingerprinting is one of the many signals our open source library uses to generate a browser fingerprint. However, we do not blindly incorporate every signal available in the browser. Instead we analyze the stability and uniqueness of each signal separately to determine their impact on fingerprint accuracy.
Ad blocker detection is a new signal and we’re still evaluating its properties.
You can learn more about stability, uniqueness and accuracy in our beginner’s guide to browser fingerprinting.
Browser fingerprinting is a useful method of visitor identification for a variety of anti-fraud applications. It is particularly useful to identify malicious visitors attempting to circumvent tracking by clearing cookies, browsing in incognito mode or using a VPN.
You can try implementing browser fingerprinting yourself with our open source library. FingerprintJS is the most popular browser fingerprinting library available, with over 14K GitHub stars.
For higher identification accuracy, we also developed the FingerprintJS Pro API, which uses machine learning to combine browser fingerprinting with additional identification techniques. You can use FingerprintJS Pro for free with up to 20k API calls per month.