Ads.txt best practices to stay protected against ad fraud

Ads.txt, short for “Authorized Digital Sellers”, is an initiative launched by the IAB Tech Lab that allows publishers to publicly declare the ad partners that are allowed to sell their inventory.

Launched in May 2017, ads.txt was designed to clean up the online advertising supply chain by helping brands and marketers purchase inventory with confidence, while making it harder for fraudsters to profit from selling publisher inventory that doesn’t belong to them.

The initial adoption of ads.txt was slow, but thanks to a push from Google, it took off. According to First Impression’s industry ads.txt tracker, 45% of Alexa 1000 websites are now using the spec. That’s pretty high considering that not all sites in the Alexa 1000 list are publisher sites.

In the last few years, a handful of ad fraud schemes and bot networks have been exposed that flourished in spite of ads.txt, in some cases because fraudsters exploited vulnerabilities in how the spec works. However, in almost all those cases, the publishers most affected were those who did not maintain their ads.txt file properly.

In this post, we’ll do a quick recap of the ads.txt spec, how it works, how fraudsters exploit its vulnerabilities, and some best practices that will help publishers stay protected.

Contents

1 What problems does ads.txt solve?

1.1 Inventory arbitrage

1.2 Domain spoofing

2 How does ads.txt work?

3 How fraudsters exploit ads.txt

4 Ads.txt best practices for publishers

4.1 Focus on direct partnerships

4.2 Keep your ads.txt file short

4.3 Validate the file to keep it error-free

4.4 Don’t rely on just ads.txt

What problems does ads.txt solve?

Inventory arbitrage

Inventory arbitrage is the practice of buying inventory, then repackaging and selling it at a profit. Those who engage in arbitrage essentially act as the middlemen of ad tech: they drive the price up for advertisers and hurt publishers’ reputations by misrepresenting them in the open market.

While arbitrage is a shady practice that affects both advertisers and publishers, it is not illegal.

Domain spoofing

Domain spoofing, on the other hand, is a type of ad fraud commonly perpetrated in two ways.

The first is malware injection, where a malicious script inserts ads on web pages where they don’t belong, including sites that don’t carry any ads. The second is ad tag manipulation, where fraudsters modify the ad tags provided by ad exchanges to fool advertisers into thinking that they are buying premium inventory, while their ads are actually delivered on an unrelated, low-quality website.

How does ads.txt work?

To get started, publishers host a plain text file titled “ads.txt” in the root directory of their domain.

Each line in the text file contains three comma-separated fields: the domain name of the ad exchange, the publisher’s seller ID on that exchange, and whether the publisher's relationship with that account is direct (“DIRECT”) or through a reseller (“RESELLER”). Sometimes there’s a fourth, optional field: the ID that identifies the ad exchange with a certification authority such as the Trustworthy Accountability Group (TAG).

Here are a few lines from the ads.txt file hosted by the New York Times:

appnexus.com, 3661, DIRECT
google.com, pub-4177862836555934, DIRECT
indexexchange.com, 184733, DIRECT

Since the file is uploaded and maintained by publishers on their own domain, it’s not easy for bad players to gain access to it or change entries. Buyers who want to bid on the publisher’s inventory can refer to their ads.txt file and confidently know that the exchange they are dealing with is in fact authorized to directly or indirectly sell the publisher’s inventory.
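To make the buyer-side check concrete, here is a minimal sketch in Python of how a buying platform might parse those lines and look up a seller. The names `AdsTxtRecord`, `parse_ads_txt`, and `is_authorized` are illustrative assumptions and not part of the spec; the sample data is the excerpt quoted above.

```python
from typing import NamedTuple, Optional


class AdsTxtRecord(NamedTuple):
    ad_system: str            # domain of the exchange or SSP
    seller_id: str            # publisher's account ID on that system
    relationship: str         # "DIRECT" or "RESELLER"
    cert_id: Optional[str]    # optional certification authority (TAG) ID


def parse_ads_txt(text: str) -> list[AdsTxtRecord]:
    """Parse data records, skipping comments, blanks, and short lines."""
    records = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3:
            continue                          # not a data record
        cert_id = fields[3] if len(fields) > 3 and fields[3] else None
        records.append(
            AdsTxtRecord(fields[0].lower(), fields[1], fields[2].upper(), cert_id)
        )
    return records


def is_authorized(records, ad_system, seller_id):
    """Return the declared relationship, or None if the seller is not listed."""
    for r in records:
        if r.ad_system == ad_system.lower() and r.seller_id == seller_id:
            return r.relationship
    return None


SAMPLE = """\
# excerpt quoted in the article
appnexus.com, 3661, DIRECT
google.com, pub-4177862836555934, DIRECT
indexexchange.com, 184733, DIRECT
"""

records = parse_ads_txt(SAMPLE)
print(is_authorized(records, "google.com", "pub-4177862836555934"))  # DIRECT
print(is_authorized(records, "google.com", "pub-0000000000000000"))  # None
```

A bid request that claims this publisher's domain but carries a seller ID missing from the file would return `None` here, which is the signal a buyer can use to decline the bid.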

How fraudsters exploit ads.txt

Ads.txt was never designed to combat the entire spectrum of ad fraud. And even within the narrow vector it addresses, there have been instances where the standard has been misused, either through improper configuration by publishers or through intentional efforts by fraudsters.

In 2019, ad fraud detection vendor DoubleVerify broke news about a new bot network that was being used to sell fake inventory.

The bot started by scraping the content of a legitimate publisher, then manipulated the environment to make it seem like the browser was visiting the original website. Finally, it sold counterfeit inventory, using falsified URLs, through one of the resellers listed in the original publisher’s ads.txt file.

“This scheme was specifically designed to take advantage of the industry-wide ads.txt initiative and commit fraud that would not trigger ads.txt violations with programmatic buyers,” said Roy Rosenfeld, Head of DoubleVerify’s Fraud Lab.

Then last year, Integral Ad Science exposed another fraud dubbed the 404bot, which exploited a flaw in the spec and cost advertisers $15 million in wasted ad spend.

Similar to the ad fraud operation exposed by DoubleVerify, the 404bot also used domain spoofing to generate counterfeit inventory, but with one added twist: the falsified URLs passed in the bid requests led to pages that didn’t even exist. Hence the name “404”.

IAS found that the affected publishers had one thing in common. “Their ads.txt files were huge,” said Evgeny Shmelkov, head of the IAS Threat Lab. “There were lots of parties freely trusted.”

None of this is to say that the spec does not work. Before ads.txt was widely adopted by the industry, ad fraud schemes such as Hyphbot and Methbot, both unprecedented in size and scope, took advantage of the same lack of authorization that ads.txt now addresses.

Even the more recent inventory fraud operations have relied on the publisher having too many resellers and untrustworthy entities in their ads.txt files. This brings us to the next part: What are some ads.txt best practices that publishers can use to safeguard themselves?

Ads.txt best practices for publishers

Focus on direct partnerships

Given how some recent bot networks operate, having too many resellers in your ads.txt file can make you vulnerable to ad fraud, as those resellers may not have very stringent requirements for who joins their network. Apart from that, a short list of trusted partners is also a signal of quality to buyers.

Paul Bannister, Chief Strategy Officer of CafeMedia, posed this question in a recent AdExchanger column: “When a buyer looks at your ads.txt file (either directly, or via buying platforms), do you want them seeing that you work with a curated group of premium partners? Or a flea market of junky companies that no one has heard of selling goods that no buyer wants?”

Keep your ads.txt file short

IAS reported that exposure to the 404bot was strongly correlated with the size of a publisher's ads.txt file.

If that’s not problematic enough, the IAB Tech Lab says that giant ads.txt files are sometimes not fully scanned by DSPs and that, in the worst cases, the DSP will ignore the file entirely and disallow all programmatic channels for that domain.

Any time you add a new ad revenue partner, they will ask you to add an entry to the ads.txt file. Sometimes, the new partner will ask you to add multiple direct and/or reseller entries.

At this point, you should demand transparency and ask your partner for the purpose of each additional entry before you add it to the file.
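As a starting point for that kind of review, here is a rough sketch in Python that summarizes how many direct and reseller entries a file contains and which ad systems contribute the most reseller lines. The function name `summarize_ads_txt` and the `.example` domains are hypothetical, and no particular reseller share is recommended by the spec; the numbers are only meant to prompt questions for your partners.

```python
from collections import Counter


def summarize_ads_txt(text: str) -> dict:
    """Count DIRECT and RESELLER entries and reseller lines per ad system."""
    relationships = Counter()
    resellers_by_system = Counter()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3:
            continue                          # blank, comment, or variable line
        system, relationship = fields[0].lower(), fields[2].upper()
        relationships[relationship] += 1
        if relationship == "RESELLER":
            resellers_by_system[system] += 1
    total = sum(relationships.values())
    return {
        "total": total,
        "direct": relationships["DIRECT"],
        "reseller": relationships["RESELLER"],
        "reseller_share": relationships["RESELLER"] / total if total else 0.0,
        "resellers_by_system": dict(resellers_by_system),
    }


# Hypothetical file contents, for illustration only.
SAMPLE = """\
google.com, pub-1111, DIRECT
exchange-a.example, 1001, DIRECT
exchange-b.example, 2001, RESELLER
exchange-b.example, 2002, RESELLER
exchange-c.example, 3001, RESELLER
"""

summary = summarize_ads_txt(SAMPLE)
print(summary["direct"], summary["reseller"], summary["reseller_share"])
```

Running a summary like this before and after onboarding a new partner makes it easy to see exactly how many entries that partner added.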

Validate the file to keep it error-free

Once you’ve reviewed your ads.txt file and cut it short by removing inactive partners and doing a quality audit, you’ll want to make sure that there are no syntax errors in the file.

DSPs will not be able to fully scan your ads.txt file if it has validation errors. The result? They will avoid your inventory and your ad revenue will plummet.

Google Ad Manager has a built-in validator that you can use to check your ads.txt file. You’ll easily find third-party validators as well. While a validator will ensure that your ads.txt file is error-free, it cannot audit or update your file for ongoing changes. Those decisions have to be manual.
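To give a sense of what such a validator looks for, here is a simplified sketch in Python. It is not how Google Ad Manager or any third-party tool works internally; the function name `validate_ads_txt` and the specific checks (field count, domain shape, relationship value, duplicates) are assumptions chosen for illustration, and a real validator covers more of the spec.

```python
import re

VALID_RELATIONSHIPS = {"DIRECT", "RESELLER"}
DOMAIN_RE = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$", re.IGNORECASE)


def validate_ads_txt(text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    seen = set()
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        if "," not in line and "=" in line:
            continue  # variable line such as contact=...; not checked here
        # Ignore extension data after ";" and split the record on commas.
        fields = [f.strip() for f in line.split(";", 1)[0].split(",")]
        if len(fields) not in (3, 4):
            problems.append(
                f"line {number}: expected 3 or 4 fields, found {len(fields)}"
            )
            continue
        system, seller_id, relationship = fields[0], fields[1], fields[2]
        if not DOMAIN_RE.match(system):
            problems.append(f"line {number}: '{system}' does not look like a domain")
        if not seller_id:
            problems.append(f"line {number}: empty seller ID")
        if relationship.upper() not in VALID_RELATIONSHIPS:
            problems.append(f"line {number}: unknown relationship '{relationship}'")
        key = (system.lower(), seller_id, relationship.upper())
        if key in seen:
            problems.append(f"line {number}: duplicate entry")
        seen.add(key)
    return problems


# Sample with three deliberate mistakes on lines 2, 3, and 4.
SAMPLE = """\
google.com, pub-4177862836555934, DIRECT
indexexchange.com, 184733
appnexus.com, 3661, DIRCT
google.com, pub-4177862836555934, DIRECT
"""

problems = validate_ads_txt(SAMPLE)
for problem in problems:
    print(problem)
```

A check like this can run whenever the file changes, so a typo in a partner's entry is caught before a DSP crawls it.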

Each entry in the file should provide incremental revenue. If it doesn’t, you should remove it. Case in point: Chegg cut all the resellers from its ads.txt file and saw no drop in revenue.

Don’t rely on just ads.txt

Ads.txt is not intended to be, and cannot be, a magic pill for ad fraud.

The IAB recommends using Sellers.json, a companion spec released in 2019 and intended to be used alongside ads.txt. Sellers.json is hosted by ad exchanges and SSPs. It lists the sellers that work with them, their information (name, domain), and whether the relationship is direct or indirect.

By using ads.txt in conjunction with Sellers.json, buyers can cross-reference a publisher’s relationship with their ad exchange or SSP—which adds an extra layer of security to the process.
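The cross-reference can be sketched in Python as follows. The sellers.json document below is hypothetical sample data that uses the spec's `sellers`, `seller_id`, `domain`, and `seller_type` fields. The function name `cross_check` is illustrative, and the mapping of DIRECT to `PUBLISHER` and RESELLER to `INTERMEDIARY` (with `BOTH` accepted for either) is a common convention assumed here, not a rule every buyer applies.

```python
import json

# Hypothetical sellers.json document, as an exchange might publish it.
SELLERS_JSON = json.loads("""
{
  "sellers": [
    {"seller_id": "1001", "name": "Example News",
     "domain": "news.example", "seller_type": "PUBLISHER"},
    {"seller_id": "2002", "name": "Example Reseller",
     "domain": "reseller.example", "seller_type": "INTERMEDIARY"}
  ]
}
""")

# Assumed mapping from ads.txt relationship to acceptable seller_type values.
EXPECTED_TYPES = {
    "DIRECT": {"PUBLISHER", "BOTH"},
    "RESELLER": {"INTERMEDIARY", "BOTH"},
}


def cross_check(seller_id, relationship, sellers_doc):
    """Compare one ads.txt entry with the exchange's sellers.json."""
    sellers = {s["seller_id"]: s for s in sellers_doc.get("sellers", [])}
    seller = sellers.get(seller_id)
    if seller is None:
        return "seller ID not found in sellers.json"
    seller_type = seller.get("seller_type", "").upper()
    if seller_type not in EXPECTED_TYPES.get(relationship.upper(), set()):
        return f"ads.txt says {relationship}, sellers.json says {seller_type}"
    return "consistent"


print(cross_check("1001", "DIRECT", SELLERS_JSON))
print(cross_check("2002", "DIRECT", SELLERS_JSON))
print(cross_check("9999", "RESELLER", SELLERS_JSON))
```

An entry that one file calls direct while the other describes as an intermediary is worth raising with the exchange before any spend flows through it.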

The IAB also recommends that publishers work with independent ad fraud vendors, who provide the technology to check whether ads were shown on the web page or app they were intended for.