Illustration by Fernando Capeto for Forbes; Graphics by Cherezoff/Getty Images
Companies like OpenAI and Perplexity have made lofty claims that their AI-powered search engines, which scrape information from the web to generate summarized answers, will provide new sources of income for publishers by directing more readers to their sites. But the reality is starkly different — AI search engines send 96% less referral traffic to news sites and blogs than traditional Google search, per a new report by content licensing platform TollBit, shared exclusively with Forbes. Meanwhile, AI developers’ scraping of websites has more than doubled in recent months, the report found.
OpenAI, Perplexity, Meta and other AI companies scraped websites 2 million times on average in the fourth quarter of 2024, per the report, which analyzed 160 websites including national and local news, consumer tech and shopping blogs over that period. Each page was scraped about seven times on average.
“We are seeing an influx of bots that are hammering these sites every time a user asks a question,” TollBit CEO Toshit Panigrahi told Forbes. “The amount of demand for publisher content is nontrivial.” TollBit, which integrates with publishers to track scraping and charge AI companies each time they do so, collected the data from publishers that have signed up on its platform for analytics, giving it insight into traffic and scraping activity on their sites.
OpenAI did not comment, and Meta did not respond to a request for comment. A Perplexity spokesperson did not address the specific claims of the report, but said the company respects “robots.txt” directives, which instruct web crawlers on which parts of a site they are allowed to access.
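For readers unfamiliar with how such directives work, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. The robots.txt contents, the bot names (`ExampleAIBot`, `OtherBot`) and the URLs are hypothetical and chosen only for illustration; they do not describe any real publisher or crawler. The sketch shows how a well-behaved crawler would check the rules before fetching a page. Compliance is voluntary, so the file itself does not block a bot that ignores it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only: it bars one made-up AI
# crawler from the whole site and keeps every other bot out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The named crawler is disallowed everywhere on the site.
print(is_allowed(parser, "ExampleAIBot", "https://example.com/news/story"))
# Any other crawler falls under the wildcard rules.
print(is_allowed(parser, "OtherBot", "https://example.com/news/story"))
print(is_allowed(parser, "OtherBot", "https://example.com/private/drafts"))
```

Because the rules are keyed to the name a bot announces, they only work when a crawler identifies itself accurately.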
“It’s time to say no.”
Last February, research firm Gartner predicted that traffic from traditional search engines would drop 25% by 2026, largely due to AI chatbots and other virtual agents. Businesses that rely on search traffic have already started to take a hit. Edtech company Chegg recently sued Google, alleging that the search giant’s AI-generated summaries included content from its website without attribution, snatching away eyeballs from its site and hurting its already diminishing revenue. Chegg’s traffic plummeted 49% year-over-year in January, far steeper than the 8% drop in the second quarter of last year, when Google released AI summaries. The traffic decline has affected Chegg to the extent that it is considering going private or getting acquired, CEO Nathan Schultz said in an earnings call.
“It’s time to say no,” Schultz told Forbes. He said Google and publishers have long had a social contract under which Google sends users to high-quality content rather than simply keeping that traffic for itself. “When you break that contract, that is not right.”
Ian Crosby, a partner at law firm Susman Godfrey representing Chegg, said the practice will harm search companies like Google in the long run, resulting in an “AI slurry” if companies like Chegg are put out of business. “It is a threat to the internet,” he said.
Google has called Chegg’s lawsuit “meritless,” claiming that its AI search service sends traffic to a greater diversity of sites.
Travel booking sites like Kayak and TripAdvisor are also concerned about Google’s AI search overviews chipping away at traffic, Forbes reported. Meanwhile, news publishers have taken legal action against both OpenAI and Perplexity for allegedly infringing on their intellectual property. (Both companies are fighting the suits.)
AI developers use what are called user agents to crawl the web and collect data, but many don’t properly identify or disclose their scraper bots, making it difficult for website owners to uncover and understand how AI companies are accessing their content. Some, like Google, appear to use the same bots for multiple purposes, including indexing the web and scraping data fo