Cloudflare has begun blocking artificial intelligence crawlers from accessing websites on its network unless the site owner gives explicit permission. The new default setting applies to every domain that signs up for Cloudflare’s services starting this week, and affects a large swath of internet traffic. The company provides security and performance infrastructure for about 20% of the web, according to its estimates.
The move gives publishers and website owners a clearer way to control how their content is used by AI companies, many of which have built large language models by scraping the internet for articles, images, and other material, often without consent or compensation.
“If you’re a content creator, and you’re creating value by either selling ads, selling subscriptions, or just getting the ego of knowing people are reading your stuff, all three of those things are going away in an AI-driven web,” said Cloudflare Chief Executive Matthew Prince. “We’re giving publishers a way to stop that.”
The decision builds on a feature Cloudflare introduced in September 2024 that let customers block AI crawlers with a single click. More than one million websites have enabled that option since it launched. Now blocking is the default, and AI crawlers gain access only when site owners explicitly allow it.
Website owners will also be able to set terms for access, including a pay-per-crawl model that Cloudflare is developing. The company says its bot detection capabilities, built over years to fight spam and cyberattacks, can distinguish AI crawlers from regular traffic with a high degree of accuracy.
Cloudflare’s changes are aimed at the growing number of automated bots that AI firms such as OpenAI and Google use to gather training data. Some of those companies provide ways to opt out through a file called robots.txt, but publishers say those methods are easily ignored or misunderstood. Cloudflare’s enforcement occurs at the network level, making it more difficult to bypass.
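By way of illustration, a robots.txt opt-out is a short plain-text file placed at the root of a website. The entry below, for example, asks OpenAI’s GPTBot to stay off an entire domain; the file is only a request to the crawler, not an enforcement mechanism:

    User-agent: GPTBot
    Disallow: /

Compliance depends entirely on the crawler choosing to honor the file, whereas a network-level block refuses the request before any content is served.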
Publisher Support
Several publishing companies have publicly backed Cloudflare’s decision. Condé Nast CEO Roger Lynch called the new settings “a game-changer,” and said they mark a shift toward “a fair value exchange” between AI companies and content producers. Dotdash Meredith CEO Neil Vogel said the company will now be able to restrict access “to those AI partners willing to engage in fair arrangements.” Gannett and Pinterest issued similar statements.
Cloudflare said demand for these protections began rising 18 months ago, when news organizations started noticing spikes in traffic from AI bots. Prince said his initial reaction was skeptical. “At first, I sort of rolled my eyes,” he said. “We spend our time worrying about Chinese hackers, Iranian hackers, Russian hackers—and you’re telling me nerds in Palo Alto are taking on your business?”
But after reviewing traffic patterns, the company saw the impact. AI bots were pulling large volumes of content with little visibility or control. In contrast to search engines, which direct users to original sites, AI-generated answers often don’t link back or generate any traffic.
AI Firms Object
Some AI companies have pushed back. OpenAI declined to participate in Cloudflare’s rollout of the default block. In a statement, the company said Cloudflare is introducing an unnecessary intermediary and noted that its GPTBot respects opt-out signals through robots.txt.
Others in the industry have argued that widespread blocking of crawlers could limit innovation and access to public knowledge. But publishers say the existing system doesn’t work. “The model is now broken,” Cloudflare wrote in a statement. “AI crawlers collect content… without sending visitors to the original source, depriving content creators of revenue and the satisfaction of knowing someone is viewing their content.”
Legal experts say Cloudflare’s approach is technically enforceable and may help shift the balance of power. “If effective, the development would hinder AI chatbots’ ability to harvest data for training and search purposes,” said Matthew Holman, a partner at UK-based law firm Cripps.
Business and Competitive Impact
Cloudflare is also treating the change as a business opportunity. Prince said the company has gained new customers, especially in media and publishing, because of its ability to block unwanted scraping. “It’s the number one thing their C-level executives are concerned with,” he said.
Competitors like Google and Microsoft have not implemented default blocks for AI crawlers across their services. Google allows publishers to opt out of data collection for model training, but does not enforce that choice at the infrastructure level. Microsoft, which has integrated OpenAI’s models into many of its products, has been largely quiet on enforcement tools for third-party sites.
Some AI companies, including OpenAI and Meta, have struck licensing deals with publishers such as Axel Springer and the Associated Press. But those remain exceptions. Most web content remains accessible unless site owners take specific steps to block it—something Cloudflare is now doing by default.
Cloudflare’s enforcement comes at a time when AI-related copyright disputes are moving into the courts. The New York Times is suing OpenAI and Microsoft, alleging that the companies used its material without permission. Several other publishers are evaluating legal or technical responses.
Prince said the longer-term goal is to help create a market for data access, one where publishers can charge for the use of their content, and AI firms can choose which data sources they want to license. “There’s going to be something that monetizes the AI-driven web,” he said. “And if Cloudflare can be the company that helps develop that marketplace, we think it’s a substantial business opportunity.”