AI site Perplexity uses “stealth tactics” to flout no-crawl edicts, Cloudflare says
Summary
Cloudflare has accused search engine Perplexity of using stealth bots to circumvent no-crawl directives.
Customers attempting to block Perplexity’s scraping bots using robots.txt files and firewalls had limited success, as the search engine simply sent in stealth bots that concealed their activity using various tactics.
These included running the stealth bots from IP addresses outside Perplexity’s official IP range, and using different autonomous system numbers (ASNs) to evade website blocks.
Such tactics violate more than three decades of accepted internet norms, which were formalised in 2022 as the Robots Exclusion Protocol.
This protocol uses robots.txt files to tell crawlers which parts of a site they are not allowed to access.
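As a rough illustration of how a well-behaved crawler is expected to honour such a file, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt contents, the `is_allowed` helper, the `example.com` URLs, and the user-agent names are assumptions made for the example, not details taken from Cloudflare's report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bars one named crawler from the whole site
# and bars every other crawler from /private/ only.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("PerplexityBot", "SomeOtherBot"):
        for path in ("/article", "/private/report"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(agent, url))
```

The check runs on the crawler's side and is voluntary: nothing in the protocol stops a bot from skipping it or from announcing a different user-agent, which is why site owners also turn to firewall rules.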
Cloudflare estimated that the stealth crawling spanned 10,000 domains and millions of requests every day.