Cloudflare, one of the largest network infrastructure companies in the world, has announced AI Labyrinth, a new tool to combat web-crawling bots that scrape sites for AI training data without permission. The company says in <a target="_blank" href="https://blog.cloudflare.com/ai-labyrinth/">a blog post</a> that when it detects “inappropriate bot behavior,” the free, opt-in tool lures crawlers down a path of links to AI-generated decoy pages that “slow down, confuse, and waste the resources” of those acting in bad faith.
Websites have long relied on the honor-system approach of robots.txt, a text file that grants or denies permission to scrapers, but which AI companies, including well-known ones like Anthropic and Perplexity, have been accused of ignoring. Cloudflare writes that it sees more than 50 billion web crawler requests per day, and although it has tools to detect and block the malicious ones, doing so often prompts attackers to change tactics in “a never-ending arms race.”
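For readers unfamiliar with how robots.txt works, here is a minimal sketch of the check a well-behaved crawler is expected to run before fetching a page, using Python's standard `urllib.robotparser`. The file contents, the bot name `ExampleAIBot`, and the URLs are made up for illustration; nothing here comes from Cloudflare's post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the bot name and paths are illustrative.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The named AI bot is barred from the whole site; other agents only from /private/.
    print(is_allowed("ExampleAIBot", "https://example.com/articles/1"))
    print(is_allowed("SomeBrowser", "https://example.com/articles/1"))
```

Nothing enforces this check: a scraper that never runs it can fetch the page anyway, which is the gap tools like AI Labyrinth are meant to address.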
Cloudflare says that rather than blocking bots, AI Labyrinth fights back by making them process data that has nothing to do with a given website's actual content. The company says it also works as “a next-generation honeypot,” drawing in AI crawlers that keep following links deeper into fake pages, which a regular human visitor would not do. It says this makes it easier to fingerprint malicious bots for Cloudflare's list of bad actors, as well as to identify “new bot patterns and signatures” that it would not have detected otherwise. According to the post, these links should not be visible to human visitors.
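The honeypot idea can be illustrated with a toy sketch: count how many decoy pages each client requests and flag any client that goes several links deep. This is not Cloudflare's implementation. The `/labyrinth/` path prefix, the threshold of three, and the `DecoyTracker` class are all assumptions made up for this example.

```python
from collections import defaultdict

# Assumption: decoy pages live under a prefix that no human-visible link points to.
DECOY_PREFIX = "/labyrinth/"
# Assumption: requesting this many decoy pages is treated as crawler-like behavior.
DEPTH_THRESHOLD = 3


class DecoyTracker:
    """Counts decoy-page requests per client and flags likely crawlers."""

    def __init__(self, threshold: int = DEPTH_THRESHOLD) -> None:
        self.threshold = threshold
        self.decoy_hits = defaultdict(int)

    def record(self, client_id: str, path: str) -> bool:
        """Record one request; return True once the client has reached the threshold."""
        if path.startswith(DECOY_PREFIX):
            self.decoy_hits[client_id] += 1
        return self.decoy_hits[client_id] >= self.threshold


if __name__ == "__main__":
    tracker = DecoyTracker()
    for path in ["/", "/labyrinth/a", "/labyrinth/b", "/labyrinth/c"]:
        flagged = tracker.record("crawler-1", path)
    print(flagged)
```

A real system would presumably combine a signal like this with many others, but the sketch shows why hidden links are useful: a human who never sees them never trips the counter.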
You can read more about how AI Labyrinth works on Cloudflare's blog, but here is a bit more detail from the post:
We found that generating a diverse set of topics first, then creating content for each topic, produced more varied and convincing results. It is important to us that we don't generate inaccurate content that contributes to the spread of misinformation on the Internet, so the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled.
Website administrators can opt in to AI Labyrinth by navigating to the Bot Management section of their site's Cloudflare dashboard settings and toggling it on. The company says this is “only the first iteration of using generative AI to thwart bots.” It plans to create “whole networks of linked URLs” that bots landing on them will have trouble identifying as fake. As <a target="_blank" href="https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/">Ars Technica notes</a>, AI Labyrinth sounds similar to Nepenthes, a tool designed to trap crawlers for “months” in AI-generated junk data.