Robots Refresher: page-level granularity

With the robots.txt file, site owners have a simple way to control which parts of a website are accessible to crawlers. To help site owners further express how search engines and web crawlers can use their pages, the web standards group introduced robots meta tags in 1996, just a few months after meta tags were proposed for HTML (and, anecdotally, before Google was founded). X-Robots-Tag HTTP response headers were added later. Both kinds of instructions are sent in the response for a URL, so a crawler can take them into account only if the robots.txt file does not disallow it from crawling that URL. Together, robots.txt, robots meta tags, and X-Robots-Tag headers form the Robots Exclusion Protocol (REP).
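To make the dependency between robots.txt and page-level instructions concrete, here is a minimal Python sketch using only the standard library. The names `RobotsMetaParser` and `page_level_rules` are hypothetical and chosen for illustration. The sketch assumes the robots.txt body, response headers, and HTML have already been obtained as plain values, that the header key is spelled exactly `X-Robots-Tag`, and that directives are comma-separated. It ignores user-agent-specific forms such as `googlebot: noindex`, and it is not how any particular search engine implements this.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def split_directives(value):
    """Split a comma-separated directive string into a set of lowercase tokens."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            self.directives |= split_directives(attr.get("content") or "")


def page_level_rules(robots_txt, url, user_agent, headers, html):
    """Return the page-level directives a crawler could see for `url`.

    Returns None when robots.txt disallows the URL: the page is then never
    fetched, so its meta tags and response headers are never read.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        return None

    # Assumption: `headers` is a plain dict keyed by the exact header name.
    directives = split_directives(headers.get("X-Robots-Tag", ""))

    meta = RobotsMetaParser()
    meta.feed(html)
    return directives | meta.directives
```

Calling `page_level_rules` for a URL under a `Disallow` rule returns `None` even if the page carries `noindex`, which mirrors the point above: a crawler that may not fetch the page has no way to learn what the page-level instructions say.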

Apr 26, 2025 - 14:52
