Google Stops Support of NoIndex Directive in Robots.txt File
Sep 16, 2019
On September 1, 2019, Google officially ended support for the "noindex" directive within the robots.txt file.
You'll want to:
- Identify if your site is using the "noindex" in the "robots.txt" file.
- If it is, you'll want to switch over to another method to control indexing of specific webpages.
What is a Noindex in a Robots.txt File?
A "noindex" directive in a robots.txt file tells search engines not to include the listed pages in search results. It was a quick and easy way to noindex many webpages at once, but Google never officially documented the rule.
Example:
Noindex: /example-page-1/
Noindex: /example-page-2/
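To see whether a robots.txt file still contains lines like these, you can scan its text for them. The following is a minimal Python sketch, not an official tool: the function name find_noindex_rules and the sample text are made up for illustration, and it assumes you have already copied the contents of your robots.txt into a string.

```python
import re

# Matches a line such as "Noindex: /example-page-1/" (case-insensitive).
NOINDEX_LINE = re.compile(r"^\s*noindex\s*:\s*(\S*)", re.IGNORECASE)


def find_noindex_rules(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, path) for every Noindex line in the text."""
    found = []
    for number, line in enumerate(robots_txt.splitlines(), start=1):
        # Ignore anything after a "#", which starts a comment in robots.txt.
        rule = line.split("#", 1)[0]
        match = NOINDEX_LINE.match(rule)
        if match:
            found.append((number, match.group(1)))
    return found


# Hypothetical robots.txt contents used only for this example.
sample = """User-agent: *
Noindex: /example-page-1/
Noindex: /example-page-2/
Disallow: /private/
"""

for number, path in find_noindex_rules(sample):
    print(f"line {number}: {path}")
```

If the function returns an empty list, the file has no "noindex" lines and nothing needs to change. Each path it does report needs one of the alternate methods described further down.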
What is a Robots.txt File?
A "robots.txt" file is a plain text file that resides within the root directory of a website. The file contains rules to instruct search engine bots how to crawl pages on a website.
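As a rough illustration of how a bot reads these rules, Python's standard library includes a robots.txt parser. The sketch below uses made-up rules and example.com URLs, and the helper name is_crawlable is an assumption for this example. Python's parser is not Googlebot, so treat the results as an approximation of how a well-behaved crawler would respond.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every bot is asked to stay out of /private/.
rules = """User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Report whether the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/private/report.html"))
print(is_crawlable("https://example.com/about.html"))
```

The first URL falls under the Disallow rule and the second does not. Note that this only answers whether a page may be crawled, which is a separate question from whether it may appear in search results.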
Alternate Methods to Control Crawling and Indexing of Webpages
Google has listed the following options (the wording below is Google's own):
- Noindex in robots meta tags: Supported both in the HTTP response headers and in HTML, the noindex directive is the most effective way to remove URLs from the index when crawling is allowed.
See: https://support.google.com/webmasters/answer/93710
- 404 and 410 HTTP status codes: Both status codes mean that the page does not exist, which will drop such URLs from Google’s index once they’re crawled and processed.
- Password protection: Unless markup is used to indicate subscription or paywalled content, hiding a page behind a login will generally remove it from Google’s index.
- Disallow in robots.txt: Search engines can only index pages that they know about, so blocking the page from being crawled often means its content won’t be indexed. While the search engine may also index a URL based on links from other pages, without seeing the content itself, we aim to make such pages less visible in the future.
- Search Console Remove URL tool: The tool is a quick and easy method to remove a URL temporarily from Google’s search results. See: https://support.google.com/webmasters/answer/1663419
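After moving a page to the first option, it helps to confirm that the page really carries a noindex signal, either as an X-Robots-Tag response header or as a robots meta tag in the HTML. Here is a minimal Python sketch of such a check. The function name is_noindexed is hypothetical, and the check is deliberately simplified: it only looks for the literal value "noindex" and does not handle bot-specific header prefixes or the "none" value.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            content = attributes.get("content", "")
            self.directives += [part.strip().lower() for part in content.split(",")]


def is_noindexed(headers: dict[str, str], html: str) -> bool:
    """Return True if the response headers or the HTML contain a noindex directive."""
    # Option 1: the X-Robots-Tag HTTP response header.
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [part.strip().lower() for part in value.split(",")]:
                return True
    # Option 2: a robots meta tag in the HTML.
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page used only for this example.
page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(is_noindexed({}, page))
```

You would pass in the headers and HTML from a real response for the page you are checking. Keep in mind that a search engine can only see either signal if the page is not blocked from crawling in robots.txt.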
If you have an older website, you may not know whether your site is using a "noindex" directive within the "robots.txt" file.
If you're not familiar with HTML or not experienced with editing HTML code, you may want to reach out to us at Hosting Connecticut and request an analysis. We’ll perform "noindex" code changes where needed.
Contact HostingCT.com:
Email us at support@hostingct.com or use our online Contact Form at http://hostingct.com/
More to Explore:
Google Robots.txt Specifications:
https://developers.google.com/search/reference/robots_txt
Google Robots.txt Tester Tool:
https://support.google.com/webmasters/answer/6062598
Creating a Robots.txt File:
https://support.google.com/webmasters/answer/6062596