Wondering what a robots.txt file does for your website?
I have seen a lot of confusion about the robots.txt file, and that confusion creates SEO issues on websites. In this article I will share everything you need to know about robots.txt, along with some links that will help you dig deeper into the topic. If you browse the Google Webmaster forum, you will see FAQs like:
- Why is Google not de-indexing certain parts of my blog where I have added a noindex tag?
- Why is my blog's crawl rate slow?
- Why are my deep links not getting indexed?
- Why is Google indexing my admin folders?
Whether you use WordPress, Drupal or any other platform, robots.txt is a universal standard for websites, and it lives at the root of a domain, for example domain.com/robots.txt.
Now you must be wondering what the robots.txt file is, how to create one for your website, and how to use it for search engine optimization (SEO).
I answer these questions here, and you will also learn about the technical side of the robots.txt file.
What is the use of a robots.txt file on a website?
I will start with the basics. All search engines have bots that crawl websites. Crawling and indexing are two different things. When a search engine bot (Bingbot, Googlebot or a third-party search engine crawler) comes to your site by following a link, or by following the sitemap link that you submitted in the webmaster dashboard, it follows all the links on your blog to crawl and index your website.
Two files, robots.txt and sitemap.xml, sit at the root of your domain. Bots follow the rules in robots.txt as they crawl your website.
Usage of the robots.txt file:
When search engine bots come to your blog, they have limited resources to crawl and index your website. If they can't crawl all the pages on your site with those resources, they stop crawling, and this hampers your indexing. At the same time, there are many parts of your website that you don't want search engine bots to crawl, for example the wp-admin folder (the WordPress admin dashboard) or any other pages that are not useful for search engines. Using robots.txt, you direct search engine crawlers (bots) not to crawl those areas of your website. This not only speeds up crawling of your blog but also helps with deep crawling of your inner pages.
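To make the idea concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the domain and the helper name is_crawl_allowed are assumptions made up for illustration. The parser does simple prefix matching, so it only approximates how a real search engine bot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: /wp-admin/ is the usual WordPress admin
# location and is used here only as an illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent crawl url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The admin area is blocked, a normal post is not.
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://www.domain.com/wp-admin/options.php"))
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://www.domain.com/my-post/"))
```

The first call reports that the admin URL is off limits to the bot, while the second shows that an ordinary post stays crawlable.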
The biggest misconception about the robots.txt file is that people use it for noindexing. Remember, robots.txt is not for index or noindex; it only directs search engine bots to stop crawling some parts of your blog. For example, if you look at the NewTechnology robots.txt file (WordPress platform), you will clearly see which parts of my blog I don't want search engine bots to crawl.
How to check your robots.txt file?
As I explained, the robots.txt file is located at the root of your domain. You can check it at www.domain.com/robots.txt. In most cases (especially on WordPress), you will see a blank robots.txt file. You can also check your domain's robots.txt file in Google Webmaster Tools under Site configuration >> Crawler Access.
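If you want to work out the robots.txt address for any page on a site, a small helper like the one below can do it. This is only a sketch: the function name robots_txt_url and the example URL are my own assumptions. It builds the address without fetching anything, so it works offline.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt address for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.domain.com/blog/some-post/?page=2"))
# prints: https://www.domain.com/robots.txt
```

You can then open the printed address in a browser to see what the file currently contains.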
A basic robots.txt for avoiding duplicate content blocks the areas of a WordPress blog that repeat content found elsewhere.
Such rules prevent robots from crawling your admin folder, followed by comment feeds, pages, feeds, trackbacks and comments. Do remember that the robots file only stops crawling; it doesn't prevent indexing. Google honors the noindex tag for keeping posts or pages of your blog out of its index. You can use WordPress SEO by Yoast to add noindex to any individual post or part of your blog. For effective SEO of your domain, website or blog, I suggest you keep your category and tag pages as noindex but dofollow. You can check my NewTechnology robots file here.
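As a rough illustration of rules like the ones described above, the sketch below generates a robots.txt from a list of paths. The paths in BLOCKED are assumed WordPress-style locations chosen for this example, not my actual file, and build_robots_txt is a hypothetical helper. Adjust the list to match your own permalink structure before using anything like it.

```python
def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text that disallows each path for user_agent."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    return "\n".join(lines) + "\n"


# Assumed WordPress-style paths (illustrative only): admin folder,
# comment feeds, paginated pages, feeds, trackbacks and comments.
BLOCKED = [
    "/wp-admin/",
    "/comments/feed/",
    "/page/",
    "/feed/",
    "/trackback/",
    "/comments/",
]

print(build_robots_txt(BLOCKED))
```

Keeping the paths in a list makes it easy to review exactly what you are hiding from crawlers before you upload the file to the root of your domain.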
- The robots.txt file is only used to stop crawling of certain parts of your blog.
- This file should not be used for noindexing; use the noindex meta tag instead.
Note: If you are trying to de-index a part of your blog that is already indexed, don't use robots.txt to block access to that same part. Blocking it prevents bots from crawling that part of your blog and seeing the updated noindex tag.
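The conflict described in the note can be checked with a short script. This is a sketch under assumptions: the helper name blocked_noindex_urls, the sample rules and the URLs are all hypothetical, and Python's urllib.robotparser only approximates how a real bot reads the file. Given your robots.txt text and the URLs you have marked noindex, it lists the ones a bot would never get to re-crawl.

```python
from urllib.robotparser import RobotFileParser


def blocked_noindex_urls(robots_txt, noindex_urls, user_agent="Googlebot"):
    """Return the noindex URLs that robots.txt stops user_agent from crawling."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # A blocked URL is never re-crawled, so its noindex tag is never seen.
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical example: tag pages are marked noindex but also blocked.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /tag/\n"
SAMPLE_NOINDEX = [
    "https://www.domain.com/tag/seo/",
    "https://www.domain.com/category/news/",
]

print(blocked_noindex_urls(SAMPLE_ROBOTS, SAMPLE_NOINDEX))
```

Any URL this returns should be unblocked in robots.txt, at least until search engines have had the chance to see its noindex tag.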
Are you using a robots.txt file with your WordPress blog? If you have any questions about robots.txt, let me know. We are happy to help you.