What happens if a website does not have a robots.txt file?

If the robots.txt file is missing from the root directory of a website, which of the following happens:

  1. the site is not indexed at all
  2. the site is indexed without any restrictions

Logically, it should be the second one. I'm asking in reference to this question.


The purpose of a robots.txt file is to keep crawlers out of certain parts of your website. Not having one should result in all your content being indexed.
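For instance, a minimal robots.txt that keeps compliant crawlers out of a single directory (the path here is illustrative) looks like this:

```
User-agent: *
Disallow: /admin/
```

Anything not covered by a Disallow rule remains crawlable; with no file at all, there are no rules, so everything is.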

The implication of the first comment on that Meta question was that the robots.txt file existed but was inaccessible (for whatever reason), rather than being absent entirely. That might cause web crawlers some issues, but that's speculation.

I don't have a robots.txt file on my blog (a self-hosted WordPress installation), and it is indexed.


Robots.txt is a strictly voluntary convention amongst search engines; they're free to ignore it, or to implement it in any way they choose. That said, barring the occasional spider harvesting email addresses or the like, they pretty much all respect it. Its format and logic are very simple, and the default rule is allow (since you can only disallow). A site without a robots.txt will be fully indexed.
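You can see this default-allow logic in Python's standard-library urllib.robotparser; the bot name and URLs below are made up for illustration:

```python
from urllib import robotparser

# A parser fed an empty robots.txt -- equivalent, for compliant
# crawlers, to the file not existing: everything is allowed.
empty = robotparser.RobotFileParser()
empty.parse([])  # no rules at all
print(empty.can_fetch("MyBot", "https://example.com/private/page.html"))  # True

# The same parser fed rules that disallow one directory.
rules = robotparser.RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /private/",
])
print(rules.can_fetch("MyBot", "https://example.com/private/page.html"))  # False
print(rules.can_fetch("MyBot", "https://example.com/public/page.html"))   # True
```

With no matching Disallow rule, can_fetch falls through to allow, which is exactly how a missing file is treated.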


I haven't had a robots.txt file on dozens of domains I've registered, some as far back as 1994, and I have never had a problem with them being indexed by Google, Yahoo, etc.

Even my personal website gets 150-200 visitors a day from Google, and it doesn't have a robots.txt file.



robots.txt is completely optional. If you have one, standards-compliant crawlers will respect it; if you have none, everything not disallowed in HTML meta elements (see Wikipedia) is crawlable.
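For completeness, the per-page equivalent is the robots meta element; a page that opts out of indexing and link-following would carry something like:

```html
<meta name="robots" content="noindex, nofollow">
```

Without a robots.txt and without such meta tags, a compliant crawler treats the whole site as crawlable and indexable.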