
This is doing the rounds on the net too:

http://www.thegatesnotes.com/robots.txt

It doesn't mention Bing. Amusing, and useless.




It's a cut and paste from some sample file, so this doesn't indicate anything much.

A Google search for the text in the comments ( http://tinyurl.com/ygtuanh ) returns 49 hits. There would be more if I had not joined the lines together, I'm sure.


Why is it useless (I've never worked with robots.txt)? Are the /css and /js files not large enough to bother skipping?


Look at the comments...

# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.


Useless in the sense that it doesn't really need to mention Bing.

Blocking /css and /js makes sense because crawling them adds no value. It's not necessary, but it doesn't hurt.

Best practice: Always have a robots.txt file, even if it's empty.
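
For anyone who hasn't worked with robots.txt, here is a rough sketch of how a well-behaved crawler would read such rules, using Python's standard urllib.robotparser. The rules below are an assumption for illustration (a wildcard user-agent with /css/ and /js/ disallowed), not the actual contents of the file linked above, and the "bingbot" agent name and the allowed() helper are just placeholders. It also shows why a crawler doesn't need to be named: a wildcard entry covers it, and an empty file allows everything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, assumed for illustration only; not the real file.
RULES = """\
User-agent: *
Disallow: /css/
Disallow: /js/
"""


def allowed(rules: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # The wildcard entry applies to any crawler, named in the file or not.
    print(allowed(RULES, "bingbot", "/css/site.css"))
    print(allowed(RULES, "bingbot", "/index.html"))
    # An empty robots.txt places no restrictions at all.
    print(allowed("", "bingbot", "/css/site.css"))
```

Keep in mind this only models what a polite crawler does; nothing in robots.txt is enforced on the server side.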


What is the reason for your Best Practice advice?



