SLUG Mailing List Archives

Re: [chat] Re: [SLUG] apache access log -- "GET /robots.txt HTTP/1.0" 404 284


begin  Terry Collins  quotation:

> I missed your humour tag. Unfortunately, many search engines ignore
> robots.txt, which is why there was a discussion on poison and similar
> revenge tactics in the archives. Good examples of ones that ignore it are
> the ones harvesting email addresses.

The useful ones respect it; the ones that don't are usually not widely
publicised on search engine sites. Regardless of what's in your robots.txt,
you'd want to kill access for the bad spiders anyway.
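
For what it's worth, here is a rough sketch of how you might find the
offenders in the first place. It assumes your access log is in Apache's
combined format (so the User-Agent is actually logged) and that you keep
your own list of agent substrings to refuse. The names BAD_AGENT_PATTERNS,
is_bad_agent and bad_hosts are made up for illustration, and the two agent
strings are only examples of harvester names, not a vetted blocklist.

```python
import re

# Hypothetical blocklist: substrings of User-Agent headers you have decided
# to refuse. These entries are illustrative only.
BAD_AGENT_PATTERNS = ["EmailSiphon", "EmailCollector"]

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def is_bad_agent(agent, patterns=BAD_AGENT_PATTERNS):
    """Case-insensitive substring match of a User-Agent against the blocklist."""
    agent = agent.lower()
    return any(p.lower() in agent for p in patterns)


def bad_hosts(lines):
    """Return the set of client hosts whose User-Agent matches the blocklist."""
    hosts = set()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and is_bad_agent(m.group("agent")):
            hosts.add(m.group("host"))
    return hosts
```

Feed it the lines of your access log and you get back the client addresses
to consider denying. Bear in mind that a spider can send any User-Agent it
likes, so this only catches the ones that identify themselves honestly.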

(No mention of someone being horribly alarmed that the SLUG mailing list
archives were being trawled by a horrible SPAM spider called
"FASTcrawler"... OH DEAR GOD NOOOO!)

- ii

-- 
  Penguinillas Pack GNUzis