[ZWeb] Zope.org currently unusable
Mark Pratt
mark at zopemag.com
Thu Mar 10 06:16:24 EST 2005
Hi,
I recommend adding a crawl delay for all bots except Google, something like:
User-agent: Slurp
Crawl-delay: 120
This is for the Yahoo bot (Slurp) but should also be applied to msnbot.
It's crazy how some of these bots love to hit your site at the same
time. A 120-second delay should be more than enough time between
hits, even if they all arrive at once.
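Spelled out for both of the bots mentioned, the stanzas might look like the following (Crawl-delay is a non-standard extension; Slurp and msnbot honor it, but not every crawler does):

```
User-agent: Slurp
Crawl-delay: 120

User-agent: msnbot
Crawl-delay: 120
```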
Cheers,
Mark
On Mar 10, 2005, at 10:33 AM, Jens Vagelpohl wrote:
>
> On Mar 10, 2005, at 2:18, Andrew Sawyers wrote:
>
>> It's a little of both; there's a group of people working on this - we
>> hope to have something real soon now :) as a fix. Jens, do you have
>> the time to check the zope.org robots.txt? A lot of the problems I've
>> seen recently were due to several robots spidering zope.org at the
>> same time. I'm working on additional hardware, and we should see more
>> traction on the project sooner rather than later.
>
> I don't believe all that much in robots.txt. The nasty bots completely
> ignore it, anyway. The only way to deal with them is to block them
> with e.g. iptables.
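For reference, the iptables approach Jens mentions could be as simple as the following (the address range is a documentation placeholder, not a real bot's range; a misbehaving crawler's actual source addresses would come from the access logs):

```
# Drop HTTP traffic from a misbehaving crawler's address range.
# 192.0.2.0/24 is a placeholder (RFC 5737 documentation range).
iptables -A INPUT -s 192.0.2.0/24 -p tcp --dport 80 -j DROP
```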
>
> What's currently there looks odd:
>
> """
> User-agent: wget
> Disallow: /
>
> User-agent: Wget
> Disallow: /
>
> # Ask Google to skip search queries and the like.
> User-agent: Googlebot
> Disallow: /*?
> """
>
> Looking at the spec, case-insensitive matching of the User-agent
> value is only "recommended", but you could shorten that to the
> following, because multiple User-agent lines are allowed per rule set:
>
> """
> User-agent: wget
> User-agent: Wget
> Disallow: /
> """
>
> Otherwise there really isn't much in there, and from seeing googlebots
> myself often enough I have my doubts whether the line "Disallow: /*?"
> works at all.
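Jens's doubt is easy to check against a strict parser: Python's stdlib robotparser implements only the original 1996 exclusion spec and has no support for the "*" wildcard extension, so a rule like "Disallow: /*?" is silently ineffective there, while a plain prefix rule works. (A quick sketch; Googlebot itself documents wildcard support, so the rule may still work for Google specifically.)

```python
from urllib import robotparser

rules = [
    "User-agent: Googlebot",
    "Disallow: /*?",       # wildcard extension: meant to skip query URLs
    "Disallow: /search",   # plain prefix rule from the original spec
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# The plain prefix rule is honored...
print(rp.can_fetch("Googlebot", "http://zope.org/search?q=zope"))  # False
# ...but the wildcard rule is not: a spec-only parser allows this URL.
print(rp.can_fetch("Googlebot", "http://zope.org/index.html?expand=all"))
```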
>
> jens
>
> _______________________________________________
> Zope-web maillist - Zope-web at zope.org
> http://mail.zope.org/mailman/listinfo/zope-web
>
>