[ZWeb] Zope.org currently unusable

Mark Pratt mark at zopemag.com
Thu Mar 10 06:16:24 EST 2005


Hi,

I recommend setting a crawl delay for every bot except Google, something like:

User-agent: Slurp
Crawl-delay: 120

This entry is for the Yahoo bot, but the same rule should also be applied to msnbot.

It's crazy how some of these bots love to hit your site at the same
time. A 120-second delay should leave more than enough time between
hits, even if they all come at once.
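
For anyone who wants to sanity-check such entries before deploying them, here
is a minimal sketch using Python's standard urllib.robotparser module. The
robots.txt fragment is an assumption: the Slurp entry is the one above and the
msnbot entry simply repeats it. The helper name crawl_delay_for is
hypothetical. Note that Crawl-delay is a non-standard extension, so this only
shows what a cooperating bot would read, not that a given bot honors it.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt fragment: the Slurp entry comes from this message,
# the msnbot entry is the same rule repeated for msnbot.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 120

User-agent: msnbot
Crawl-delay: 120
"""


def crawl_delay_for(agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay in seconds for `agent`, or None if none is set."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    for agent in ("Slurp", "msnbot", "Googlebot"):
        print(agent, crawl_delay_for(agent))
```

With this fragment, Googlebot has no matching entry and no "*" default, so no
delay is reported for it.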

Cheers,

Mark


On Mar 10, 2005, at 10:33 AM, Jens Vagelpohl wrote:

>
> On Mar 10, 2005, at 2:18, Andrew Sawyers wrote:
>
>> It's a little of both; there's a group of people working on this - we
>> hope to have something real soon now :) as a fix.  Jens, do you have the
>> time to check the zope.org robots.txt?  A lot of the problems I've seen
>> recently were due to several robots spidering zope.org at a time.  I'm
>> working on additional hardware and we should see more traction on the
>> project sooner than later.
>
> I don't believe all that much in robots.txt. The nasty bots completely 
> ignore it, anyway. The only way to deal with them is to block them 
> with e.g. iptables.
>
> What's currently there looks odd:
>
> """
> User-agent: wget
> Disallow: /
>
> User-agent: Wget
> Disallow: /
>
> # Ask Google to skip search queries and the like.
> User-agent: Googlebot
> Disallow: /*?
> """
>
> Looking at the spec, case-insensitive matching of the User-agent value is
> only "recommended", but you could shorten that into the following,
> because multiple User-agent values are allowed per rule set:
>
> """
> User-agent: wget
> User-agent: Wget
> Disallow: /
> """
>
> Otherwise there really isn't much in there, and from seeing Googlebot
> myself often enough, I have my doubts whether the line "Disallow: /*?"
> works at all.
>
> jens
>
> _______________________________________________
> Zope-web maillist  -  Zope-web at zope.org
> http://mail.zope.org/mailman/listinfo/zope-web
>
>


