I have many websites running off one Zope installation, so it did not take much brilliance to put a robots.txt file in the Zope root and let all the sites inherit it. Today, though, I wanted to customize one of them, so I did this:

    # Robots file for <dtml-var absolute_url>
    User-agent: *
    Disallow: /Manage
    Disallow: /acl_users
    Disallow: /images
    <dtml-if robot_local>
    <dtml-var robot_local>
    </dtml-if>

Again, nothing hugely brilliant, but I thought it was interesting enough to publish. I will put it up on Zope.org.

Cheers,
BZ
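To make the effect of the template concrete, here is a minimal sketch of how a rule-abiding crawler might read its output. It assumes one site defines `robot_local` with a single extra rule; the host name, the `/private` path, and the hand-rendered text are all hypothetical and not taken from the original post. Python's standard `urllib.robotparser` stands in for a well-behaved robot.

```python
from urllib.robotparser import RobotFileParser

# Hand-rendered output of the template for a hypothetical site whose
# robot_local property holds one extra rule. The host name and the
# /private path are assumptions made for illustration only.
RENDERED = """\
# Robots file for http://www.example.com
User-agent: *
Disallow: /Manage
Disallow: /acl_users
Disallow: /images
Disallow: /private
"""

parser = RobotFileParser()
parser.parse(RENDERED.splitlines())


def allowed(path):
    """Return True if a rule-abiding crawler may fetch this path."""
    return parser.can_fetch("*", "http://www.example.com" + path)


for path in ("/index_html", "/acl_users", "/private/report", "/manage"):
    print(path, allowed(path))
```

With this parser the lowercase `/manage` path comes back as allowed, because rules are matched as case-sensitive prefixes and the rule is spelled `/Manage`. Other crawlers may behave differently, so treat this as a sanity check rather than a guarantee.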
I'm not sure how smart it is to put your manage and acl_users directories in robots.txt, frankly.

For benevolent robots that observe the restrictions you've placed on them, this does little aside from telling them not to index your images... which most of them don't anyway. They won't index what they can't get to and whatever is password protected isn't likely to be indexed.

For less benevolent visitors, however, you've essentially told them exactly where to look for your vulnerabilities. Not cool.

I don't know what you're seeing, but I get visitation from malicious robots at least daily. I'd hate to think what I'd get if I actually gave away any more implementation details.

I also do virtual hosting and I've been hard-pressed to improve on having a single robots.txt in the root that allows everything for everyone. Pages that need to be kept private should be password protected... everything that's not should be considered fully public. And that single file in root is trivial to override on a site-by-site basis, as desired.

FWIW,

Dylan

At 07:59 AM 11/9/2002 -0500, you wrote:
I have many websites running off one Zope installation, so it did not take much brilliance to put a robots.txt file in the Zope root and let all the sites inherit it. Today, though, I wanted to customize one of them, so I did this:

    # Robots file for <dtml-var absolute_url>
    User-agent: *
    Disallow: /Manage
    Disallow: /acl_users
    Disallow: /images
    <dtml-if robot_local>
    <dtml-var robot_local>
    </dtml-if>
Again, nothing hugely brilliant, but I thought it was interesting enough to publish.
I will put it up on Zope.org.
Cheers, BZ
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
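As a rough illustration of the two points in the reply above, the sketch below compares an allow-everything robots.txt with the restrictive one from the original post. The helper `listed_paths` is a hypothetical name introduced here; it simply collects the non-empty `Disallow` values, which is all any visitor needs to do to see which paths a site has singled out.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value means "nothing is off limits".
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# The rules from the original post, without the DTML tags.
RESTRICTIVE = """\
User-agent: *
Disallow: /Manage
Disallow: /acl_users
Disallow: /images
"""


def listed_paths(robots_txt):
    """Collect every non-empty Disallow path (hypothetical helper)."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


parser = RobotFileParser()
parser.parse(ALLOW_ALL.splitlines())

print(listed_paths(ALLOW_ALL))
print(listed_paths(RESTRICTIVE))
print(parser.can_fetch("*", "http://www.example.com/anything"))
```

The allow-all file reveals no paths at all, while the restrictive file hands over its list to anyone who asks for it, whether or not they intend to honor it.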
It is always a trade-off when you tell robots what to index and what not to. I will let Zope security keep out the bad robots, but I would rather good robots not even try.

BZ

On Saturday, November 9, 2002, at 02:03 PM, Dylan Reinhardt wrote:
I'm not sure how smart it is to put your manage and acl_users directories in robots.txt, frankly.
For benevolent robots that observe the restrictions you've placed on them, this does little aside from telling them not to index your images... which most of them don't anyway. They won't index what they can't get to and whatever is password protected isn't likely to be indexed.
For less benevolent visitors, however, you've essentially told them exactly where to look for your vulnerabilities. Not cool.
I don't know what you're seeing, but I get visitation from malicious robots at least daily. I'd hate to think what I'd get if I actually gave away any more implementation details.
I also do virtual hosting and I've been hard-pressed to improve on having a single robots.txt in the root that allows everything for everyone. Pages that need to be kept private should be password protected... everything that's not should be considered fully public. And that single file in root is trivial to override on a site-by-site basis, as desired.
FWIW,
Dylan
From: "BZ" <bz@bwanazulia.com>
It is always a trade-off when you tell robots what to index and what not to. I will let Zope security keep out the bad robots, but I would rather good robots not even try.
True, but how would a good robot even try /acl_users? It doesn't know it exists, unless you link to it from your frontpage. :-) Robots follow the links they can see. If you don't tell them about acl_users, they won't look there.
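The point about link-following can be sketched with a toy crawler. The site map below is entirely hypothetical: each page lists the links it contains, and `/acl_users` and `/manage` exist on the server but are not linked from anywhere. This is a simplified model of discovery, not a description of any particular search engine.

```python
from collections import deque

# Hypothetical site: each page maps to the links found on it.
SITE_LINKS = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/widget", "/"],
    "/products/widget": [],
    "/acl_users": [],  # exists, but no page links to it
    "/manage": [],     # exists, but no page links to it
}


def crawl(start, links):
    """Breadth-first walk that only visits pages reachable by links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


discovered = crawl("/", SITE_LINKS)
print(sorted(discovered))
```

In this model the unlinked paths never show up in `discovered`, which is the sense in which a robot that only follows links has no reason to request them.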
Yes, it's a tradeoff. But you're balancing unknown (possibly substantial) risks against little perceivable benefit.

Robots.txt is something of a relic from a simpler time... like finger plans or unsecured rlogin. If you want to make something secure, it's a poor strategy to provide more information than you have to and depend on the visitor to act responsibly with it. Far better you should take appropriate measures to secure what you want to protect.

Robots.txt would have been far more useful had it been written as an *inclusion* protocol rather than an *exclusion* protocol... but then again, 1994 was a simpler time for the web. It seemed simpler to do it this way, and that was that.

YMMV, of course... security measures should scale with your actual requirements. Perhaps I'm just paranoid.

Dylan

At 02:42 PM 11/9/2002 -0500, you wrote:
It is always a trade-off when you tell robots what to index and what not to. I will let Zope security keep out the bad robots, but I would rather good robots not even try.
BZ
participants (3)
- BZ
- Dylan Reinhardt
- Lennart Regebro