[Zope] wget of a zope site
Paul Winkler
pw_lists at slinkp.com
Sun Feb 6 15:31:56 EST 2005
On Sun, Feb 06, 2005 at 05:15:50PM +0100, Roger Oberholtzer wrote:
> That is not what I am doing. The site is currently a happy dynamic Zope
> site. It is just that the site owners want to move elsewhere and no
> longer want the Zope site. But they want the existing content to put in
> their new static boring site. Thus my use of wget.
It should "just work". Having no knowledge of how your Zope
site is put together, I for one have no idea what could be wrong.
wget has a lot of options that are worth exploring.
For producing a locally browsable static
copy of Zope and CMF content, I eventually settled on this,
which changes some file extensions and rewrites links to point
to the local version:
    wget -nc -r -l8 -p -nH --no-parent --convert-links --html-extension
It's not perfect, as for a folder named "foo" you may end up with both
foo.html and foo/index_html.html, both having the same content.
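A minimal sketch of the above, not a tested recipe for any particular site. The function names mirror_zope_site and list_folder_page_pairs are hypothetical, and the start URL is a parameter because the post does not give one. The first function wraps the command with each flag annotated; the second only lists the foo.html files that also have a foo/index_html.html, so the pairs can be reviewed by hand.

```shell
# mirror_zope_site START_URL
# START_URL is a placeholder argument; the original post does not name a URL.
mirror_zope_site() {
  # -nc              skip files that were already downloaded
  # -r -l8           recurse, at most 8 levels deep
  # -p               also fetch the images/CSS needed to render each page
  # -nH              do not create a top-level hostname directory
  # --no-parent      never climb above the starting directory
  # --convert-links  rewrite links to point at the local copies
  # --html-extension append .html to HTML pages whose URL lacks it
  wget -nc -r -l8 -p -nH --no-parent --convert-links --html-extension "$1"
}

# list_folder_page_pairs DIR
# Print every "foo.html" under DIR for which "foo/index_html.html" also exists.
# It does not compare contents: converted relative links may differ between
# the two copies even when the page is the same.
list_folder_page_pairs() {
  find "$1" -type f -name 'index_html.html' | while IFS= read -r index; do
    sibling="$(dirname "$index").html"
    if [ -f "$sibling" ]; then
      printf '%s\n' "$sibling"
    fi
  done
}
```

After a run, calling list_folder_page_pairs on the download directory shows which folders ended up with two copies; which one to keep is a judgment call.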
It also helps if you don't have runaway URLs,
i.e. relative links in your navigation that lead wget to traverse
the same object over and over with URLs like
http://foo/bar/baz/baz/baz/baz/baz/baz/...
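One way to spot such URLs, for example in a list of links or in wget's log, is to look for a path segment repeated several times in a row. This is only a sketch: the helper name has_runaway_path is hypothetical, and the threshold of three consecutive repeats is an assumption, since a site may legitimately have something like /foo/foo.

```shell
# has_runaway_path URL
# Succeeds (exit 0) if some non-empty path segment occurs three or more times
# in a row, as in http://foo/bar/baz/baz/baz; fails (exit 1) otherwise.
# The threshold of three is an assumption, not something wget itself uses.
has_runaway_path() {
  printf '%s\n' "$1" | awk -F/ '
    {
      run = 1
      for (i = 2; i <= NF; i++) {
        # Count how many times the current segment has repeated back to back.
        if ($i != "" && $i == $(i - 1)) { run++ } else { run = 1 }
        if (run >= 3) { found = 1 }
      }
    }
    END { exit (found ? 0 : 1) }'
}
```

For example, `has_runaway_path "$url" && echo "runaway: $url"` flags suspicious entries while reading URLs in a loop.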
> Another interesting thing about using wget with the Zope site is what
> happens if you have a calendar à la Plone. The links to each year are
> followed on and on. And, as each year is at the same level in the
> hierarchy, the level limiting for wget has no effect. What happens is
> that wget can run forever, following the years in the calendar.
Maybe some work on robots.txt could help with this?
Don't know.
--
Paul Winkler
http://www.slinkp.com