[Zope-dev] how to download a entire ZWiki site?
Gregor Hoffleit
gregor@hoffleit.de
Wed, 24 May 2000 17:18:40 +0200
On Mon, May 22, 2000 at 07:28:09PM +0400, Jephte CLAIN wrote:
> Where I work, I do not have access to the internet, and as such, I have
> to move software and docs back and forth to my office. I understand that
> it is better for collaborative work to use ZWikis, but I wonder, how do
> I download the entire site to view it offline???
Hmm, offline browsing and Wikis don't mix very well.
For structured wikis, I had some success with wget, grabbing the root
"Entire wiki Contents" page like this:
wget -r -l1 http://www.zope.org/Products/PTK/ZWiki/FrontPage/map
This should traverse all the wiki pages. It won't get things like the
"backlinks" or "offsprings" pages, but it's enough to browse through
a local copy of the pages.
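For anyone who'd rather script this than rely on wget, here is a rough Python sketch of what the one-level traversal from the contents page amounts to: parse the map page, resolve each link, and keep only the ones that stay inside the wiki. The names LinkCollector and wiki_page_urls are made up for illustration, and I'm assuming the map page links to the wiki pages with ordinary <a href> tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def wiki_page_urls(map_url, map_html, wiki_root):
    """Return the absolute URLs under wiki_root that the map page links to.

    map_url   -- URL the map page was fetched from (used to resolve links)
    map_html  -- HTML text of the map page
    wiki_root -- URL prefix that all wiki pages share (an assumption)
    """
    collector = LinkCollector()
    collector.feed(map_html)
    urls = []
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment part.
        absolute = urljoin(map_url, href).split("#", 1)[0]
        # Stay inside the wiki and skip duplicates, keeping first-seen order.
        if absolute.startswith(wiki_root) and absolute not in urls:
            urls.append(absolute)
    return urls
```

Fetching each returned URL once is roughly the -l1 behaviour; it does not follow links found on those pages.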
Here we run into the problem of mapping Zope's traversal paths onto a
filesystem: "wget -r -l2" won't work, since e.g.
"http://www.zope.org/Products/PTK/ZWiki/HowDoIEdit" is an HTML document as
well as a directory (as in
"http://www.zope.org/Products/PTK/ZWiki/HowDoIEdit/backlinks"). What we'd
need is a Zope-aware wget that saves the HTML document as e.g.
.../HowDoIEdit/index.html
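To make the idea concrete, a minimal Python sketch of such a mapping follows. It treats every Zope object as a directory and stores its rendered HTML as index.html inside it, so a page and its sub-pages no longer compete for the same name. The function name local_path and the index_name parameter are hypothetical, and this is only the path mapping, not a full mirroring tool (links inside the saved pages would still need rewriting).

```python
import posixpath
from urllib.parse import urlparse


def local_path(url, index_name="index.html"):
    """Map a Zope URL to a relative file path.

    Every URL becomes a directory holding an index file, so
    .../HowDoIEdit and .../HowDoIEdit/backlinks can coexist on disk.
    """
    parts = urlparse(url)
    path = parts.path.strip("/")
    pieces = [parts.netloc]
    if path:
        pieces.append(path)
    pieces.append(index_name)
    return posixpath.join(*pieces)


page = local_path("http://www.zope.org/Products/PTK/ZWiki/HowDoIEdit")
sub = local_path("http://www.zope.org/Products/PTK/ZWiki/HowDoIEdit/backlinks")
print(page)  # www.zope.org/Products/PTK/ZWiki/HowDoIEdit/index.html
print(sub)   # www.zope.org/Products/PTK/ZWiki/HowDoIEdit/backlinks/index.html
```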
Anyway, if you use this to fill up a web proxy cache (e.g. wwwoffle) that
knows how to handle these cases, it works quite nicely, even with -l2.
I noticed another problem with wget and Zope: ZServer doesn't issue a
Last-modified header, so incrementally updating the pages using
timestamps (-N) fails:
freefly;129> wget -N -r -l1 http://www.zope.org/Products/PTK/ZWiki/FrontPage/map
--17:08:06--  http://www.zope.org/Products/PTK/ZWiki/FrontPage/map
           => `www.zope.org/Products/PTK/ZWiki/FrontPage/map'
Connecting to www.zope.org... connected!
HTTP request sent, awaiting response... 200 OK
Length: 1,745 [text/html]
    0K -> . [100%]
Last-modified header missing -- time-stamps turned off.
17:08:06 (41.56 KB/s) - `www.zope.org/Products/PTK/ZWiki/FrontPage/map' saved [1745/1745]
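If you want to check in advance whether a server will let timestamp-based updates work, something like the following Python sketch could do it. The helper names has_last_modified and fetch_headers are made up here, and I'm assuming the server answers a HEAD request with the same headers it would send for a GET, which may not hold everywhere.

```python
import urllib.request


def has_last_modified(headers):
    """Return True if the headers include a non-empty Last-Modified value.

    headers is any mapping of header names to values; the lookup is
    case-insensitive because HTTP header names are.
    """
    for name, value in headers.items():
        if name.lower() == "last-modified" and value.strip():
            return True
    return False


def fetch_headers(url):
    """Issue a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

Calling has_last_modified(fetch_headers(url)) for one page of the wiki tells you whether incremental updates by timestamp have a chance of working before you start a full recursive run.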
Gregor