Hello David,

DCS> I want to set up a process for dumping my Zope CMF site to the
DCS> filesystem, to be served by Apache. I'm interested in anyone who's doing
DCS> this - what tools are you using. I'm trying Wget, but the main problem
DCS> is dealing with absolute URLs. I can use the Wget --convert-links
DCS> option, which removes the href attribute from the <base> tag and makes
DCS> internal links relative. However, I still have a problem with folders.
DCS> The absolute_url() method does not return a trailing slash for folders.
DCS> Wget downloads the URL folder_name as a file called folder_name, but it
DCS> downloads folder_name/ as folder_name/index.html. I have already written
DCS> a relativeURL() script based on
DCS> portal_url.getRelativeUrl(), but it
DCS> doesn't return a trailing slash either, so I'll have to add one.

I solved this problem recently. Here is my solution.

1. Make all your URLs end with a slash. At first I did it manually, by
   correcting some lists in portlets; after that I found out how to
   redefine the absolute_url() function. Please look for it here:

2. Run wget. I run it from my Zope in response to a user action, but it
   can also be done with a shell script like the one below. Convert the
   links in the downloaded files and erase the <base ..> tag. I also edit
   the html files to delete 'index.html' from links, so every URL now ends
   with '/'. (*)
   If you wish, you can shrink the files by removing white space - I found
   that white space takes up about 30-40% of an html file.

3. Publish your files.
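To make the link rewrite in step 2 concrete, here is a minimal sketch that applies it to a single line of HTML. The helper name strip_index and the sample line are made up for illustration, and the sed expressions are a slightly tidied variant (escaped dot, '|' as delimiter) of the ones I use, so treat it as an approximation rather than the exact production rule.

```shell
#!/bin/sh
# Hypothetical helper: read HTML on stdin and rewrite links that point at
# index.html so that they end with '/' instead.
strip_index() {
    # 1st rule: href="some/path/index.html" -> href="some/path/"
    # 2nd rule: ="index.html"               -> ="./"
    sed -e 's|href="\([a-zA-Z0-9._/-]*\)/index\.html"|href="\1/"|g' \
        -e 's|="index\.html"|="./"|g'
}

# Made-up sample input with one nested link and one same-directory link.
printf '%s\n' '<a href="docs/index.html">Docs</a> <a href="index.html">Home</a>' \
    | strip_index
```

The first rule only fires when a path precedes index.html, which is why the bare "index.html" link needs the second rule.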
Here's the script:

el@test[<<debug-1/bin]%cat mirror.sh
#!/bin/sh
# With no argument, mirror one level deep starting from the URLs listed in
# wget-list; otherwise fetch only the given path under http://www.test/.
param=$1
if test "$param" = ""; then
    param='-r -l 1 -i ../etc/wget-list'
else
    param="http://www.test/$param"
fi
# $param is left unquoted on purpose so the default value splits into options.
wget -v -nH -k -p -X images -x -R index_html $param
for i in `find ./ -name '*.html'`; do
    infa=`cat $i`
    # Unquoted $infa lets the shell collapse runs of white space (newlines
    # included) into single spaces before sed rewrites the links.
    infa=`echo $infa|sed -e 's/href="\([a-zA-Z0-9._/-]*\)\/index.html"/href="\1\/"/g' \
        -e 's/="index.html"/=".\/"/g' -e 's/<base href=""[^/]*\/>/<!--here was base tag-->/'`
    echo $infa > $i
done
======

The file wget-list contains the extra files that need to be downloaded:

el@test[<<debug-1/bin]%cat ../etc/wget-list
http://www.test/
http://www.test/xtra/head.css
http://www.test/xtra/default.css
http://www.test/xtra/inside.css
====

Addition:
(*) - It's my mania. I hate URLs with a lot of junk, like
http://site/print1.html?foo=bar&sid=4759436545&vasya=pupkine&junks=true&noth.......
The best URL has the format proposed by Tim Berners-Lee:
http://site/section/subsection/page/

-- 
Best regards,
 Eugene                          mailto:el-spam@yandex.ru