Re: [Zope] Big improvement for load_site.py, patch included
On Tue, 24 Jul 2001, Peter Bengtsson wrote:
Some sanitization is done to document ids before uploading them to the ZODB, because I encountered problems trying to upload files whose names contained spaces or accented characters: each invalid character is replaced with an underscore. However, this may produce invalid ids too, so perhaps a better solution is needed.
"space" isn't an illegal character.
What will happen in those cases where M$ is involved and you download a site for offline viewing and then decide to upload it into Zope? You might save the web page as "peter bengtsson.html", and with it comes a folder full of images and stylesheets, all of them referenced using a space.
It's strange, because load_site.py received a 400 HTTP error whenever I tried to upload a .htm file whose name contained spaces, and when I replaced the spaces with underscores all went fine.
could there be another reason I missed ?
urlencoding the space?
cheers, oliver
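To make the suggestion concrete: a raw space in a request path is a plausible cause of a 400 response, and percent-encoding turns it into `%20`. The sketch below uses Python 3's `urllib.parse.quote` (in the Python of 2001 the same function was `urllib.quote`). The helper name `encode_path` is hypothetical and not part of load_site.py.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a URL path, keeping "/" as the separator."""
    # A space becomes %20; "/" is listed as safe so the path keeps its shape.
    return quote(path, safe="/")


encoded = encode_path("site/peter bengtsson.html")
print(encoded)           # site/peter%20bengtsson.html
print(unquote(encoded))  # site/peter bengtsson.html
```

Encoding only the request URL leaves the stored id untouched, so the object would keep its space; whether that is acceptable depends on which ids the server accepts.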
On Tue, 24 Jul 2001, Oliver Bleutgen wrote:
could there be another reason I missed ?
urlencoding the space?
Fine ! It's probably the problem.
But then why did Peter succeed where I didn't ?
Or I haven't understood his message...
Jerome Alet
On Tue, 24 Jul 2001, Oliver Bleutgen wrote:
could there be another reason I missed ?
urlencoding the space?
Fine ! It's probably the problem.
But then why did Peter succeed where I didn't ?
Or I haven't understood his message...
I know! Because I installed a patch that came along on the list a couple of weeks ago. I can't remember why, what, or who did it, but that's got to be the reason. (Or maybe it mattered that I used Opera, if I used Opera at all.)
* Jerome Alet <alet@unice.fr> [010724 15:18]:
On Tue, 24 Jul 2001, Oliver Bleutgen wrote:
could there be another reason I missed ?
urlencoding the space?
Fine ! It's probably the problem.
But then why did Peter succeed where I didn't ?
Or I haven't understood his message...
Jerome Alet
(lazily not looking at the source)
I'm guessing load_site uses a POST or the FTP interface or similar, right? Space is a legal character in Zope, but not in HTTP. So even M$ websites shouldn't have spaces in their names, anyway. Having said this, I've seen some sites which do, and IE handles them OK. Typical ;-)
However, there are some legal HTTP characters which are not legal Zope characters. Off the top of my head, these include '+' and '%'. load_site would need to parse document contents to fix this kind of thing, I imagine.
seb
On Tue, 24 Jul 2001, seb bacon wrote:
(lazily not looking at the source)
Shame on you !-)
I'm guessing load_site uses a POST or the FTP interface or similar, right? Space is a legal character in Zope, but not in HTTP. So even M$ websites shouldn't have spaces in their names, anyway. Having said this, I've seen some sites which do, and IE handles them OK. Typical ;-)
However, there are some legal HTTP characters which are not legal Zope characters. Off the top of my head, these include '+' and '%'. load_site would need to parse document contents to fix this kind of thing, I imagine.
I don't know what load_site uses, because it's encapsulated in ZPublisher/Client.py. However, I suppose the problem only occurs because of the ids (and maybe titles), so there's probably no need to parse the documents' contents, only their ids/titles.
bye, Jerome Alet
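As an illustration of the id sanitization discussed in this thread, here is a minimal Python 3 sketch that replaces unsafe characters with underscores and folds accented letters to their unaccented form instead of losing them. The safe character set, the handling of leading underscores, and the name `sanitize_id` are all assumptions made for the example; none of them is taken from load_site.py or from Zope's own id check.

```python
import re
import unicodedata

# Assumption: an illustrative "safe" set, NOT Zope's actual id rule.
# Space, '+' and '%' all fall outside it.
_UNSAFE = re.compile(r"[^A-Za-z0-9._~-]")


def sanitize_id(name: str) -> str:
    """Return an id built only from the assumed safe character set."""
    # Decompose accented characters ("é" -> "e" + combining accent) and
    # drop the combining marks, so "é" becomes "e" rather than "_".
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Replace every remaining unsafe character with an underscore.
    cleaned = _UNSAFE.sub("_", stripped)
    # Assumption: a leading underscore is itself treated as a problem,
    # so it is removed; an empty result gets a placeholder id.
    return cleaned.lstrip("_") or "unnamed"


print(sanitize_id("peter bengtsson.html"))  # peter_bengtsson.html
print(sanitize_id("résumé 100%.htm"))       # resume_100_.htm
```

Two different file names can map to the same id under this scheme (for example "a b.gif" and "a+b.gif"), so a real uploader would also need to detect collisions.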
* Jerome Alet <alet@unice.fr> [010724 16:15]:
However, there are some legal HTTP characters which are not legal Zope characters. Off the top of my head, these include '+' and '%'. load_site would need to parse document contents to fix this kind of thing, I imagine.
I don't know what load_site uses, because it's encapsulated in ZPublisher/Client.py, however I suppose the problem only occurs because of the ids (and maybe titles), so there's probably no need to parse documents' contents, only their ids/titles
I was thinking of the situation mentioned earlier, where the document called 'index.html' contains a reference to an image with a weird name:
<img src="images/_foo%20bar.gif">
That won't work, because the image name will have to have been changed to be Zope-friendly (foo_bar.gif, for example).
However, I'm butting into this thread a bit without really knowing what you're talking about, so I'll shut up now ;-)
seb
On Tue, Jul 24, 2001 at 05:01:17PM +0100, seb bacon wrote:
I was thinking of the situation mentioned earlier, where the document called 'index.html' contains a reference to an image with a weird name:
<img src="images/_foo%20bar.gif">
That won't work, because the image name will have to have been changed to be Zope-friendly (foo_bar.gif, for example).
However, I'm butting into this thread a bit without really knowing what you're talking about, so I'll shut up now ;-)
No, you were right, and I was completely wrong; now I understand what you meant. I think I have to merge with Marc's version and make it handle accented characters too.
bye, Jerome Alet
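The point about references inside documents can be sketched as follows: if ids are renamed on upload, every relative `src`/`href` in the uploaded HTML has to be renamed the same way, or the links break. This is a rough Python 3 sketch under stated assumptions, not the approach load_site.py or Marc's version takes. `sanitize_id` and `rewrite_refs` are hypothetical helpers, the safe character set is illustrative, and only double-quoted attribute values are handled.

```python
import re
import unicodedata
from urllib.parse import unquote

# Assumption: illustrative safe set, not Zope's actual id rule.
_UNSAFE = re.compile(r"[^A-Za-z0-9._~-]")
# Only double-quoted src/href values are matched in this sketch.
_REF = re.compile(r'\b(src|href)="([^"]*)"', re.IGNORECASE)
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def sanitize_id(name):
    """Hypothetical sanitizer: fold accents, replace unsafe characters."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return _UNSAFE.sub("_", stripped).lstrip("_") or "unnamed"


def rewrite_refs(html):
    """Rewrite relative src/href paths to use the sanitized ids."""
    def fix(match):
        attr, url = match.group(1), match.group(2)
        # Leave absolute URLs (http:, mailto:, ...) and anything with a
        # query string or fragment untouched; this sketch skips them.
        if _SCHEME.match(url) or "?" in url or "#" in url:
            return match.group(0)
        # Decode %20 and friends, then sanitize each path segment the
        # same way the uploaded ids were sanitized.
        segments = [sanitize_id(unquote(s)) if s else s for s in url.split("/")]
        return attr + '="' + "/".join(segments) + '"'
    return _REF.sub(fix, html)


print(rewrite_refs('<img src="images/_foo%20bar.gif">'))
# <img src="images/foo_bar.gif">
```

A regular expression is enough to show the idea, but real-world HTML (single-quoted or unquoted attributes, references inside stylesheets) would call for a proper parser.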
participants (4)
- Jerome Alet
- Oliver Bleutgen
- Peter Bengtsson
- seb bacon