Hey,

although there are lots of postings about charset problems with Structured Text, I haven't found an answer to my specific problem.

I have a Zope installation where a large hierarchy of documents has been created over the course of three years. Back then I created a web interface that people in various European countries could use to edit content in structured text format, which is then rendered onto pages automatically. In my hierarchy of folders, I established a system that automatically inserts the correct page encoding in the page's header, depending on the country (ISO-8859-1 and ISO-8859-2 are in use). This works fine as far as the content is concerned. For some reason, nobody ever noticed before that structured text doesn't work well in this scenario.

Reading postings about charset problems with structured text, the solution usually involves setting the locale for the Zope instance to the correct value. The problem is, this Zope instance contains data in different charsets, so this doesn't seem to be the solution here. Is there any solution out there that would let me set the locale (as used by the structured text parser) on a per-folder basis?

This is plain Zope 2.5.1 from Debian Stable.

Oliver Sturm
--
omnibus ex nihilo ducendis sufficit unum
Spaces inserted to prevent google email destruction:
MSN oliver @ sturmnet.org
Jabber sturm @ amessage.de
ICQ 27142619
http://www.sturmnet.org/blog
Oliver Sturm wrote at 2005-2-18 12:37 +0000:
... Reading postings about charset problems with structured text, the solution usually involves setting the locale for the Zope instance to the correct value. The problem is, this Zope instance contains data in different charsets
There is a chance that StructuredText works correctly with Unicode, independent of locale. I am not sure, though; the chance comes from the fact that StructuredText uses regular expressions, and they can handle Unicode correctly (and usually transparently). Whenever you have different charsets in one application, it is always a good idea to look at Unicode!

-- Dieter
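To illustrate the general idea behind this suggestion (not StructuredText's actual code), here is a minimal sketch in modern Python 3, which is not the Python that ships with Zope 2.5.1. It assumes the stored bytes are decoded to Unicode using the folder's charset before any pattern is applied. The pattern `BOLD`, the helper `bold_spans`, and the sample data are all hypothetical. In Python 3, `\w` is Unicode-aware for `str` patterns; in Python 2 the `re.UNICODE` flag would be needed for the same effect.

```python
import re

# \w is Unicode-aware for str patterns in Python 3, so no locale is consulted.
# Hypothetical pattern, not the one used by StructuredText.
BOLD = re.compile(r'\*\*([\w\s]+?)\*\*')


def bold_spans(raw, charset):
    """Decode bytes stored in the given charset, then find **bold** spans."""
    text = raw.decode(charset)
    return BOLD.findall(text)


# Hypothetical sample data: similar markup stored in two different charsets.
latin1_page = "**caf\u00e9**".encode("iso-8859-1")             # e with acute accent
latin2_page = "**\u0142\u00f3d\u017a**".encode("iso-8859-2")   # Polish letters

print(bold_spans(latin1_page, "iso-8859-1"))
print(bold_spans(latin2_page, "iso-8859-2"))
```

The point of the sketch is that once the text is Unicode, one pattern serves both charsets, so no per-folder locale would be required.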
Dieter Maurer wrote:
There is a chance that StructuredText works correctly with Unicode, independent of locale. I am not sure, though; the chance comes from the fact that StructuredText uses regular expressions, and they can handle Unicode correctly (and usually transparently).
Thanks, I tried that. It didn't change anything, though. I've looked at the source code and I found that the problem lies in the Python string.letters expression, which is used to build the regular expressions for the Structured Text implementation.

Actually, I've been able to make modifications to the file DocumentClass.py that make things work for me. I've really simplified the expressions a bit, so I guess there may be other problems with that approach. For example, the original expression for the bold format was constructed like this:

    r'\*\*([%s%s%s\s]+?)\*\*' % (letters, digits, strongem_punc)

I changed this to say:

    r'\*\*([^*]+)\*\*'

Obviously, this may introduce its own problems, but then I can easily explain to my users that they can't format a string bold that contains a * character. How acceptable that is probably depends a little on the content that's supposed to be entered :-)

Oliver Sturm
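The difference between the two expressions can be reproduced outside Zope. The following is a rough sketch in Python 3, where the locale-dependent `string.letters` of Python 2 no longer exists; `string.ascii_letters` stands in for what `string.letters` yields without a matching locale. The `PUNCTUATION` set and the `find_bold` helper are assumptions made for illustration and are not taken from DocumentClass.py.

```python
import re
import string

# Assumption: a simplified stand-in for strongem_punc in DocumentClass.py.
PUNCTUATION = re.escape(".,;:!?'-")

# Whitelist style, as in the original expression: only listed characters match.
# string.ascii_letters stands in for Python 2's locale-dependent string.letters.
whitelist_bold = re.compile(
    r'\*\*([%s%s%s\s]+?)\*\*'
    % (string.ascii_letters, string.digits, PUNCTUATION)
)

# Simplified style from the patch: anything except '*' may appear inside.
simple_bold = re.compile(r'\*\*([^*]+)\*\*')


def find_bold(pattern, text):
    """Return the first bold span found by pattern, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


german = "Das ist **gr\u00f6\u00dfer** als"  # contains o-umlaut and sharp s

print(find_bold(whitelist_bold, german))   # accented letters are not whitelisted
print(find_bold(simple_bold, german))      # negated class accepts them
print(find_bold(simple_bold, "**a*b**"))   # the trade-off: no '*' inside bold
```

The whitelist pattern fails as soon as a character falls outside the listed letters, while the negated class accepts any character except the delimiter, which is why it also rejects bold text containing a `*`.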
Oliver Sturm wrote:
Thanks, I tried that. It didn't change anything, though. I've looked at the source code and I found that the problem lies in the Python string.letters expression, which is used to build the regular expressions for the Structured Text implementation.
Actually, I've been able to make modifications to the file DocumentClass.py that make things work for me. I've really simplified the expressions a bit, so I guess there may be other problems with that approach.
I don't think I mentioned before that I wrote a blog article on this at http://www.sturmnet.org/blog/archives/2005/03/01/zope-codepages-stx/ There's also a download there for the patch I made to DocumentClass.py, in case anybody else is interested in that.

Oliver Sturm
So does 2.7.5 use Python 2.4? How about the new ZODB in 2.7.5? I am still trying to find it, but I thought I read conflicting statements that 2.7.5 does NOT use 2.4 but that the new ZODB requires it. I just wanted to clear that up, as the issues with the dead threads are supposed to be straightened out and I need that. Thanks
[Allen Schmidt]
So does 2.7.5 use Python 2.4? How about the new ZODB in 2.7.5? Am still trying to find it but I thought I read conflicting statements that 2.7.5 does NOT use 2.4 but the new ZODB requires it.
No Zope Corp software requires 2.4, and, indeed, no Zope Corp software officially supports 2.4 yet. A bit to the contrary, a security fix in the Zope 2.7 line triggered a bug in Python 2.4; that will be fixed in Python 2.4.1 (but 2.4.1 hasn't been released yet).
Just wanted to clear that up as the issues with the dead threads are supposed to be straightened out and I need that.
If I were you, and I were speaking for me, I'd use Python 2.4 regardless.
participants (5)
- Allen Schmidt
- Andreas Jung
- Dieter Maurer
- Oliver Sturm
- Tim Peters