[Grok-dev] z3c.testsetup versus docfilesuite encoding
Uli Fouquet
uli at gnufix.de
Wed Oct 21 08:37:14 EDT 2009
Hi there,
Theuni's reply didn't make it to the list, so I am quoting it
in full.
Christian Theune wrote:
> On 10/17/2009 01:03 PM, Uli Fouquet wrote:
> > On Wednesday, 07.10.2009, at 17:40 +0200, Christian Theune wrote:
> >
> >> I noticed some annoyances with z3c.testsetup WRT doctest files and encoding:
> >>
> >> - the default encoding is utf-8 and can not be turned to python's system
> >> default of None because of "or"-ing the optional parameter.
> >
> > You could do::
> >
> >   import sys
> >   import z3c.testsetup
> >   testsuite = z3c.testsetup.register_all_tests(
> >       'mymod', encoding=sys.getdefaultencoding())
>
> No, what I'm referring to is that doctest's internal default is None,
> which seems to avoid encoding/decoding altogether (or something similar).
I wasn't aware that setting encoding to ``None`` skips decoding
completely. Thanks for the hint.
> sys.getdefaultencoding() would usually deliver 'ascii', not None. And
> even if it did, I could not pass it in. This is simply an issue of the
> testsetup API disabling behaviour of the original library due to
> shadowing issues.
>
> A simple fix on your side would be to do
>
> marker = object()
>
> def register_all_tests(... encoding=marker):
>     if encoding is marker:
>         ...
>
> That would allow None as a valid argument.
Thanks, this will go into the next bugfix release.
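To make the difference concrete, here is a minimal sketch of the marker
approach next to the current "or" behaviour. The names
``resolve_encoding``, ``resolve_encoding_with_or`` and ``DEFAULT_ENCODING``
are made up for illustration and are not part of z3c.testsetup; the real
``register_all_tests`` takes more arguments than shown here.

```python
# Unique sentinel object: no caller can pass this value by accident,
# so it reliably means "argument was not given".
_marker = object()

# Assumed package default, as discussed in this thread.
DEFAULT_ENCODING = 'utf-8'


def resolve_encoding(encoding=_marker):
    """Hypothetical helper: fall back to the default only if omitted."""
    if encoding is _marker:
        return DEFAULT_ENCODING
    # Any explicit value, including None, is passed through unchanged.
    return encoding


def resolve_encoding_with_or(encoding=None):
    """The "or" variant: None (and any other falsy value) is shadowed."""
    return encoding or DEFAULT_ENCODING
```

With the marker, ``resolve_encoding(None)`` hands ``None`` through, while
the "or" variant silently turns it back into 'utf-8'.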
> >> - the encoding is only applied to functional docfile suites, but not the
> >> ones that are unit tests
> >
> > I think that should be fixed with the latest release.
>
> Thanks.
>
> >> - can the default please be the same as Python?
> >
> > My experience is that most people that worry about encodings use
> > 'utf-8'. And they often expect 'utf-8' to be handled out-of-the-box.
> > Getting back to Python default encoding would most probably break many
> > tests and could confuse beginners (I can see that there are still many
> > more encoding-related problems with testrunners and doctests).
> >
> > As Python in general is moving towards complete 'utf-8-defaultness', I
> > don't see the point here. Maybe you want to explain your usecase?
> >
> >> Why does one care about that encoding anyway?
> >
> > Uh? For example to handle umlauts. A usecase quite common in
> > internationalized apps. With the encoding set to 'utf-8' you can do
> >
> > >>> myvar = u'ö'
> > >>> myvar
> > u'\xf6'
> >
> > which is not nice, but at least gives a bit of encoding support (for
> > example, ``print myvar`` would not work, as the doctest output parser
> > still seems to expect the Python default encoding). Without setting the
> > encoding, this doctest would not be accepted by the testrunner at all
> > (unless you set the Python default encoding to 'utf-8'), leaving
> > beginners with a cryptic error message not really related to their
> > testcase.
> >
> > As all this is not news to you, I wonder whether I missed your point. Is
> > there a better way to handle encoded strings in doctests?
>
> My point is: whatever doctest does by default already works. I need to
> revalidate this with the example you gave above, though.
I rechecked this. One difference is that when passing, for instance, the
'utf-8' encoding, the following doctest will work:
>>> u'ä'
u'\xe4'
while with encoding set to ``None`` it gives:
>>> u'ä'
u'\xc3\xa4'
(but it works, contrary to my first assumption). I don't know enough
about encodings to say which is better. My feeling is that switching back
to the doctest module's default (``None``) has some advantages. Maybe
someone with more encoding experience can tell?
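For anyone who wants to reproduce the two results outside a doctest, here
is a small sketch. It is written for Python 3, whereas the doctests above
are Python 2, and decoding as latin-1 is only my assumption for mimicking
what seems to happen with ``None``: each byte of the file ends up as its
own code point.

```python
# The letter 'ä' (U+00E4), written as an escape so the example does not
# depend on the encoding of this file.
char = '\xe4'

# The two bytes a UTF-8 encoded doctest file contains for that letter.
raw = char.encode('utf-8')

# Decoded as UTF-8: a single character, matching u'\xe4' above.
decoded = raw.decode('utf-8')

# Not decoded as UTF-8: latin-1 maps every byte 1:1 to a code point,
# which gives two characters, matching u'\xc3\xa4' above.
undecoded = raw.decode('latin-1')

print(len(decoded), len(undecoded))
```

So the second result is a two-character string made of the raw UTF-8
bytes rather than the single character 'ä'.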
Best regards,
--
Uli