[Grok-dev] Tests, Unicode and Fileencoding
Uli Fouquet
uli at gnufix.de
Sat Nov 17 11:50:19 EST 2007
Hi JUH,
Jan Ulrich Hasecke wrote:
> starting to use tests while developing my app,
Good move! Go ahead :-)
> I discovered that you
> can only use unicode strings in tests, when you specified the file
> encoding of the testfile.
>
> So my zoo.txt starts with:
>
> ------snip----------
>
> # -*- coding: utf-8 -*-
>
> =========================
> The Online Game GrokZoo
> =========================
>
> (...)
>
> -----snap-----------
>
> Is that the intended behaviour?
I am no expert on this topic, but I think it is simply Python's default
behaviour rather than anything intended.
> What is the default encoding /bin/test expects?
/bin/test does not expect a particular encoding. It only looks for tests
and runs them. This is good from my point of view, because others might
prefer encodings other than utf8. I think your test setup code is to
blame instead (at least if you have 'borrowed' it from me).
> ASCII? So why ASCII?
Registering doctest files as unittest test suites (which is what your
example code appears to do) usually means calling the Python standard
library function ``doctest.DocFileSuite()`` in the test setup. Have a
look at your test setup code.
``DocFileSuite()`` returns a ``unittest.TestSuite`` and falls back on the
system default encoding. But you can pass an optional ``encoding``
parameter to set up the doc files with a specific, non-default encoding.
See http://docs.python.org/lib/doctest-unittest-api.html
For example::
    import doctest
    import unittest

    def test_suite():
        suite = unittest.TestSuite()
        for filename in DOCTESTFILES:
            suite.addTest(doctest.DocFileSuite(
                filename,
                package=mypackagename,
                setUp=setUpZope,
                tearDown=cleanUpZope,
                encoding='utf8',  ## SET THE ENCODING HERE... ##
                optionflags=(doctest.ELLIPSIS |
                             doctest.NORMALIZE_WHITESPACE),
            ))
        return suite
would expect all your doctest files to be UTF-8 encoded.
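To see the effect of the ``encoding`` argument in isolation, here is a
minimal, self-contained sketch without any Zope setup. It assumes
Python 3, which postdates this thread. The file name, the doctest text
and the helper names ``build_suite`` and ``run_demo`` are made up for
illustration and are not part of Grok or of your app:

```python
import doctest
import os
import tempfile
import unittest

# Hypothetical doctest text containing a non-ASCII character
# ("\u00f6" is o-umlaut). It is written to disk as UTF-8 below.
DOCTEST_TEXT = (
    "The zoo greets its visitors:\n"
    "\n"
    "    >>> name = 'L\u00f6we'\n"
    "    >>> len(name)\n"
    "    4\n"
)


def build_suite(path):
    """Wrap one doctest file in a unittest suite, decoding it as UTF-8."""
    return doctest.DocFileSuite(
        path,
        module_relative=False,  # 'path' is a plain filesystem path
        encoding="utf-8",       # how the bytes of the file are decoded
        optionflags=doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE,
    )


def run_demo():
    """Write the doctest file, run it and return the unittest result."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "zoo.txt")
        with open(path, "wb") as handle:
            handle.write(DOCTEST_TEXT.encode("utf-8"))
        result = unittest.TestResult()
        # The file is read when the suite is built, so build and run
        # while the temporary directory still exists.
        build_suite(path).run(result)
        return result


if __name__ == "__main__":
    print(run_demo().wasSuccessful())
```

If the same file were decoded with a single-byte encoding such as
Latin-1, the two bytes of the umlaut would become two characters and the
``len(name)`` example would no longer match, so the declared encoding has
to agree with how the file was actually saved.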
Marking the doctest files with `# -*- coding: utf-8 -*-`, as you did,
does not look like too much work to me either. Interesting that it
works :-)
Did you get ``UnicodeError`` before?
> Wouldn't utf-8 be better, since we claim to have unicode everywhere
> in Zope?
There is a difference between 'having unicode' and 'everything is utf8
encoded'. Python's current internal unicode representation, for example,
is UCS-2 or UCS-4 (depending on the build), if I remember correctly. If
you meant that everything going into and out of 'Zope' should be utf8
(or utf16), then I happily leave this discussion to the gurus :-)
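The distinction can be made concrete with a small sketch. It assumes
Python 3 string semantics (``str`` is unicode text, ``bytes`` is encoded
data), and the function name ``encoded_sizes`` is purely illustrative:

```python
def encoded_sizes(text):
    """Return the byte length of 'text' in a few common encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16-le": len(text.encode("utf-16-le")),
        "latin-1": len(text.encode("latin-1")),
    }


# One unicode string of four characters ("\u00f6" is o-umlaut) ...
text = "L\u00f6we"

# ... has several different byte representations, and each of them
# decodes back to the very same unicode string.
sizes = encoded_sizes(text)
for codec in sizes:
    assert text.encode(codec).decode(codec) == text

if __name__ == "__main__":
    print(len(text), sizes)
```

The text is the same in every case; only the bytes used to store or
transmit it differ, and that is all an ``encoding`` setting controls.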
With the grok.testing extension, BTW, one could set up a different
default encoding to be expected in doctest files. This could solve that
little problem for Grok. But to be honest, I don't see this as a real
problem and can live without it.
Kind regards,
--
Uli