[Zope] automagic BOM header at start of utf16 content?
Jürgen Herrmann
Juergen.Herrmann at XLhost.de
Thu Jan 8 05:28:57 EST 2009
On Thu, January 8, 2009 11:04, Andreas Jung wrote:
> On 08.01.2009 10:33 Uhr, Jürgen Herrmann wrote:
>> i already sent the request directly to the zope server
>> omitting our apache proxy and monitored traffic with wireshark. the
>> BOM header comes from zope. i did not find anything in zope's code
>> that heuristically finds out this is utf16 content and prepends the
>> BOM header. so i'm a bit confused where zope takes its wisdom from :)
>> anybody?
>
> I cannot remember having seen any kind of code within the Zope core
> setting the BOM. We have code in the pagetemplate implementation
> interpreting a BOM but I doubt that Zope sends a BOM out by itself
> (especially not for utf-16).
>
> Andreas
i wrote a small python script to check this out, content:
request = container.REQUEST
RESPONSE = request.RESPONSE
RESPONSE.setHeader('Content-Type', 'x-bom-test')
RESPONSE.setHeader('Content-Disposition', 'attachment; filename=bom_test.dat')
ustring = u'sgh sdgh\ns\xf6\xe4\xe4gddp\xe4s\n\u8a0a\u4ee5\u53ca\u76f8\u95dc\u7db2\u7d61\u670d\u52d9'
return ustring.encode('utf16')
here's what wireshark captured:
0000 00 16 17 1e 26 c6 00 1d 09 b8 cf cb 08 00 45 00 ....&... ......E.
0010 01 46 93 51 40 00 40 06 21 f6 c0 a8 01 79 c0 a8 .F.Q@.@. !....y..
0020 01 a1 1f 91 0d 2f a9 b9 5b 33 55 8e 5f 1f 50 18 ...../.. [3U._.P.
0030 1d 50 d1 0b 00 00 48 54 54 50 2f 31 2e 31 20 32 .P....HT TP/1.1 2
0040 30 30 20 4f 4b 0d 0a 53 65 72 76 65 72 3a 20 5a 00 OK..S erver: Z
0050 6f 70 65 2f 28 5a 6f 70 65 20 32 2e 31 30 2e 35 ope/(Zop e 2.10.5
0060 2d 66 69 6e 61 6c 2c 20 70 79 74 68 6f 6e 20 32 -final, python 2
0070 2e 34 2e 34 2c 20 6c 69 6e 75 78 32 29 20 5a 53 .4.4, li nux2) ZS
0080 65 72 76 65 72 2f 31 2e 31 0d 0a 44 61 74 65 3a erver/1. 1..Date:
0090 20 54 68 75 2c 20 30 38 20 4a 61 6e 20 32 30 30 Thu, 08 Jan 200
00a0 39 20 31 30 3a 32 30 3a 34 35 20 47 4d 54 0d 0a 9 10:20: 45 GMT..
00b0 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20 Content- Length:
00c0 36 30 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 70 65 60..Cont ent-Type
00d0 3a 20 78 2d 62 6f 6d 2d 74 65 73 74 0d 0a 43 6f : x-bom- test..Co
00e0 6e 74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f ntent-Di spositio
00f0 6e 3a 20 61 74 74 61 63 68 6d 65 6e 74 3b 20 66 n: attac hment; f
0100 69 6c 65 6e 61 6d 65 3d 62 6f 6d 5f 74 65 73 74 ilename= bom_test
0110 2e 64 61 74 0d 0a 0d 0a ff fe 73 00 67 00 68 00 .dat.... ..s.g.h.
0120 20 00 73 00 64 00 67 00 68 00 0a 00 73 00 f6 00 .s.d.g. h...s...
0130 e4 00 e4 00 67 00 64 00 64 00 70 00 e4 00 73 00 ....g.d. d.p...s.
0140 0a 00 0a 8a e5 4e ca 53 f8 76 dc 95 b2 7d 61 7d .....N.S .v...}a}
0150 0d 67 d9 52 .g.R
look at offset 0x0118 (the bytes ff fe, right after the blank line that ends the headers)...
ok, time to look at repr(ustring.encode('utf16')):
'\xff\xfes\x00g\x00h\x00 \x00s\x00d\x00g\x00h\x00\n\x00s\x00\xf6\x00\xe4\x00'\
'\xe4\x00g\x00d\x00d\x00p\x00\xe4\x00s\x00\n\x00\n\x8a\xe5N\xcaS\xf8v\xdc\x95'\
'\xb2}a}\rg\xd9R'
bam!
i didn't expect that encoding to utf16 would add a BOM header by itself...
sorry for the lengthy post, thought that it might be interesting for
people having to deal with utf16...
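
for anyone who wants to reproduce this outside of zope, here is a minimal sketch using only python's standard codecs. it assumes the behaviour shown above carries over to your interpreter: the plain utf-16 codec prepends a BOM in the machine's native byte order, while the endian-specific codecs utf-16-le and utf-16-be do not. the variable names are just illustrative.

```python
import codecs

# Same test string as in the script above.
ustring = (u'sgh sdgh\ns\xf6\xe4\xe4gddp\xe4s\n'
           u'\u8a0a\u4ee5\u53ca\u76f8\u95dc\u7db2\u7d61\u670d\u52d9')

with_bom = ustring.encode('utf-16')    # BOM + native byte order
le_only = ustring.encode('utf-16-le')  # little-endian, no BOM
be_only = ustring.encode('utf-16-be')  # big-endian, no BOM

# codecs.BOM_UTF16 is the BOM in the machine's native byte order.
has_bom = with_bom.startswith(codecs.BOM_UTF16)

# The BOM accounts for the two extra bytes in the response body.
bom_overhead = len(with_bom) - len(le_only)

print(has_bom, len(ustring), len(le_only), len(with_bom), bom_overhead)
```

the string has 29 characters, so 58 bytes of payload plus the 2-byte BOM gives the Content-Length of 60 seen in the capture. returning ustring.encode('utf-16-le') from the script instead should send the same payload without the leading ff fe, as long as the client is told the byte order some other way (e.g. via the charset in the Content-Type header).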
best regards,
jürgen herrmann