[ZPT] response.setHeader() best practice
Alan Kennedy
zpt@xhaus.com
Wed, 10 Apr 2002 09:41:55 EST5EDT
[Alan wrote]
>>Moreover, I believe the most efficient way to deal with
>>multiple character encodings is to decide the character
>>encoding before template evaluation begins, and transcode
>>the output of the template as it is generated.
[Toby wrote]
> I don't know about *most* efficient.... all of the proposed solutions
> seem to be efficient enough. Even done badly, it's hard to waste
> processor time on something as simple as character encoding.
I wrote that carelessly. I didn't just mean CPU efficiency. I
also meant memory efficiency. But I mainly meant code
cleanliness and separation (see below about output chains).
I agree that character encoding does not have high CPU cost.
[Toby wrote]
> However, your approach significantly reduces flexibility. It assumes
> that the textual output will always be directly squashed into 8 bits.
Mmmmm, I don't think so. I do assume that it will be
transcoded to whatever encoding the user expects. That
doesn't necessarily mean 8 bits though. If the user
states "Accept-charset: utf-16" then that is indeed what
they will get.
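
A minimal sketch of that model in Python (the same idea carries over to Jython): pick the charset before any output is produced, then encode each chunk as the template emits it. The names choose_charset and transcode, the list of supported charsets, and the decision to ignore q-values are all assumptions made for illustration, not part of any ZPT API.

```python
import codecs

def choose_charset(accept_charset,
                   supported=("utf-8", "utf-16", "iso-8859-1"),
                   default="utf-8"):
    """Return the first supported charset named in an Accept-Charset value.

    Simplification: q-values and the "*" wildcard are ignored.
    """
    for item in accept_charset.split(","):
        name = item.split(";")[0].strip().lower()
        if name in supported:
            return name
    return default

def transcode(chunks, charset):
    """Encode text chunks one at a time, as they are generated."""
    encoder = codecs.getincrementalencoder(charset)()
    for chunk in chunks:
        data = encoder.encode(chunk)
        if data:
            yield data
    # Flush any state the encoder is still holding.
    tail = encoder.encode("", final=True)
    if tail:
        yield tail
```

Because the encoder is incremental, the whole page never has to be held in memory as one string, and a request for utf-16 yields utf-16 rather than an 8-bit encoding.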
> It is not uncommon to perform further textual processing on the
> output of presentation logic, and having the output as unicode is the
> most appropriate option.
I should have explained myself more clearly. I am working
in Jython and Java, so everything is already stored in
UTF-16. If I want to keep the output in UTF-16, then I just
specify an output encoding of UTF-16, i.e. no transcoding.
Or change it to UTF-16LE or UTF-16BE, or whatever
other "global coverage" character set.
So further processing is not precluded in my model.
But I am at a loss to think of any "further textual
processing" I might want to do. The only things I can think
of that I would want to do are packet related, not content
related, e.g.
GZip compression
MD5 "checksum"
Etc.
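
Steps like these work on the bytes that come out of the transcoder, not on the text. A rough sketch in Python, where the function name package is hypothetical and the choice of what to checksum (the uncompressed bytes here) is an assumption, not something settled in this thread:

```python
import gzip
import hashlib

def package(body: bytes):
    """Return (compressed_body, md5_hex) for an already-encoded response body."""
    # The digest is taken over the encoded, uncompressed bytes.
    digest = hashlib.md5(body).hexdigest()
    # Compression never needs to know which character encoding was used.
    compressed = gzip.compress(body)
    return compressed, digest
```

Neither step looks at characters, which is why both can sit after the transcoding step.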
There is other processing that I do, e.g. URL rewriting,
HTML element and attribute minimisation, etc, but I do all
that before the output hits the output transcoder, because
the output from my templates is SAX2 events, which pass
through chains of SAX2 filters, with the output/transcoding
step left until very last.
Keeping the output as structured SAX2 events until the very
last minute eliminates the need for "further textual
processing", i.e. I don't have to go parsing textual HTML
looking for a/@href and form/@action attributes to modify.
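
To illustrate the chain, here is a small sketch using Python's xml.sax module. It is not my actual code: the class name URLRewriter, the function render, and the rewrite callback are made up for the example, and an XML parser stands in for the template as the source of SAX events.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl

class URLRewriter(XMLFilterBase):
    """SAX filter that rewrites a/@href and form/@action as events pass by."""

    TARGETS = {"a": "href", "form": "action"}

    def __init__(self, parent, rewrite):
        super().__init__(parent)
        self._rewrite = rewrite

    def startElement(self, name, attrs):
        attr = self.TARGETS.get(name)
        if attr and attr in attrs:
            updated = dict(attrs.items())
            updated[attr] = self._rewrite(updated[attr])
            attrs = AttributesImpl(updated)
        # Pass the (possibly modified) event down the chain.
        super().startElement(name, attrs)

def render(source_xml, rewrite, encoding="utf-8"):
    """Run events through the filter; transcode only at the very last step."""
    out = io.BytesIO()
    serializer = XMLGenerator(out, encoding=encoding)  # the output transcoder
    chain = URLRewriter(xml.sax.make_parser(), rewrite)
    chain.setContentHandler(serializer)
    chain.parse(io.BytesIO(source_xml.encode("utf-8")))
    return out.getvalue()
```

Calling render('<p><a href="/x">go</a></p>', lambda url: "/s1" + url) rewrites the link without any textual search through the serialized HTML, and the character encoding is chosen only where the events are finally written out.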
[Toby wrote]
> There are other areas in Zope where a method has this dual personality.
> Its main job is to calculate a string, perform a management task, or
> whatever, but it also has to do something extra with the REQUEST or
> RESPONSE if it is the top level published method.
...
> Merging the two personalities I mentioned above would be traditional
> for Zope, but it has proved to be problematic. Separation is good.
Getting back to the thread title "response.setHeader() best
practice", I think we've clearly established that "best
practice" is relative :-)
The reason why I posted originally was because I didn't
like seeing content meta-data being set from within the
content.
But that all happens because of CPU efficiency concerns,
i.e. does one parse or not parse the document before it is
transmitted? I think that the original intention of
html/meta/@http-equiv was to allow the server to work out
the character encoding, etc, by parsing the document if it
wished, so that the correct HTTP headers could be set.
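
As a rough illustration of what that server-side parsing could look like, here is a sketch using Python's html.parser. The class name HttpEquivScanner is hypothetical, and this is only my reading of the intent, not a description of what any real server does.

```python
from html.parser import HTMLParser

class HttpEquivScanner(HTMLParser):
    """Collect <meta http-equiv=... content=...> pairs as candidate HTTP headers."""

    def __init__(self):
        super().__init__()
        self.headers = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        meta = dict(attrs)
        name = meta.get("http-equiv")
        value = meta.get("content")
        if name and value is not None:
            self.headers[name] = value
```

Feeding the scanner a document and then reading scanner.headers gives the values a server could copy into the real response headers, at the cost of parsing every document before it is sent.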
If you're in the real world, and don't want to waste the
CPU cycles parsing the document, then giving the programmer
a shortcut is a good idea. Whether that shortcut is
1. a function call or
2. a declared attribute that gets passed to a function call
is of minor (philosophical) concern.
But I still stand by my processing model, which avoids both
approaches :-)
Regards, and thanks for the discussion. It clarified my
thoughts a great deal.
Alan.