[Zope-CMF] portal_transformation notes
seb bacon
seb@jamkit.com
Thu, 23 Jan 2003 12:18:41 +0000
Chris Withers wrote:
> seb bacon wrote:
>> Then you can convert a word document to html to structured text, etc.
>> (That'll be a common use case, then ;-)
>
>
> How far have you got on this? :-)
Well, for current purposes, I just have to convert a few MS docs into
text, and can't justify the extra time required to make it really
generic; but I've been playing with different "pluggable" designs as I go.
The code I've written so far is basically some "use an external tool to
produce output" stuff (which also works when the tool produces more than
one bit of output, e.g. html + images) with a fairly generic framework.
But it's not a CMF tool, it doesn't chain transformations together
automatically, and the conversion logic is hardwired into the File type.
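To make the "use an external tool to produce output" idea concrete, here is a rough sketch of what such a wrapper might look like. It is not the code described above: the name run_external_tool and the convention that the tool writes its results into a scratch directory are assumptions made up for the example, and the Python interpreter stands in for a real converter.

```python
import os
import subprocess
import sys
import tempfile


def run_external_tool(command, data, input_name="input.dat"):
    """Run a (hypothetical) converter in a scratch directory.

    Returns a dict mapping each file the tool produced to its bytes,
    so a tool that emits e.g. html plus images is handled as well.
    """
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, input_name), "wb") as f:
            f.write(data)
        # ASSUMPTION: the tool takes the input file name as its last
        # argument and writes its output files into the current directory.
        subprocess.run(command + [input_name], cwd=workdir, check=True)
        outputs = {}
        for name in sorted(os.listdir(workdir)):
            if name == input_name:
                continue
            with open(os.path.join(workdir, name), "rb") as f:
                outputs[name] = f.read()
        return outputs


# Stand-in "tool": a Python one-liner that writes two output files.
FAKE_TOOL = [
    sys.executable,
    "-c",
    "import sys; src = open(sys.argv[1], 'rb').read(); "
    "open('out.html', 'wb').write(b'<p>' + src + b'</p>'); "
    "open('image1.png', 'wb').write(b'fake image')",
]

result = run_external_tool(FAKE_TOOL, b"hello")
print(sorted(result))
```

Returning a dict of every produced file, rather than a single string, is what lets one call cover the "html + images" case.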
I probably won't get a chance to make this tool either, but I have
been thinking about it.
One thing I'm not clear on is how I would produce transformation chains
automatically. I've not really thought about it a lot, but here are
some starting ideas. A transformer plugin will register inputs and
outputs using mime-types:
STXTransformer:
    _inputs = {'text/x-structured-text':10,
               'text/plain'}
    _outputs = {'text/html':10,
                'text/plain':10,
                'text/x-structured-text':10}

PDFTransformer:
    _inputs = {'text/plain':7,
               'application/postscript':10,
               'text/html':6,
               'application/pdf':10
              }
    _outputs = {'application/pdf':10,
                'text/plain':9}

HTMLTransformer:
    _inputs = {'text/html':10,
               'text/plain':8,
               'application/pdf':5}
    _outputs = {'text/html':10,
                'text/plain':7}
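As a minimal sketch of how plugins like these might register with the tool, the declarations can be written as Python classes and collected in a registry. The Transformer base class, the TransformationRegistry class and its method names are all hypothetical, invented for this example. STXTransformer is left out because its 'text/plain' input weighting is not given in the listing.

```python
class Transformer:
    """Base class: subclasses declare {mime-type: weighting} dicts."""
    _inputs = {}
    _outputs = {}


class PDFTransformer(Transformer):
    _inputs = {'text/plain': 7,
               'application/postscript': 10,
               'text/html': 6,
               'application/pdf': 10}
    _outputs = {'application/pdf': 10,
                'text/plain': 9}


class HTMLTransformer(Transformer):
    _inputs = {'text/html': 10,
               'text/plain': 8,
               'application/pdf': 5}
    _outputs = {'text/html': 10,
                'text/plain': 7}


class TransformationRegistry:
    """Hypothetical stand-in for the tool's list of plugins."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin_class):
        self._plugins.append(plugin_class)

    def direct_converters(self, source, target):
        """Names of plugins that accept `source` and can emit `target`."""
        return [p.__name__ for p in self._plugins
                if source in p._inputs and target in p._outputs]


registry = TransformationRegistry()
registry.register(PDFTransformer)
registry.register(HTMLTransformer)

print(registry.direct_converters('text/html', 'application/pdf'))
print(registry.direct_converters('application/pdf', 'text/html'))
```

This only answers the single-step question (who can go from one type to another directly); chaining several plugins needs a search over the registered types.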
Since different people may write different plugins, there could be
several different routes for the tool to choose to convert html to a
pdf. In the above example, I could go:
html -> HTMLTransformer -> plain -> PDFTransformer -> PDF
or:
html -> PDFTransformer -> PDF
Furthermore, I could convert HTML to text using an STXTransformer,
without ever using STX at all!
The values in the dictionaries are weightings to allow you to chain
together the most efficient set of transformers.
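One way the tool might pick a chain is to enumerate every route between two mime-types and rank them by weighting. The sketch below is only that, a sketch: the scoring rule is an assumption (higher weightings are treated as better, a step scores input weight times output weight over 100, and a route's score is the product of its steps), and the names TRANSFORMERS, find_routes and best_route are made up for the example. STXTransformer is omitted since it cannot produce application/pdf.

```python
# Weightings copied from the PDFTransformer and HTMLTransformer listing.
TRANSFORMERS = {
    "PDFTransformer": {
        "inputs": {"text/plain": 7, "application/postscript": 10,
                   "text/html": 6, "application/pdf": 10},
        "outputs": {"application/pdf": 10, "text/plain": 9},
    },
    "HTMLTransformer": {
        "inputs": {"text/html": 10, "text/plain": 8, "application/pdf": 5},
        "outputs": {"text/html": 10, "text/plain": 7},
    },
}


def find_routes(source, target, seen=None):
    """Yield (steps, score) for every route from source to target.

    Each step is a (transformer_name, output_mime_type) pair. No
    mime-type is visited twice, so the search always terminates.
    """
    seen = (seen or frozenset()) | {source}
    for name, spec in TRANSFORMERS.items():
        if source not in spec["inputs"]:
            continue
        for output, out_weight in spec["outputs"].items():
            if output in seen:
                continue
            # ASSUMPTION: step score = input weight * output weight / 100.
            step_score = spec["inputs"][source] * out_weight / 100
            if output == target:
                yield [(name, output)], step_score
            else:
                for steps, score in find_routes(output, target, seen):
                    yield [(name, output)] + steps, step_score * score


def best_route(source, target):
    """Return the highest-scoring (steps, score) pair, or None."""
    return max(find_routes(source, target),
               key=lambda route: route[1], default=None)


routes = sorted(find_routes("text/html", "application/pdf"),
                key=lambda route: route[1], reverse=True)
for steps, score in routes:
    print(round(score, 3), " -> ".join(name for name, _ in steps))
```

Under this assumed scoring the direct html -> PDFTransformer -> PDF route comes out ahead of the one that goes through HTMLTransformer and plain text; a different scoring rule could just as easily rank them the other way.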
Any thoughts, additions, problems?
seb