[Zope] MSWordDocument and Logictran's R2NET

Johan Carlsson [EasyPublisher] johanc@easypublisher.com
Tue, 17 Dec 2002 07:45:54 +0100


At 10:16 2002-12-16 -0800, Bryan Capitano said:

> > How does Logictran's R2NET compare to wvWare (which is used by
> > MSWordDocuments on Unix)?

Thanks for a great answer Bryan,

Does R2NET on

>Johan,
>
>I had evaluated wvWare a couple months ago for a web-to-print project
>(sharing documents between a website and a printed book publication). wvWare
>wasn't nearly as feature-rich or robust as R2NET.
>For example:
>1. I was not able to use wvWare to convert DOC/RTF into XML using my own
>DTD. (I can with R2NET).
>2. wvWare did not recognize some of the more complex RTF control codes for
>font "styles", tables, or anything much more complicated than plain text. It
>does recognize fonts, font sizes, and italics/bold/etc. But in Word you can
>define actual styles that you can re-use or apply to sections of a document.
>wvWare doesn't capture style information.
>3. In the publishing world, documents often have hidden codes embedded in
>the document. In particular, I was concerned about RTF codes \xe, \txe, and
>\tc.  In the document these look like: {xe "this looks like an index code."}
>or see-also entries like this: {xe "trees" \t "See also Shrubs"}. You might
>also want to use some hidden table-of-contents codes embedded in your
>document like this: {tc "Chapter 1, Trees and Shrubs" \l 1}.  R2NET will
>extract this information from RTF documents and put them in your XML if you
>tell it HOW by using the translation files. wvWare can't do this, at least
>not to my knowledge.
>
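To make point 3 concrete, here is a minimal Python sketch that pulls index and table-of-contents entries out of text containing the field forms quoted above. It is not how R2NET or wvWare work internally; the function name `extract_entries` and the two patterns are assumptions made for illustration, and they only cover the exact `{xe ...}` and `{tc ...}` shapes shown in this message, not full RTF.

```python
import re

# Hypothetical patterns for the field forms quoted in the message.
# Real RTF/Word fields have more switches and nesting than this handles.
XE_RE = re.compile(r'\{\\?xe\s+"([^"]*)"(?:\s+\\t\s+"([^"]*)")?\s*\}', re.IGNORECASE)
TC_RE = re.compile(r'\{\\?tc\s+"([^"]*)"(?:\s+\\l\s+(\d+))?\s*\}', re.IGNORECASE)


def extract_entries(text):
    """Return (index_entries, toc_entries) found in text."""
    index = [
        {"term": m.group(1), "see_also": m.group(2)}
        for m in XE_RE.finditer(text)
    ]
    toc = [
        {"title": m.group(1), "level": int(m.group(2)) if m.group(2) else None}
        for m in TC_RE.finditer(text)
    ]
    return index, toc


sample = (
    r'Intro {xe "trees" \t "See also Shrubs"} text '
    r'{tc "Chapter 1, Trees and Shrubs" \l 1} more {xe "shrubs"}'
)
index, toc = extract_entries(sample)
```

The resulting dictionaries could then be written out as elements of whatever XML DTD is in use.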
>For these reasons, I think wvWare is a good "basic" converter. It's a good
>first step, and useful for basic doc-->html needs. But if you need more
>power and extensibility, and if you want to dump Word documents into your
>own pre-defined XML DTD, then R2NET is worth the $69.
>
>You could also write your own Perl RTF parser by making use of
>RTF::Tokenizer. I have done this too. It is a more difficult road, but gives
>you absolute flexibility. There may be a similar RTF tokenizer for Python???
>
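For the roll-your-own route in Python, a tokenizer can be sketched with the standard `re` module. This is an illustrative sketch only, not a port of RTF::Tokenizer: the name `tokenize` and the token tuple layout are assumptions, hex escapes are assumed to be Windows-1252, and details such as `\bin` data and Unicode escapes are ignored.

```python
import re

# One alternative per RTF token type: group braces, control words
# (letters, optional numeric parameter, optional delimiting space),
# \'hh hex escapes, other control symbols, and plain text runs.
TOKEN_RE = re.compile(
    r"(?P<open>\{)"
    r"|(?P<close>\})"
    r"|\\(?P<word>[A-Za-z]+)(?P<param>-?\d+)? ?"
    r"|\\'(?P<hex>[0-9A-Fa-f]{2})"
    r"|\\(?P<symbol>[^A-Za-z])"
    r"|(?P<text>[^\\{}]+)"
)


def tokenize(rtf):
    """Split an RTF string into (kind, value) tuples."""
    tokens = []
    for m in TOKEN_RE.finditer(rtf):
        if m.group("open"):
            tokens.append(("group_open", None))
        elif m.group("close"):
            tokens.append(("group_close", None))
        elif m.group("word"):
            param = m.group("param")
            value = (m.group("word"), int(param) if param else None)
            tokens.append(("control", value))
        elif m.group("hex"):
            # Assumption: the document uses the Windows-1252 code page.
            char = bytes.fromhex(m.group("hex")).decode("cp1252", errors="replace")
            tokens.append(("text", char))
        elif m.group("symbol"):
            tokens.append(("symbol", m.group("symbol")))
        else:
            # Bare line breaks carry no meaning in RTF text runs.
            text = m.group("text").replace("\r", "").replace("\n", "")
            if text:
                tokens.append(("text", text))
    return tokens


tokens = tokenize(r"{\b1 Hi}")
```

A parser built on top of this would track the group nesting and decide, per control word, whether the following text is body content or a hidden destination such as an index entry.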
>Best regards,
>Bryan
>
>
>Bryan R. Capitano
>President,
>CAPITANO WEB CONSULTING
>Tel: 541-344-0747
>Email: Bryan@capitanoweb.com
>URL: http://www.capitanoweb.com
>

-- 
Torped Strategi och Kommunikation AB
Johan Carlsson
johanc@easypublisher.com

Mail:
Birkagatan 9
SE-113 36  Stockholm
Sweden

Visit:
Västmannagatan 67, Stockholm, Sweden

Phone +46-(0)8-32 31 23
Fax +46-(0)8-32 31 83
Mobil +46-(0)70-558 25 24
http://www.easypublisher.com
http://www.torped.se