At 10:16 2002-12-16 -0800, Bryan Capitano said:
How does Logictran's R2NET compare to wvWare (which is used by MSWordDocuments on Unix)?
Thanks for a great answer Bryan, Does R2NET on
Johan,
I evaluated wvWare a couple of months ago for a web-to-print project (sharing documents between a website and a printed book publication). wvWare wasn't nearly as feature-rich or robust as R2NET. For example:

1. I was not able to use wvWare to convert DOC/RTF into XML using my own DTD. (I can with R2NET.)

2. wvWare did not recognize some of the more complex RTF control codes for font "styles", tables, or anything much more complicated than plain text. It does recognize fonts, font sizes, and italics/bold/etc. But in Word you can define actual styles that you can re-use or apply to sections of a document. wvWare doesn't capture that style information.

3. In the publishing world, documents often have hidden codes embedded in them. In particular, I was concerned about the RTF codes \xe, \txe, and \tc. In the document these look like: {xe "this looks like an index code."} or, for see-also entries, like this: {xe "trees" \t "See also Shrubs"}. You might also want to use hidden table-of-contents codes embedded in your document, like this: {tc "Chapter 1, Trees and Shrubs" \l 1}. R2NET will extract this information from RTF documents and put it in your XML if you tell it HOW by using the translation files. wvWare can't do this, at least not to my knowledge.
For these reasons, I think wvWare is a good "basic" converter. It's a good first step, and useful for basic doc-->html needs. But if you need more power and extensibility, and if you want to dump Word documents into your own pre-defined XML DTD, then R2NET is worth the $69.
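To make point 3 a bit more concrete, here is a rough Python sketch of pulling the hidden index and table-of-contents codes out of text and turning them into XML. This is not how R2NET does it (that is driven by its translation files); it only handles codes written in the displayed form shown above, and the function name fields_to_xml and the element names index-entry and toc-entry are made up for illustration.

```python
import re
from xml.sax.saxutils import escape

# Matches field codes in the displayed form used in this message, e.g.
#   {xe "trees" \t "See also Shrubs"}   or   {tc "Chapter 1" \l 1}
# Assumption: quoted strings contain no embedded double quotes.
FIELD_RE = re.compile(
    r'\{\s*(?P<kind>xe|tc)\s+"(?P<text>[^"]*)"'
    r'(?:\s+\\t\s+"(?P<see>[^"]*)")?'      # optional see-also text
    r'(?:\s+\\l\s+(?P<level>\d+))?\s*\}',  # optional TOC level
    re.IGNORECASE,
)


def fields_to_xml(source):
    """Return a list of XML strings, one per index or TOC code found."""
    out = []
    for m in FIELD_RE.finditer(source):
        text = escape(m.group("text"))
        if m.group("kind").lower() == "xe":
            see = m.group("see")
            if see is not None:
                # Escape the double quote too, since this goes in an attribute.
                see_attr = escape(see, {'"': "&quot;"})
                out.append('<index-entry see="%s">%s</index-entry>'
                           % (see_attr, text))
            else:
                out.append("<index-entry>%s</index-entry>" % text)
        else:
            # Hypothetical default: treat a missing level as level 1.
            level = m.group("level") or "1"
            out.append('<toc-entry level="%s">%s</toc-entry>' % (level, text))
    return out
```

Calling fields_to_xml on a string containing the two examples above gives one index-entry element (with the see-also text as an attribute) and one toc-entry element. You would rename the elements to whatever your own DTD expects.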
You could also write your own Perl RTF parser by making use of RTF::Tokenizer. I have done this too. It is a more difficult road, but it gives you absolute flexibility. There may be a similar RTF tokenizer for Python?
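If you go the roll-your-own route in Python, the tokenizing step can be sketched with the standard re module. This is not a port of RTF::Tokenizer and does not claim to match its API; it is a simplified illustration that splits RTF source into group delimiters, control words (with an optional numeric parameter), control symbols, and plain text. It makes no attempt to handle \'hh hex escapes, \bin data, or Unicode escapes properly, and the names TOKEN_RE and tokenize are my own.

```python
import re

# One alternative per token kind. A control word is a backslash followed by
# letters, an optional signed number, and one optional delimiting space.
TOKEN_RE = re.compile(
    r"(?P<open>\{)"
    r"|(?P<close>\})"
    r"|\\(?P<word>[A-Za-z]+)(?P<param>-?\d+)? ?"
    r"|\\(?P<symbol>[^A-Za-z])"
    r"|(?P<text>[^\\{}]+)"
)


def tokenize(rtf):
    """Yield (kind, value, param) tuples for a string of RTF source."""
    for m in TOKEN_RE.finditer(rtf):
        if m.group("open"):
            yield ("group_open", "{", None)
        elif m.group("close"):
            yield ("group_close", "}", None)
        elif m.group("word"):
            param = m.group("param")
            yield ("control", m.group("word"),
                   int(param) if param else None)
        elif m.group("symbol"):
            # Escaped characters such as \{ and \} end up here.
            yield ("symbol", m.group("symbol"), None)
        else:
            # Raw line breaks in RTF source are not part of the text.
            text = m.group("text").replace("\r", "").replace("\n", "")
            if text:
                yield ("text", text, None)
```

A parser built on top of this would keep a stack of open groups and watch for the control words it cares about (for example xe and tc), collecting the text tokens inside those groups.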
Best regards, Bryan
Bryan R. Capitano President, CAPITANO WEb CONSULTING Tel: 541-344-0747 Email: Bryan@capitanoweb.com URL: http://www.capitanoweb.com
-- Torped Strategi och Kommunikation AB Johan Carlsson johanc@easypublisher.com Mail: Birkagatan 9 SE-113 36 Stockholm Sweden Visit: Västmannagatan 67, Stockholm, Sweden Phone +46-(0)8-32 31 23 Fax +46-(0)8-32 31 83 Mobil +46-(0)70-558 25 24 http://www.easypublisher.com http://www.torped.se