Hello Zopistas,

I've been wrestling with Eric Barroca's MSWordDocument product for days now, trying to get the HTML version of a Word file to output in my DTML document. I've tried:

<dtml-var name="MyWordDoc">

but it seems to print only "MSWordDocument instance at... ".

I also went to the product's source and saw a method that looks like what I need, getDocHTML. So I tried:

<dtml-var expr="MyWordDoc.getDocHTML()">

but now it gives me an "AttributeError".

I'm relatively new to Zope and Python, so I would really appreciate it if any of you more experienced Zope people could help.

-- Angelo Abarentos, a Zope newbie
With a brand new Plone site, and while not logged in, I requested http://danielle.zettai.net/New_plone/index_html/manage_edit

I keep waiting: the Internet Explorer 6.0 window no longer responds. I notice that the title of the requested page is http://danielle.zettai.net/New_plone/login_form?came_from=http:3A//danielle....

If I request http://danielle.zettai.net/New_plone/login_form?came_from=http://danielle.ze... in another window, it works as it should: I am asked to sign in. Meanwhile, I have no such problems with Mozilla.

I looked at the source and I can see "irregularities": <a href=" http://... with a space between " and http.

I am ready to give up on Plone anyway, as I have already encountered too many problems. But just to be reassured, I looked at the rendered HTML of a new Zope CMF portal (noname), and I can see the same fault there as well, only less often (once in index_html). At first it doesn't seem to have any consequences: the link <a href=" http://danielle.zettai.net/noname/join_form">Join</a> works. But if I try to import the URL http://danielle.zettai.net/noname/ into HTML toolkit, it turns out not to respond. By contrast, http://danielle.zettai.net/noname/index_html works, and so does http://danielle.zettai.net/New_plone/

Here is Tidy's comment on the Zope CMF portal index_html:

"URIs must be properly escaped, they must not contain unescaped characters below U+0021 including the space character and not above U+007E. Tidy escapes the URI for you as recommended by HTML 4.01 section B.2.1 and XML 1.0 section 4.2.2. Some user agents use another algorithm to escape such URIs and some server-sided scripts depend on that. If you want to depend on that, you must escape the URI by your own. For more information please refer to http://www.w3.org/International/O-URL-and-ident.html"
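For anyone who wants to check their own rendered pages for the same fault, here is a minimal Python sketch that finds href attributes whose value begins with whitespace. The function names are made up for illustration, and the regular expression only handles quoted attribute values; it is a quick diagnostic, not an HTML parser.

```python
import re

# href="..." or href='...' whose value starts with at least one whitespace
# character. Group 1 is the quote character, group 2 the URL after the blank.
HREF_LEADING_SPACE = re.compile(r'''href\s*=\s*(["'])\s+([^"']*)\1''',
                                re.IGNORECASE)


def find_padded_hrefs(html):
    """Return the URL part of every href whose value begins with whitespace."""
    return [m.group(2) for m in HREF_LEADING_SPACE.finditer(html)]


def strip_padded_hrefs(html):
    """Return html with the leading whitespace removed from such hrefs."""
    def fix(match):
        quote, url = match.group(1), match.group(2)
        return 'href=%s%s%s' % (quote, url, quote)
    return HREF_LEADING_SPACE.sub(fix, html)
```

Running find_padded_hrefs over the saved page source lists the affected links. Rewriting the output only hides the symptom; the blank still has to be removed wherever the URL is generated.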
Solved, thanks to geoff@geoffdavis.net and george@zettai.net!

Under circumstances that are not yet clear, with Internet Explorer 6.0 on Windows XP 6.02, the blank between " and http in <a href=" http://votredomaine.net, produced when rendering the index_html at the root of a Plone site, can make the site inaccessible. This blank makes the URI malformed because it is not properly escaped (URIs must not contain unescaped characters below U+0021, including the space character, nor above U+007E).

The offending blank comes after "string:" in the fields of the Actions tabs of portal_actions, portal_syndication, portal_properties, or portal_undo ...; just remove it there, using Mozilla if you have the same problem with Internet Explorer as I do. (Once the corrections are made, it may be necessary to reboot before Internet Explorer works properly.)

----- Original Message -----
From: danielle.d-avout
To: Zope
Sent: Monday, December 16, 2002 2:08 PM
Subject: [Zope] Plone and Zope: space character in URI.....
I have not tried the MSWordDocument product before. That's interesting, thanks for sharing it. I am familiar with a commercial product from Logictran called 'R2NET'. With this software you can easily convert Word (RTF) files to HTML or XHTML or XML. I use the product extensively at the Linux command line. It is easy to use, very powerful and robust. It gives you lots of control over how documents are converted through a translation file which you can customize if you want more custom output. I think it would be easy to plug into Zope. Bryan
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
At 07:55 2002-12-16 -0800, Bryan Capitano said:
I have not tried the MSWordDocument product before. That's interesting, thanks for sharing it. I am familiar with a commercial product from Logictran called 'R2NET'. With this software you can easily convert Word (RTF) files to HTML or XHTML or XML. I use the product extensively at the Linux command line. It is easy to use, very powerful and robust. It gives you lots of control over how documents are converted through a translation file which you can customize if you want more custom output. I think it would be easy to plug into Zope. Bryan
How does Logictran's R2NET compare to wvWare (which is used by MSWordDocument on Unix)? They seem quite similar.

Regards,
Johan Carlsson

-- Torped Strategi och Kommunikation AB Johan Carlsson johanc@easypublisher.com Mail: Birkagatan 9 SE-113 36 Stockholm Sweden Visit: Västmannagatan 67, Stockholm, Sweden Phone +46-(0)8-32 31 23 Fax +46-(0)8-32 31 83 Mobil +46-(0)70-558 25 24 http://www.easypublisher.com http://www.torped.se
Johan,

I evaluated wvWare a couple of months ago for a web-to-print project (sharing documents between a website and a printed book publication). wvWare wasn't nearly as feature-rich or robust as R2NET. For example:

1. I was not able to use wvWare to convert DOC/RTF into XML using my own DTD. (I can with R2NET.)

2. wvWare did not recognize some of the more complex RTF control codes for font "styles", tables, or anything much more complicated than plain text. It does recognize fonts, font sizes, and italics/bold/etc. But in Word you can define actual styles that you can re-use or apply to sections of a document. wvWare doesn't capture style information.

3. In the publishing world, documents often have hidden codes embedded in the document. In particular, I was concerned about RTF codes \xe, \txe, and \tc. In the document these look like: {xe "this looks like an index code."} or see-also entries like this: {xe "trees" \t "See also Shrubs"}. You might also want to use some hidden table-of-contents codes embedded in your document like this: {tc "Chapter 1, Trees and Shrubs" \l 1}. R2NET will extract this information from RTF documents and put it in your XML if you tell it HOW by using the translation files. wvWare can't do this, at least not to my knowledge.

For these reasons, I think wvWare is a good "basic" converter. It's a good first step, and useful for basic doc-->html needs. But if you need more power and extensibility, and if you want to dump Word documents into your own pre-defined XML DTD, then R2NET is worth the $69.

You could also write your own Perl RTF parser by making use of RTF::Tokenizer. I have done this too. It is a more difficult road, but gives you absolute flexibility. There may be a similar RTF tokenizer for Python???

Best regards,
Bryan

Bryan R. Capitano President, CAPITANO WEb CONSULTING Tel: 541-344-0747 Email: Bryan@capitanoweb.com URL: http://www.capitanoweb.com
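On the Python question, and only as a rough illustration: the index and table-of-contents codes shown above can be pulled out of text with a couple of regular expressions. This sketch matches the simplified notation used in this message (with or without a leading backslash on the keyword). It is not an RTF tokenizer, it ignores nested groups and escaped characters, and the function names are invented for the example.

```python
import re

# {xe "entry"} or {xe "entry" \t "see-also text"}; the keyword may also be
# written \xe. Group 1 is the entry, group 2 the optional see-also text.
XE_FIELD = re.compile(r'\{\\?xe\s+"([^"]*)"(?:\s+\\t\s+"([^"]*)")?\s*\}')

# {tc "title"} or {tc "title" \l 1}; group 1 is the title, group 2 the level.
TC_FIELD = re.compile(r'\{\\?tc\s+"([^"]*)"(?:\s+\\l\s+(\d+))?\s*\}')


def index_entries(text):
    """Return (entry, see_also or None) for each index code in text."""
    return [(m.group(1), m.group(2)) for m in XE_FIELD.finditer(text)]


def toc_entries(text):
    """Return (title, level or None) for each table-of-contents code."""
    return [(m.group(1), int(m.group(2)) if m.group(2) else None)
            for m in TC_FIELD.finditer(text)]
```

Anything beyond these flat cases would need a real tokenizer that tracks RTF group nesting, which is the harder road described above.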
At 10:16 2002-12-16 -0800, Bryan Capitano said:
How does Logictran's R2NET compare to wvWare (which is used by MSWordDocument on Unix)?
Thanks for a great answer Bryan, Does R2NET on
-- Torped Strategi och Kommunikation AB Johan Carlsson johanc@easypublisher.com Mail: Birkagatan 9 SE-113 36 Stockholm Sweden Visit: Västmannagatan 67, Stockholm, Sweden Phone +46-(0)8-32 31 23 Fax +46-(0)8-32 31 83 Mobil +46-(0)70-558 25 24 http://www.easypublisher.com http://www.torped.se
That sounds cool. I'll check it out.

I have actually found another alternative to MSWordDocument called NuxDocument, which supports other formats too. I will try this one first and see if it solves my problem.

Thanks to all who helped!

-- Angelo
--On 17 December 2002 14:56 +0800 "Jose Angelo P. Abarentos" <jiggs@mac.com> wrote:
That sounds cool. I'll check it out.
I beat you to it - it looks extremely cool. After a bit of tweaking I have r2net on Windows converting Word docs with complicated equations to HTML in one fell swoop. Putting Zope in front of it should not be a big deal (as it offers a command line interface). See <http://www.bris.ac.uk/is/projects/cms/logictran/logictran.html> and <http://www.bris.ac.uk/is/projects/cms/logictran/logictran.doc> for the original.
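To make the "Zope in front of a command-line converter" idea concrete, here is a minimal Python sketch of the kind of helper that could sit in an External Method or a product. It writes the uploaded bytes to a temporary file, runs an external command, and returns the converted output. Everything specific is an assumption: the function name, the {src}/{dst} placeholders and the file suffixes are invented for the example, and the real r2net arguments are not shown because they have to come from its own documentation.

```python
import os
import shutil
import subprocess
import tempfile


def convert_document(data, command, in_suffix=".rtf", out_suffix=".html"):
    """Run an external converter on data (bytes) and return its output bytes.

    command is a list of arguments; the hypothetical placeholders "{src}" and
    "{dst}" are replaced by the temporary input and output file paths.
    """
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "input" + in_suffix)
    dst = os.path.join(workdir, "output" + out_suffix)
    try:
        with open(src, "wb") as f:
            f.write(data)
        argv = [arg.replace("{src}", src).replace("{dst}", dst)
                for arg in command]
        # Raises CalledProcessError if the converter exits with an error.
        subprocess.check_call(argv)
        with open(dst, "rb") as f:
            return f.read()
    finally:
        # Remove the scratch directory whether or not the conversion worked.
        shutil.rmtree(workdir, ignore_errors=True)
```

Keeping the command as a parameter means the same helper can be pointed at any converter that reads one file and writes another.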
I have actually found another alternative to MSWordDocument called NuxDocument, which supports other formats too.
I found (about a year ago now) that wvWare-based products couldn't handle graphics/equations very well but things might have developed a bit. Certainly Open Office 1.0 was able to open the logictran.doc and _almost_ render it correctly - so NuxDocument and the OO plug-in might be able to compete with r2net with really horrid Word files.
I will try this one first and see if it solves my problem.
Let the list know how you get on .... Paul
-- The Library, Tyndall Avenue, Univ. of Bristol, Bristol, BS8 1TJ, UK E-mail: paul.browning@bristol.ac.uk URL: http://www.bris.ac.uk/