DocumentLibrary FileConverter / pdf.py problem (write to /tmp)
So far, DocumentLibrary seems to be a wonderful product: very useful, organized, and apparently very easy to customize. I'm impressed. I am, however, having a bit of a problem...

I installed DocumentLibrary on Zope 2.3.2 the other day. I have a problem with the PDF conversion, though I have verified that pdftotext works properly on all my files. I commented out the code in pdf.py that deletes the file in /tmp, so that I could diff the output against the original PDF, but I didn't need to; the file sizes were radically different:

www:/tmp# ls -l *pdf*
-rw-r--r-- 1 www-data www-data 123512 Jun 1 20:34 pdf_tmp_991452845.246
-rw-r--r-- 1 root root 320120 Jun 2 16:26 text.pdf

...where text.pdf is a copy of the original. Hexdump output shows the two files to be identical up to the point where the PDF dumped out of Zope onto the filesystem is abruptly truncated. I've tried several PDFs generated from different sources.

My config is the binary package install of Zope on Debian, using the 2.3.2-2 deb under Debian unstable/intel, with nothing fancy in the setup. The wvWare/Word converter works fine. I tested several different packages and source installs of xpdf, and all worked fine (this was before I realized that this wasn't a pdftotext issue).

www-data (the user Zope runs as on Debian) has write privileges to /tmp, and the file does get written there part way. What might prevent the pdf.py code from fully writing the file from the ODB to /tmp? Any thoughts would be appreciated...

Sean

P.S. I am interested in possibly contributing effort to write or integrate other sorts of document/file converters into DocumentLibrary. Specifically, I might be interested in adding the ability to treat raster images (JPG, etc.) as documents, using the binary IPTC caption data (the caption you can add to an image in Photoshop) as the full text...
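For anyone who wants to repeat the truncation check without reading hexdump output by eye, here is a minimal sketch in modern Python (not the Python that ships with Zope 2.3.2). The function name compare_prefix and its return tuple are my own invention for illustration; nothing here comes from DocumentLibrary or pdf.py. It only reports whether the dumped file is a byte-for-byte prefix of the original, and if not, where the first difference is.

```python
import os


def compare_prefix(dumped_path, original_path, chunk_size=65536):
    """Check whether the dumped file is a byte-for-byte prefix of the original.

    Returns (is_prefix, dumped_size, original_size, first_diff_offset).
    first_diff_offset is None when every byte of the dump matches.
    """
    dumped_size = os.path.getsize(dumped_path)
    original_size = os.path.getsize(original_path)
    offset = 0
    with open(dumped_path, "rb") as dumped, open(original_path, "rb") as original:
        while True:
            a = dumped.read(chunk_size)
            if not a:
                break  # reached the end of the dump with no mismatch
            b = original.read(len(a))
            if a != b:
                # Find the first differing byte inside this chunk.
                common = min(len(a), len(b))
                for i in range(common):
                    if a[i] != b[i]:
                        return False, dumped_size, original_size, offset + i
                # No differing byte: the dump is longer than the original.
                return False, dumped_size, original_size, offset + common
            offset += len(a)
    return True, dumped_size, original_size, None


# Hypothetical usage with the two files from the listing above:
# result = compare_prefix("/tmp/pdf_tmp_991452845.246", "/tmp/text.pdf")
# is_prefix, dumped_size, original_size, first_diff = result
```

If is_prefix comes back true and dumped_size is smaller than original_size, the dump is a clean truncation rather than a corrupted copy, which is what the hexdump comparison suggested here.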
=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================