[Zope] DocumentLibrary FileConverter / pdf.py problem (write to /tmp)

sean.upton@uniontrib.com
Sat, 02 Jun 2001 16:57:09 -0700


I wanted to say that, so far, DocumentLibrary seems to be a wonderful product:
very useful, well organized, and apparently easy to customize. I'm impressed.
I am, however, having a bit of a problem...

I installed DocumentLibrary on Zope 2.3.2 the other day.  I have a problem
with the PDF conversion, though I have verified that pdftotext works properly
on all my files.  I commented out the code in pdf.py that deletes the
temporary file in /tmp so I could diff the output against the original PDF,
but I didn't need to: the file sizes were radically different:

www:/tmp# ls -l *pdf*
-rw-r--r--    1 www-data www-data   123512 Jun  1 20:34 pdf_tmp_991452845.246
-rw-r--r--    1 root     root       320120 Jun  2 16:26 text.pdf

...where text.pdf is a copy of the original.  Hexdump output shows the two
files to be identical up to the point where the copy that Zope dumps onto
the fs is abruptly truncated.  I see the same thing with several PDFs
generated from different sources.
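
For anyone who wants to reproduce the comparison without eyeballing hexdump
output, here is a minimal sketch of the same check in Python.  The function
names are made up for illustration and nothing here comes from pdf.py; it
only reports how many leading bytes two files share and whether the shorter
one is a pure prefix of the longer one.

```python
import os


def common_prefix_length(path_a, path_b, chunk_size=65536):
    """Return the number of leading bytes that the two files share."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if a[:n] != b[:n]:
                # Locate the first differing byte inside this chunk.
                for i in range(n):
                    if a[i] != b[i]:
                        return offset + i
            offset += n
            if n < chunk_size:
                # At least one file hit end-of-file.
                return offset


def is_truncated_copy(short_path, full_path):
    """True if short_path is a strict prefix of full_path."""
    short_size = os.path.getsize(short_path)
    full_size = os.path.getsize(full_path)
    if short_size >= full_size:
        return False
    return common_prefix_length(short_path, full_path) == short_size
```

With the files above, the call would be
is_truncated_copy("/tmp/pdf_tmp_991452845.246", "/tmp/text.pdf").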

My config is the binary package install of Zope on Debian (the 2.3.2-2 deb
under Debian unstable/Intel), with nothing fancy in the setup.  The
wvWare/Word converter works fine.  I tested several different packages and
source installs of xpdf, and all worked fine (this was before I realized
that this wasn't a pdftotext issue).

www-data (the user Zope runs as on Debian) has write privileges to /tmp, and
the file does get partly written there.  What might prevent the pdf.py code
from fully writing the file from the ODB to /tmp?
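
One possibility worth ruling out, and this is only an assumption about how
the data is stored, not something confirmed in pdf.py: a large file may be
held in the ODB as a chain of chunks rather than as one string, in which
case writing only the first piece, or reading the temp file before it is
closed, would leave a short file.  The sketch below shows the shape of a
writer that walks the whole chain and closes the file before returning.
Chunk and write_payload are hypothetical names used only for illustration.

```python
import os
import tempfile


class Chunk:
    """Hypothetical stand-in for one link in a chunked payload."""

    def __init__(self, data, next=None):
        self.data = data   # bytes held by this link
        self.next = next   # following link, or None at the end


def write_payload(payload, directory=None):
    """Write bytes or a Chunk chain to a temp file and return its path."""
    fd, path = tempfile.mkstemp(prefix="pdf_tmp_", dir=directory)
    with os.fdopen(fd, "wb") as out:
        if isinstance(payload, bytes):
            out.write(payload)
        else:
            # Walk every link, not just the first one.
            node = payload
            while node is not None:
                out.write(node.data)
                node = node.next
    # The file is flushed and closed here, before any external
    # converter would be asked to read it.
    return path
```

Comparing os.path.getsize() on the returned path with the size Zope reports
for the stored object would show quickly whether anything was lost on the
way out.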

Any thoughts might be appreciated...

Sean

P.S.  I am interested in possibly contributing effort to write or integrate
other sorts of document/file converters for DocumentLibrary.  Specifically,
I might be interested in adding the ability to treat raster images (JPG,
etc.) as documents, using the binary IPTC caption data (the caption you can
add to an image in Photoshop) as the full text...

=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================