RE: [Zope] Serving and Searching PDF's and Word Docs
I have a client who wants to distribute learning materials via the web and is looking to move their PDFs and Word docs to a database. Has anyone used Zope for this?

Not exactly this, but I store documents (Word, Excel, Project) in a revision control system. I have written a Python function that checks out and returns documents. But this only works with Merant's PVCS VM, and searching within these documents is not needed...
Would you use a third party database?

Relational databases have some problems with these data types. They are usually stored as binary large objects (BLOBs), and that limits what SQL can do with them: most of the usual SQL features (projection and the like) cannot be applied to the contents of a BLOB column. Maybe you should look for a specialized database system. The problem will then be to get an appropriate database adapter for Zope...
What about searching the documents?

As mentioned above, a normal RDBMS will probably have some restrictions. I cannot imagine how to search a document within an RDBMS for a particular word or phrase.
Thanks in advance
Richard
HTH
Andreas
What about searching the documents? As mentioned above, a normal RDBMS will probably have some restrictions. I cannot imagine how to search a document within an RDBMS for a particular word or phrase.
Well, if the data is stored in a text or BLOB field then you can do a SELECT * FROM TABLE1 WHERE X LIKE '%searchterm%'. There ARE problems with BLOBs in most RDBMSs, but you could:

1. Store your raw data (the document) as a Binary Large Object (BLOB).
2. Store a set of separate records which are the search terms on your objects:

tblDocs
ID  DOC
1   <FILE INSERTED>
2   <FILE INSERTED>

tblTerms
TERM       ID
marketing  1
sales      1
germany    1
software   2
microsoft  2
halloween  2

There's a lot of redundancy in this method but it can be done.

- Ian Sparks.
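For anyone who wants to try the two-table idea, here is a rough sketch in Python. It assumes SQLite through the standard sqlite3 module purely for illustration (the thread does not name a database), and the helper names build_index and find_docs are made up for this example. How the search terms are extracted from a PDF or Word file is left out; they are simply passed in by hand.

```python
import sqlite3

def build_index(conn, docs):
    """Create tblDocs/tblTerms and load them.

    docs maps a document id to (raw bytes, list of search terms).
    Both helper names and the SQLite backend are assumptions for this sketch.
    """
    conn.execute("CREATE TABLE tblDocs (ID INTEGER PRIMARY KEY, DOC BLOB)")
    conn.execute("CREATE TABLE tblTerms (TERM TEXT, ID INTEGER REFERENCES tblDocs(ID))")
    for doc_id, (raw, terms) in docs.items():
        # The document itself goes in as an opaque BLOB.
        conn.execute("INSERT INTO tblDocs (ID, DOC) VALUES (?, ?)", (doc_id, raw))
        # One row per (term, document) pair; terms are lowercased for matching.
        conn.executemany(
            "INSERT INTO tblTerms (TERM, ID) VALUES (?, ?)",
            [(term.lower(), doc_id) for term in terms],
        )
    conn.commit()

def find_docs(conn, term):
    """Return the ids of documents tagged with the given term."""
    rows = conn.execute(
        "SELECT DISTINCT ID FROM tblTerms WHERE TERM = ? ORDER BY ID",
        (term.lower(),),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
build_index(conn, {
    1: (b"<FILE INSERTED>", ["marketing", "sales", "germany"]),
    2: (b"<FILE INSERTED>", ["software", "microsoft", "halloween"]),
})
print(find_docs(conn, "sales"))
```

The search never touches the BLOB column, so it sidesteps the restrictions on BLOBs mentioned earlier in the thread. The price is the redundancy already noted: the terms table has to be kept in step with the documents whenever one is added or changed.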
participants (2)
- a.wacknitz@francotyp.com
- Ian Sparks