Indexing dtml/html files
Hi,

Is it possible to use the Catalog to index ordinary HTML pages intelligently?

If I have an HTML (DTML) file like:

<HTML>
<p id=title>
Are you Intelligent ?
</p>
<p id=author>
A. Einstein
</p>
</HTML>

I would like to index the title and the author ids, so that I could search on them.

Is it possible?

Regards
-------------------------------------------------
Anders Holmbech Nielsen | Tlf: (+45) 70 22 56 00
Software Engineer       | Fax: (+45) 70 22 57 00
Integrator Uniware A/S  | http:/www.integrator.dk
on Thursday, March 09, 2000 Anders Holmbech Nielsen wrote :

AHN> Hi,
AHN> Is it possible to use the Catalog to index ordinary HTML pages intelligently?
AHN> If I have an HTML (DTML) file like:
AHN> <HTML>
AHN> <p id=title>
AHN> Are you Intelligent ?
AHN> </p>
AHN> <p id=author>
AHN> A. Einstein
AHN> </p>
AHN> </HTML>
AHN> I would like to index the title and the author ids, so that I could search on them.
AHN> Is it possible?

I guess you could do it in XML documents, which are inherently more machine-readable than HTML. Otherwise, storing title and author in properties would do the trick for ZCatalog. Not exactly what you asked for, but as close as I can get with my skills.

I guess you could write a Python Method to extract title and author from the HTML and store them in properties so that they could be easily indexed.

--
Geir B Hansen
web-developer/designer
geirh@funcom.com
http://www.funcom.com
> on Thursday, March 09, 2000 Anders Holmbech Nielsen wrote :
>
> AHN> Hi,
> AHN> Is it possible to use the Catalog to index ordinary HTML pages intelligently?
> AHN> If I have an HTML (DTML) file like:
> AHN> <HTML>
> AHN> <p id=title>
> AHN> Are you Intelligent ?
> AHN> </p>
> AHN> <p id=author>
> AHN> A. Einstein
> AHN> </p>
> AHN> </HTML>
> AHN> I would like to index the title and the author ids, so that I could search on them.
> AHN> Is it possible?
>
> I guess you could do it in XML documents, which are inherently more machine-readable than HTML. Otherwise, storing title and author in properties would do the trick for ZCatalog. Not exactly what you asked for, but as close as I can get with my skills.
>
> I guess you could write a Python Method to extract title and author from the HTML and store them in properties so that they could be easily indexed.

Actually I do have XML documents as well, but this is for an automated system where the upload is done via WebDAV or FTP, so I need a way to upload Zope XML documents from a client. I am not a Python programmer and would rather find a solution that does not require any coding. I have found the URLs for finding and updating the index from the client, but now I need to upload XML documents to Zope AS XML documents. Is there an easy way to do this?

> --
> Geir B Hansen
> web-developer/designer
> geirh@funcom.com
> http://www.funcom.com

Regards
-------------------------------------------------
Anders Holmbech Nielsen | Tlf: (+45) 70 22 56 00
Software Engineer       | Fax: (+45) 70 22 57 00
Integrator Uniware A/S  | http:/www.integrator.dk
Anders Holmbech Nielsen wrote:

> Hi,
>
> Is it possible to use the Catalog to index ordinary HTML pages intelligently?
>
> If I have an HTML (DTML) file like:
>
> <HTML>
> <p id=title>
> Are you Intelligent ?
> </p>
> <p id=author>
> A. Einstein
> </p>
> </HTML>
>
> I would like to index the title and the author ids, so that I could search on them.
>
> Is it possible?

Yes, the Catalog can be used programmatically from DTML and Python. You will have to implement your own parser (look at the standard htmllib in Python) that feeds the parsed attributes into the Catalog.

Does it do it out of the box? No.

-Michel
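
A minimal sketch of the kind of parser suggested above, for illustration only. It assumes Python 3, where htmllib no longer exists, and uses the standard html.parser module in its place. The names IdTextExtractor and extract_fields are hypothetical, and the step that stores the values as properties or feeds them into the Catalog is left out because it depends on the Zope setup.

```python
from html.parser import HTMLParser


class IdTextExtractor(HTMLParser):
    """Collect the text inside elements whose id attribute is wanted.

    Hypothetical helper: not part of Zope or of the thread.
    """

    def __init__(self, wanted=("title", "author")):
        super().__init__()
        self.wanted = set(wanted)
        self.fields = {}       # id -> collected text
        self._current = None   # id of the element currently being read
        self._tag = None       # tag name that opened that element

    def handle_starttag(self, tag, attrs):
        elem_id = dict(attrs).get("id")
        if self._current is None and elem_id in self.wanted:
            self._current = elem_id
            self._tag = tag
            self.fields[elem_id] = ""

    def handle_endtag(self, tag):
        # Stop collecting at the first matching close tag
        # (nested elements with the same tag name are not handled).
        if self._current is not None and tag == self._tag:
            self._current = None
            self._tag = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def extract_fields(html, wanted=("title", "author")):
    """Return {id: text} for the wanted ids, with whitespace normalized."""
    parser = IdTextExtractor(wanted)
    parser.feed(html)
    parser.close()
    return {key: " ".join(value.split()) for key, value in parser.fields.items()}


if __name__ == "__main__":
    page = """<HTML>
    <p id=title>
    Are you Intelligent ?
    </p>
    <p id=author>
    A. Einstein
    </p>
    </HTML>"""
    print(extract_fields(page))
```

The resulting dictionary holds one entry per id, which could then be stored as properties on the object, as suggested earlier in the thread, so that ZCatalog can index them.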
participants (3)
- Anders Holmbech Nielsen
- Geir B Hansen
- Michel Pelletier