Hi, I don't usually do this, but I need your advice on something. It's a filename splitter for a KeywordIndex of File objects. http://www.peterbe.com/plog/filename-splitter By applying this splitter I hope to be able to search for files by parts of the filename. Has anything like this been done before? Am I on the right track? Do you see any pitfalls? -- Peter Bengtsson, work www.fry-it.com home www.peterbe.com hobby www.issuetrackerproduct.com
Peter Bengtsson wrote:
Hi, I don't usually do this, but I need your advice on something. It's a filename splitter for a KeywordIndex of File objects. http://www.peterbe.com/plog/filename-splitter By applying this splitter I hope to be able to search for files by parts of the filename.
Has anything like this been done before? Am I on the right track? Do you see any pitfalls?
I used a similar approach when we had to let people search for products by name, so splitting on camelCase boundaries and digits was done there too. Additionally, I had a stopword list in my case (to catch A4, P3 and so on). For filenames I guess you don't need this.
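A minimal sketch of that kind of splitting, for illustration only. It is not the code behind the linked post, and the name `split_name`, the token regex and the handling of stopwords are all assumptions. In particular, "stopword" is read here as "a token that is kept whole instead of being split", which is one possible reading of the remark about A4 and P3.

```python
import re

# Assumed token shapes: an acronym run (HTTP), a capitalised or
# lowercase word (Photo, photo), or a run of digits (2005).
_TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_name(name, stopwords=frozenset()):
    """Return lowercased keywords for a file or product name.

    Hypothetical helper: separators split first, then camelCase
    boundaries and digit runs. Chunks listed in `stopwords`
    (lowercase) are kept whole.
    """
    keywords = []
    # Anything that is not an ASCII letter or digit acts as a separator.
    for chunk in re.split(r"[^A-Za-z0-9]+", name):
        if not chunk:
            continue
        if chunk.lower() in stopwords:
            parts = [chunk]  # protected token such as "A4"
        else:
            parts = _TOKEN.findall(chunk)
        for part in parts:
            part = part.lower()
            if part not in keywords:  # keep first occurrence only
                keywords.append(part)
    return keywords
```

With `stopwords={"a4"}`, a name like `MyHoliday_Photo2005-A4.jpg` would be indexed under `my`, `holiday`, `photo`, `2005`, `a4` and `jpg`; without the stopword entry, `A4` would fall apart into `a` and `4`.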
Peter Bengtsson wrote at 2005-11-15 11:47 +0000:
Hi, I don't usually do this, but I need your advice on something. It's a filename splitter for a KeywordIndex of File objects. http://www.peterbe.com/plog/filename-splitter By applying this splitter I hope to be able to search for files by parts of the filename.
Are you aware that the "PathIndex" can do this already, especially the "Managable PathIndex" from my "ManagableIndex" product? <http://www.dieter.handshake.de/pyprojects/zope> -- Dieter
On Tuesday, 15.11.2005, at 19:26 +0100, Dieter Maurer wrote:
Are you aware that the "PathIndex" can do this already, especially the "Managable PathIndex" from my "ManagableIndex" product
According to your documentation, it could be done with ManagableIndex, but PathIndex isn't there yet. Peter splits within the name, not just the path.
Tino Wildenhain wrote at 2005-11-15 21:07 +0100:
According to your documentation, it could be done with ManagableIndex, but PathIndex isn't there yet. Peter splits within the name, not just the path.
Are you sure? The "Managable PathIndex" splits a string at '/' (alternatively, it takes a sequence) and indexes the result in a way that supports subpath searches quite efficiently. Thus, it is applicable if "filename splitting" means splitting at '/' (or you apply some arbitrary splitting beforehand) *AND* (more importantly) if one is interested in subpath queries. If the second condition does not hold, a KeywordIndex may do as well (again, a "Managable KeywordIndex" would allow the splitting to be formulated as a normalizer). -- Dieter
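To make the notion of a subpath query concrete, here is a rough sketch in plain Python. It is not the ManagableIndex implementation and does not use any Zope API. The helpers `path_components` and `contains_subpath` are hypothetical, and a real index would precompute a lookup structure rather than scan each path.

```python
def path_components(path):
    """Split a path at '/' and drop empty parts (leading or doubled slashes)."""
    return [part for part in path.split("/") if part]


def contains_subpath(path, query):
    """Return True if the components of `query` occur as a contiguous
    run, in order, among the components of `path`."""
    parts = path_components(path)
    wanted = path_components(query)
    if not wanted:
        return True  # an empty query matches every path
    n = len(wanted)
    # Slide a window of len(wanted) components over the path.
    return any(parts[i:i + n] == wanted for i in range(len(parts) - n + 1))
```

Under this sketch, `reports/2005` matches `/docs/reports/2005/summary.pdf`, while `docs/2005` does not, because the two components are not adjacent. A keyword-style lookup would accept both, which is the distinction drawn above between subpath queries and a plain KeywordIndex.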
participants (3)
- Dieter Maurer
- Peter Bengtsson
- Tino Wildenhain