On Mon, 11 Jun 2001, Chris Withers wrote:
Wow Matt, you seem to know what you're talking about :-)
My final-year university project was to create an Open Source mailing list archive :) I did quite a bit of reading into information retrieval and assorted algorithms and data structures. I had a prototype running for quite some time, but it is currently down as I am wiping the machine to start again in Python :) The original system was a mix of C/Perl/Python and returned results in XML, which were then formatted via XSLT.

Once I get a spare minute I am going to try to re-implement it in Python using ZODB (with BerkeleyDB storage). I might try to port some of the code over to work as a PluggableIndex too.

One of the main tasks is to write a Python wrapper around my compression code. I will have to look more closely at how to write Python modules in C, as the code does lots of bit twiddling in a very tight loop. The object will basically be a compressed list to which you can append ascending integers, and it will allow various fast union/intersection operations with other similar objects. This in itself may be sufficient to use in a PluggableIndex.
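To make the idea concrete, here is a rough pure-Python sketch of such an object. It is not the actual compression code, which is not shown here: the class name CompressedIntList is made up for illustration, and the encoding (gaps between consecutive values stored as variable-length bytes) is only an assumption about one common way to compress ascending integer lists. A real version would presumably do the bit twiddling in C.

```python
class CompressedIntList:
    """Append-only list of strictly ascending, non-negative integers.

    Hypothetical illustration: values are stored as gaps from the previous
    value, each gap variable-byte encoded (7 data bits per byte, high bit
    set when more bytes follow).
    """

    def __init__(self, values=()):
        self._data = bytearray()
        self._last = -1   # sentinel, so the first gap is value + 1
        self._count = 0
        for value in values:
            self.append(value)

    def append(self, value):
        if value <= self._last:
            raise ValueError("values must be strictly ascending and >= 0")
        gap = value - self._last          # always >= 1
        while gap >= 0x80:
            self._data.append((gap & 0x7F) | 0x80)   # continuation byte
            gap >>= 7
        self._data.append(gap)            # final byte, high bit clear
        self._last = value
        self._count += 1

    def __len__(self):
        return self._count

    def __iter__(self):
        value, gap, shift = -1, 0, 0
        for byte in self._data:
            gap |= (byte & 0x7F) << shift
            if byte & 0x80:
                shift += 7                # more bytes belong to this gap
            else:
                value += gap
                yield value
                gap, shift = 0, 0

    def intersection(self, other):
        """Linear merge keeping only values present in both lists."""
        result = CompressedIntList()
        a, b = iter(self), iter(other)
        x, y = next(a, None), next(b, None)
        while x is not None and y is not None:
            if x == y:
                result.append(x)
                x, y = next(a, None), next(b, None)
            elif x < y:
                x = next(a, None)
            else:
                y = next(b, None)
        return result

    def union(self, other):
        """Linear merge keeping every value from either list once."""
        result = CompressedIntList()
        a, b = iter(self), iter(other)
        x, y = next(a, None), next(b, None)
        while x is not None or y is not None:
            if y is None or (x is not None and x < y):
                result.append(x)
                x = next(a, None)
            elif x is None or y < x:
                result.append(y)
                y = next(b, None)
            else:                          # x == y
                result.append(x)
                x, y = next(a, None), next(b, None)
        return result


# Example: document ids matching two different words
first = CompressedIntList([1, 5, 300, 70000])
second = CompressedIntList([5, 6, 70000, 70001])
both = list(first.intersection(second))
either = list(first.union(second))
```

Because both lists are already sorted, union and intersection are a single pass over each list, and the results come out in ascending order, so they can be appended straight into a new compressed list.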
If you get a chance to implement the index I asked about, please gimme a shout, I'd love to try it out...
Unfortunately I don't have the time. Unless I can use it myself directly in a project we have funding for (or unless anyone wants to fund my time to develop it) I will have to wait until I have some more time on my hands.
PS: Whereabouts in the UK are you?
Bristol.

-Matt

--
Matt Hamilton                           matth@netsight.co.uk
Netsight Internet Solutions, Ltd.       Business Vision on the Internet
http://www.netsight.co.uk               +44 (0)117 9090901
Web Hosting | Web Design | Domain Names | Co-location | DB Integration