Michael Bernstein wrote:
John Eikenberry wrote:
I was looking into the same issues recently, but for a much smaller set of data (50000ish). In my tests ZPatterns/binary-trees scaled well for storage and retrieval, but ZCatalog did not. It was basically useless for partial-matching searches (taking many minutes for searches that retrieved more than 100 matches).
Was this true even for cases where the batch size was smaller than 100? For example, if a search returns over 100 results but the batch size is only 20 (so that only 20 results at a time are displayed), do you still get the performance hit?
Short answer: yes.

Long answer: If you check out the source and/or hit it with the profiler, you'll see that the partial search works by first doing a more general search and then limiting the hits as much as possible via regexes. Both of these steps have to happen no matter the batch size, and this is where you take the performance hit.
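To illustrate why the batch size doesn't help, here is a minimal sketch of a two-phase partial search. It is not ZCatalog's actual code: the `RECORDS` dictionary, the `partial_search` function, and the `*` wildcard handling are all made up for illustration. It only shows the general shape described above, where a broad first pass and a regex second pass both run over the full data before any batching happens.

```python
import re

# Hypothetical in-memory "catalog": record id -> indexed text.
# (An assumption for illustration, not ZCatalog's data structure.)
RECORDS = {}
for i in range(1000):
    kind = "widget" if i % 2 else "gadget"
    RECORDS[i] = f"item-{i:05d} {kind}"


def partial_search(pattern, batch_start=0, batch_size=20):
    """Two-phase partial match, then batching.

    Returns (batch, work), where work counts the records examined.
    """
    work = 0

    # Phase 1: a cheap but overly general search, using only the
    # literal text before the first wildcard.
    literal = pattern.split("*", 1)[0]
    candidates = []
    for rid, text in RECORDS.items():
        work += 1
        if literal in text:
            candidates.append(rid)

    # Phase 2: narrow the candidates with a regex built from the pattern.
    regex = re.compile(".*".join(re.escape(p) for p in pattern.split("*")))
    hits = []
    for rid in candidates:
        work += 1
        if regex.search(RECORDS[rid]):
            hits.append(rid)

    # Batching only slices the finished hit list; it saves no search work.
    return hits[batch_start:batch_start + batch_size], work


small_batch, small_work = partial_search("item-*widget", batch_size=20)
large_batch, large_work = partial_search("item-*widget", batch_size=100)
print(len(small_batch), len(large_batch), small_work == large_work)
```

In this sketch the `work` counter comes out the same for both calls, because the slice at the end is the only place the batch size is used.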
[snip] I ended up deciding to go with a RDBMS backend for data storage with a ZPatterns interface. SkinScripts work so well for this that I'm actually glad I switched. It simplified my design and implementation immensely.
So you're saying that you are doing all searching using SQL statements, and not just object retrieval and storage, correct? How are you handling full text searches?
Yes. I'll use MySQL's built-in pattern matching facilities. It can do full text searches with partial matching, and it can do this fast. I'm working on a system that will return the DataSkins in response to the query, allowing me to deal with just the objects yet use all of MySQL's facilities. I've just started to work on this as part of a larger project, but I'm doing it full time and should have something fairly soon. My company is very free software friendly, so I'll be able to share it once it's ready, if you happen to be interested.

--
John Eikenberry [jae@zhar.net - http://zhar.net]
______________________________________________________________
"A society that will trade a little liberty for a little order
will deserve neither and lose both." --B. Franklin