[Grok-dev] Re: Sorting catalog results
Kevin Smith
kevin at mcweekly.com
Wed Jun 27 12:04:39 EDT 2007
Hi Luciano,
We are rebuilding a newspaper site built on Plone that is having scaling
issues with the full-text index right now. We're too small to manage the
problem by always throwing hardware at it.
As a scaling strategy for the new site, we are using divide and conquer.
Since it makes sense for us to sort from newest to oldest, we are using
one catalog for each year. The archives go back to 1998, and we publish
20 articles of 1-3k words each, plus images, per week on a commodity
machine. We have only started experiencing scaling issues in the last
year.
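To make the per-year idea concrete, here is a rough sketch, not our actual code. It assumes each year's catalog can be wrapped as a callable that takes a query and returns hits carrying a "date" key; the names search_newest_first and catalogs_by_year are made up for illustration. Only one year's hits are sorted at a time, and older catalogs are never queried once enough hits have been collected.

```python
from itertools import islice


def search_newest_first(catalogs_by_year, query, limit):
    """Return up to `limit` hits, walking year catalogs from newest to oldest.

    `catalogs_by_year` is a hypothetical mapping of year -> callable(query)
    that returns a list of dict-like hits with a "date" key.
    """
    def hits():
        for year in sorted(catalogs_by_year, reverse=True):
            # Query and sort a single year's catalog at a time.
            year_hits = catalogs_by_year[year](query)
            for hit in sorted(year_hits, key=lambda h: h["date"], reverse=True):
                yield hit

    # islice stops pulling from the generator once `limit` hits are taken,
    # so older catalogs are left untouched when newer years fill the page.
    return list(islice(hits(), limit))
```

The generator is what keeps the cost proportional to the years actually needed: a request for the first page of results normally touches only the most recent catalog.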
So Martijn's getting-things-done-strategy will most likely work unless
you know you're going to have a very large database.
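For reference, the getting-things-done approach amounts to something like the sketch below. It assumes the search results are an iterable of objects with the attribute you want to sort on; sorted_batch and its parameters are illustrative names, not part of Zope 3 or Grok.

```python
from operator import attrgetter


def sorted_batch(results, sort_attr, start=0, size=20, reverse=False):
    """Sort the whole result set in Python, then slice out one batch.

    `results` is assumed to be any iterable of objects exposing `sort_attr`.
    """
    # The full result set is loaded and sorted on every call (O(n log n)),
    # which is fine for modest result sets and the reason it stops scaling.
    ordered = sorted(results, key=attrgetter(sort_attr), reverse=reverse)
    return ordered[start:start + size]
```

Every request pays for sorting everything that matched, even when only one page is displayed, so the cost grows with the size of the result set rather than the size of the page.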
Assuming it's necessary, if you are going to need many different kinds
of sorts you may need to come up with more ingenious ways of dividing
the content.
Other things we are doing to help mitigate scaling issues:
* pgstorage to avoid the excessive memory used by FileStorage and to
flatten startup time (directorystorage also has similar benefits)
* OpenVZ virtual servers to throttle various usages of the site
(separate text-index searching from browsing from search engine crawls)
HTH,
Kevin Smith
Martijn Faassen wrote:
> Luciano Ramalho wrote:
>> Trying to figure out how to sort results from a catalog search, I just
>> read (most of) a very long thread on Zope3-dev earlier this year and
>> got worried about how it ended.
>>
>> Does it mean that I have to educate my users ("You don't really want
>> sorted results, on account of scalability problems"), or is there some
>> recommended way to sort the results of a catalog search in Zope3 or
>> Grok?
>
> No, please don't educate your users. The debate went back and forth. I
> think we all agree that the sorting story could be scaled better, but
> Jim kept pointing out there is no fundamental way to speed it up, and
> I kept pointing out that besides the fundamentals there may be many
> things we can do to make this scale better nonetheless. We got stuck
> in a loop there. :)
>
> There are two strategies here. One is the short-term
> getting-things-done-strategy. For that, I'd recommend using Python's
> (or zc.table's, if you're using that for tabular display) sorting
> functionality. That sorts the whole result set. It may scale well
> enough for your application.
>
> Now on to the other strategy. Ignas has done some work on the
> SchoolTool project concerning scalable sorting and batching that may
> be relevant here and reported to me that he managed to speed things
> quite a lot. I don't know the details, but here are pointers to the
> code he gave me a while ago:
>
> http://source.schooltool.org/trac/browser/trunk/schooltool/src/schooltool/skin/table.py
>
> - FilterWidget and TableFormatter classes
>
> http://source.schooltool.org/trac/browser/trunk/schooltool/src/schooltool/skin/table.py
>
> - TableContainerView class
>
> http://source.schooltool.org/trac/browser/trunk/schooltool/src/schooltool/skin/templates/table_container.pt
>
>
> But please talk to Ignas (on irc, or ignas.mikalajunas at gmail.com) for
> more information.
>
> If this code is interesting, we have a problem, as Schooltool code is
> GPL. Generalizing it and putting it in Zope's svn is thus blocked. We
> could go two routes:
>
> * contact Mark Shuttleworth as the Zope Foundation and ask whether we
> can get this code as ZPL in the Zope repository. I can start this
> process if needed - let me know.
>
> * talk to Ignas to get the general idea, study the code, and
> reimplement the concepts as a Zope 3 package without copying the code.
>
> Regards,
>
> Martijn