[Zope3-Users] getting random results out of a catalogs field index
Dominique Lederer
dominique.lederer at inode.at
Sat May 5 13:27:10 EDT 2007
Christian Theune wrote:
> On Saturday, 05.05.2007, at 17:42 +0200, Dominique Lederer wrote:
>> hi
>>
>> i would like to retrieve a number of *random* entries out of a catalog's field index.
>>
>> i tried it with first getting the catalogindex-length and then accessing a
>> randomized list-index, but this is very slow, because of the large number of
>> entries in the index.
>>
>> do you know any better solution?
>
> I'm kind of guessing here.
>
> You say you are:
>
> - querying the catalog
> - accessing a random index from the result set
> - noticing that this is slow
>
> Does this only happen if the index is very large, e.g. you're retrieving
> an element from the end of the result set?
>
> I don't know exactly how the result sets are organized, but this
> behaviour would imply that loading a later element triggers something
> like loading the earlier elements too. I can't really imagine that.
>
> I think the general problem that this is slow lies in the fact that
> randomly selecting elements means
>
> a) you need access to the full list of things
> b) applying a sort
>
> Sorting has a complexity of at least O(n log n) which becomes slow
> enough for large sets that it's noticeable.
>
> BTW: How large is large?
>
> Christian
>
hi, thanks for the reply, i just managed to improve the performance of my query
significantly:
what i wanted to do was:
- retrieve the len() of the catalog index
- retrieve a list() of the result set
- access n random results and their objects
to retrieve a random object i did:
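import random
# full-range query: matches every value in the index
query = catalog.apply({'myIndex': (None, None)})
length = len(query)
index_intids = list(query)
# pick one random intid; randint is inclusive on both ends
intid = index_intids[random.randint(0, length - 1)]
obj = getObject(intid)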
which was slow with 10000 items in the index (i had to wait 2-3 seconds for a
view to render)
after looking into the field index implementation i changed the above lines to:
length = len(catalog['myIndex']._rev_index)
index_intids = list(catalog['myIndex']._rev_index.keys())
which now works like a charm.
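
to pick several distinct entries at once, here is a minimal sketch using random.sample from the standard library. the dict `rev_index` is only a stand-in for catalog['myIndex']._rev_index (an assumption, so that the snippet runs without Zope), and `pick_random_intids` is a hypothetical helper name, not part of any catalog API.

```python
import random

def pick_random_intids(rev_index, n, rng=random):
    """Return up to n distinct keys chosen at random from a mapping."""
    intids = list(rev_index.keys())
    # random.sample raises ValueError if n exceeds the population size,
    # so clamp n to the number of available ids
    return rng.sample(intids, min(n, len(intids)))

# stand-in for catalog['myIndex']._rev_index: intid -> indexed value
rev_index = {intid: "value-%d" % intid for intid in range(10000)}
chosen = pick_random_intids(rev_index, 5)
```

each id in `chosen` could then be passed to getObject() as before. no sorting is involved; the cost is dominated by building the list of ids.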
i am not an expert with BTrees so i can't really say what the problem is/was.
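
if building the full list of ids ever becomes the expensive part, one alternative (not what i did above) is reservoir sampling, which iterates over the keys once and keeps only n of them in memory. this is a sketch under assumptions: `reservoir_sample` is a hypothetical helper, and a plain range stands in for the index keys. it still has to visit every key, so it saves memory rather than iteration time.

```python
import random

def reservoir_sample(iterable, n, rng=random):
    """Choose up to n items uniformly at random in a single pass."""
    reservoir = []
    for seen, item in enumerate(iterable):
        if seen < n:
            # fill the reservoir with the first n items
            reservoir.append(item)
        else:
            # keep the new item with probability n / (seen + 1)
            j = rng.randint(0, seen)  # inclusive on both ends
            if j < n:
                reservoir[j] = item
    return reservoir

# stand-in for iterating over the keys of the index
sample = reservoir_sample(range(10000), 5)
```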
Dominique