[ZODB-Dev] Indexing and dates/times
Pedro Ferreira
jose.pedro.ferreira at cern.ch
Tue Jul 13 04:35:07 EDT 2010
Hello,
>> I am currently trying to devise a way to index and retrieve some
>> millions of objects according to their modification date/time. One of
>> the problems I'm facing is that of index "granularity": I'd like to
>> provide "to the second" granularity,
>>
> will there ever be more than one item with the same key?
>
Exactly, that's the problem.
>> but for that I need some structure
>> that lets me do that. So, the options I see are:
>> - A timestamp-based
>>
> What do you mean by "timestamp"?
>
Well, it could be a UNIX timestamp.
>> BTree index - looks highly inefficient, as there
>> will be many entries with only one element (probably almost all of
>> them),
>>
> I have no idea what you mean by this.
>
That's the problem you've already mentioned above.
So, in a relational DB I would do something like:
SELECT * FROM table WHERE timestamp >= X AND timestamp <= Y
Since I cannot do this with ZODB, I'd have to have a BTree, indexed by
timestamp... however, as you said, if I want "to the second"
granularity, I will rarely have two items with the same key (which makes
it pretty useless).
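For reference, here is a minimal sketch of the lookup that SQL query describes, done over an index keyed by whole-second timestamps where each key holds a set of items. It uses only the Python standard library (a sorted key list plus a dict) as a stand-in for a persistent BTree. The class name `TimestampIndex` and its methods `add` and `between` are hypothetical names chosen for illustration, not part of ZODB.

```python
import bisect


class TimestampIndex:
    """Hypothetical in-memory stand-in for a timestamp-keyed index.

    Keys are integer UNIX timestamps (one-second granularity);
    each key maps to the set of items modified in that second.
    """

    def __init__(self):
        self._keys = []      # sorted list of distinct timestamps
        self._buckets = {}   # timestamp -> set of items

    def add(self, ts, item):
        ts = int(ts)  # truncate to whole seconds
        bucket = self._buckets.get(ts)
        if bucket is None:
            # First item for this second: register the key in sorted order.
            bisect.insort(self._keys, ts)
            bucket = self._buckets[ts] = set()
        bucket.add(item)

    def between(self, start, end):
        """Yield every item with start <= timestamp <= end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        for ts in self._keys[lo:hi]:
            yield from self._buckets[ts]


idx = TimestampIndex()
idx.add(100, "a")
idx.add(100, "b")   # same second as "a": shares a bucket
idx.add(105, "c")
idx.add(200, "d")
in_range = set(idx.between(100, 105))
```

Most buckets here would hold a single item, which is the overhead discussed above. The range scan itself only touches the keys that fall between the two bounds.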
So, I was wondering if there is some data structure I can use for this,
as this seems to be a pretty common use case.
The first thing that comes to my mind is a tree with different levels -
i.e. year, month, day, hour, minute... with the leaves being sets of items.
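A rough sketch of that layered idea, assuming plain nested dicts as a stand-in for persistent containers. The function names `add`, `items_under` and `lookup` are made up for illustration. Each level is keyed by one date/time component and the minute level holds a set of items.

```python
from datetime import datetime


def add(tree, when, item):
    """Insert item under tree[year][month][day][hour][minute]."""
    node = tree
    for part in (when.year, when.month, when.day, when.hour):
        node = node.setdefault(part, {})
    node.setdefault(when.minute, set()).add(item)


def items_under(node):
    """Yield all items below a node, walking child keys in sorted order."""
    if isinstance(node, set):
        yield from node
    else:
        for key in sorted(node):
            yield from items_under(node[key])


def lookup(tree, *prefix):
    """Return items matching a (year[, month[, day[, hour[, minute]]]]) prefix."""
    node = tree
    for part in prefix:
        node = node.get(part)
        if node is None:
            return []
    return list(items_under(node))


tree = {}
add(tree, datetime(2010, 7, 13, 4, 35, 7), "a")
add(tree, datetime(2010, 7, 13, 4, 35, 50), "b")
add(tree, datetime(2010, 7, 13, 5, 0, 0), "c")
add(tree, datetime(2011, 1, 1, 0, 0, 0), "d")
same_minute = set(lookup(tree, 2010, 7, 13, 4, 35))
```

This stops at minute granularity, so two items from the same minute share a leaf set. Getting "to the second" would take one more level, and an arbitrary X-to-Y range would still have to be split into per-level prefixes.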
Thanks!
Pedro
--
José Pedro Ferreira
Indico Team
IT-UDS-AVC
513-R-0042
CERN, Geneva, Switzerland