[ZODB-Dev] [OT] NoSQL
Shane Hathaway
shane at hathawaymix.org
Fri Nov 13 15:33:10 EST 2009
Stephan Richter wrote:
> On Friday 13 November 2009, Roché Compaan wrote:
>> We had such an opportunity about 2 years ago and although the client
>> never reached (and probably never will reach) the membership they
>> dreamed about, they did pay us to develop a storage for members that
>> could scale to more than 100 million members. We implemented a data
>> partitioning strategy at application level. If I had another shot at it,
>> I would try and develop a distributed ZODB storage, because it would be
>> a lot simpler compared to what we had to do at application level.
>
> Note that Shane developed a sharding solution a year ago with me. It provides
> container-level partitioning.
>
> http://svn.zope.org/z3c.sharding/trunk
Thanks for the reminder. :-)
> This, in combination with the encryption work that we did for the ZODB, makes
> the ZODB actually a lot more advanced than some of the newcomers.
>
> I am very intrigued now to set up an EC2 cluster and install a z3c.sharding
> based solution demonstrating 100M users with some data. Mmmh...
I've been studying how to build an enormous database based on what I
know. There are an incredible number of distributed databases these
days, but all of them concern me in one way or another. I'm wondering
if ZODB might actually have a fighting chance in the distributed
database realm. With z3c.sharding or something like it, I think I would
set things up as follows:
- In-memory ZODB caches would probably be pointlessly painful at that
scale, so I would set the ZODB cache size for all partitions to 0. A
cache size of 0 allows ZODB to cache for the duration of a request, but
flushes all objects out of the cache at transaction boundaries.
- With the cache size set to 0, we can disable cache invalidation, which
will probably be a major win.
- I would rely heavily on memcached to provide the pickles. I would try
to use the cache checkpointing algorithm I recently added to RelStorage.
- I would aim to read or write only a small number of objects per
request from partitions.
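
To make the setup above concrete, here is a rough sketch in plain Python. It is not ZODB, z3c.sharding, or RelStorage code: PartitionedStore and Request are hypothetical names, plain dicts stand in for both memcached and the partition storages, and the hash-based partition choice is an assumption of mine (z3c.sharding partitions at the container level). It only shows the shape of the idea: pickles come from a shared cache, unpickled objects live for one request, and everything is dropped at the transaction boundary. It does not implement the checkpointing algorithm from RelStorage.

```python
import pickle
import zlib


class PartitionedStore:
    """Hypothetical stand-in for sharded storages plus a shared pickle cache."""

    def __init__(self, num_partitions):
        # Each dict stands in for one partition's backing storage.
        self.partitions = [{} for _ in range(num_partitions)]
        # This dict stands in for memcached (an assumption for illustration).
        self.pickle_cache = {}
        # Counts how often we had to go to a partition instead of the cache.
        self.backend_reads = 0

    def _partition_for(self, key):
        # A stable hash, so a key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def store(self, key, obj):
        data = pickle.dumps(obj)
        self.partitions[self._partition_for(key)][key] = data
        # Naive write-through; a real setup needs a proper consistency scheme.
        self.pickle_cache[key] = data

    def load_pickle(self, key):
        data = self.pickle_cache.get(key)
        if data is None:
            # Cache miss: read from the owning partition and repopulate.
            self.backend_reads += 1
            data = self.partitions[self._partition_for(key)][key]
            self.pickle_cache[key] = data
        return data


class Request:
    """Holds unpickled objects for one request only, like a cache size of 0."""

    def __init__(self, store):
        self.store = store
        self.objects = {}

    def get(self, key):
        # Within a request, each object is unpickled at most once.
        if key not in self.objects:
            self.objects[key] = pickle.loads(self.store.load_pickle(key))
        return self.objects[key]

    def finish(self):
        # Transaction boundary: flush every object, keep nothing in memory.
        self.objects.clear()
```

Because nothing unpickled survives finish(), there is no long-lived in-memory state in the application process to invalidate, which is why I expect disabling invalidation to be possible in this arrangement. The cost is an unpickle per object per request, so it only pays off if each request touches few objects.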
Shane