[ZODB-Dev] zeo.memcache

Wed Oct 12 17:53:19 UTC 2011

On 10/09/2011 08:26 AM, Jim Fulton wrote:
> On Sat, Oct 8, 2011 at 4:34 PM, Shane Hathaway<shane at hathawaymix.org>  wrote:
>> On 10/05/2011 11:40 AM, Pedro Ferreira wrote:
>>> Hello all,
>>>
>>> While doing some googling on ZEO + memcache I came across this:
>>>
>>> https://github.com/eleddy/zeo.memcache
>>>
>>> Has anybody ever tried it?
>>
>> Having implemented memcache integration for RelStorage, I now know what
>> it takes to make a decent connection between memcache and ZODB.  The
>> code at the link above does not look sufficient to me.
>>
>> I could adapt the cache code in RelStorage for ZEO.  I don't think it
>> would be very difficult.  How many people would be interested in such a
>> thing?
>
> This would be of broad interest!
>
> Can you briefly describe the strategy?  How do you arrange that
> the client sees a consistent view of the current tid for a given
> oid?

(Sorry for not replying sooner--I've been busy.)

As I see it, a cache of this type can take 2 basic approaches: it can 
either store {oid: (state, tid)}, or it can store {(oid, tid): (state, 
last_tid)}. The former approach is much simpler, but since memcache has 
no transaction guarantees whatsoever, it would lead to consistency 
errors. The latter approach makes it possible to avoid all consistency 
errors even with memcache, but it requires interesting algorithms to 
make efficient use of the cache. I chose the latter.

Given the choice to structure the cache as {(oid, tid): (state, 
last_tid)}, a simple way to use the cache would be to get the last 
committed tid from the database and use that tid for the lookup key. 
This would be extremely efficient until the next commit, at which point 
the entire cache would become irrelevant and would have to be rebuilt.
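
To make the drawback concrete, here is a minimal sketch of that simple approach. It is not RelStorage code: a plain dict stands in for memcache, and naive_load and load_from_db are hypothetical names chosen for illustration.

```python
def naive_load(cache, oid, last_committed_tid, load_from_db):
    """Look up an object under the key (oid, last_committed_tid).

    cache is a plain dict standing in for memcache (an assumption).
    load_from_db(oid) is a hypothetical callable returning (state, last_tid).
    """
    key = (oid, last_committed_tid)
    if key in cache:
        return cache[key]
    # Miss: read the object from the database and remember it under this tid.
    value = load_from_db(oid)
    cache[key] = value
    return value
```

Every commit changes last_committed_tid, so every key changes with it, and even objects that did not change have to be read from the database again.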

Therefore, most of the interesting parts of the cache code in RelStorage 
are focused on simply choosing a good tid for the cache lookup operation.

RelStorage caches the following things in memcache:

1. A pair of checkpoints.
2. A state and last committed transaction ID for a given transaction ID 
and object ID.
3. A commit counter.

The checkpoints are two arbitrary committed transaction IDs.  Clients 
can use any pair of committed transaction IDs as checkpoints (so it's OK 
if the checkpoints disappear from the cache), but the cache is much more 
efficient if all clients use the same checkpoints.

Each storage object holds a pair of "delta" mappings, where each delta 
contains {oid: tid}. The deltas contain information about what objects 
have changed since the checkpoints: delta0 lists the changes since 
checkpoint0 and delta1 lists the changes between checkpoint1 and 
checkpoint0. Within each transaction, the delta0 mapping must be updated 
before reading from the database.
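
As a rough illustration of keeping delta0 current, the sketch below folds a batch of newly committed changes into the mapping. The function name, the shape of the changes argument, and the rule of ignoring anything at or before checkpoint0 are assumptions made for the example, not details taken from RelStorage.

```python
def update_delta0(delta0, changes, checkpoint0):
    """Record the newest tid for each oid changed after checkpoint0.

    delta0 is a dict {oid: tid}; changes is an iterable of (oid, tid)
    pairs found by polling the database (hypothetical input format).
    """
    for oid, tid in changes:
        # Only changes newer than checkpoint0 belong in delta0, and only
        # the most recent tid per object is kept.
        if tid > checkpoint0 and tid > delta0.get(oid, 0):
            delta0[oid] = tid
    return delta0
```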

When retrieving an object, the cache tries to discover the object's 
current tid by looking first in delta0.  If it's there, then the cache 
asks memcache for the object state at that exact tid.  If not, the object 
has not changed since checkpoint0, so the cache asks memcache for the 
object state and tid stored under the current checkpoints.
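
The lookup order can be sketched roughly as follows. This is an illustration under stated assumptions, not the RelStorage implementation: a dict keyed by (oid, tid) stands in for memcache, the name lookup is hypothetical, and consulting delta1 before falling back to checkpoint1 is my reading of what delta1 is for.

```python
def lookup(oid, cache, checkpoints, delta0, delta1):
    """Return (state, last_tid) from the cache, or None on a miss."""
    cp0, cp1 = checkpoints
    if oid in delta0:
        # Changed since checkpoint0, so its current tid is known exactly.
        return cache.get((oid, delta0[oid]))
    # Unchanged since checkpoint0: an entry filed under checkpoint0 is current.
    hit = cache.get((oid, cp0))
    if hit is not None:
        return hit
    if oid in delta1:
        # Changed between checkpoint1 and checkpoint0 (assumed use of delta1),
        # so an entry filed under checkpoint1 would be stale.
        return cache.get((oid, delta1[oid]))
    # Unchanged since checkpoint1 as well.
    return cache.get((oid, cp1))
```

On a miss the caller would read from the database and store the result under whichever key it expected to find.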

It is not actually necessary to have 2 checkpoints.  It could work with 
more checkpoints or only 1 checkpoint, but if there were only 1, each 
checkpoint shift would be equivalent to flushing the cache.  With more 
checkpoints, the cache would often query many keys for each read 
operation.  Two checkpoints seem like a good balance.

I wrote more notes about the caching strategy here:

http://svn.zope.org/relstorage/trunk/notes/caching.txt

As I review all of this, I wonder at the moment why I chose to create 
delta1.  It seems like the system would work without it.  I probably 
added it because I thought it would improve cache efficiency, but today 
I'd rather simplify as much as possible even at the cost of a little 
theoretical efficiency.

The commit counter is only loosely related, but since I brought it up, I'll 
explain it briefly: it serves as a way for clients to discover whether 
the database has changed without actually reading anything from the 
database.  It is a counter rather than a transaction ID because that 
choice avoids a race condition.
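
A minimal sketch of how a client might use such a counter is below. The FakeCache class, the "commit_count" key, and the choice to poll whenever the counter is missing from the cache are all assumptions for illustration, not the RelStorage code.

```python
class FakeCache(dict):
    """Dict-based stand-in for memcache with an incr operation (hypothetical)."""

    def incr(self, key):
        self[key] = self.get(key, 0) + 1
        return self[key]


class Client:
    def __init__(self, cache):
        self.cache = cache
        self.seen = None  # counter value observed at the last check

    def needs_poll(self):
        """Return True if the database may have changed since the last check."""
        current = self.cache.get("commit_count")
        # A missing counter (for example, evicted) tells us nothing, so poll.
        changed = current is None or current != self.seen
        self.seen = current
        return changed
```

A committing client would call cache.incr("commit_count") after each commit, and a reading client would call needs_poll() at the start of each transaction.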

Shane