[ZODB-Dev] Cache warm up time

Sat Mar 9 10:50:29 UTC 2013

On Friday 08 March 2013 18:50:09, Laurence Rowe wrote:
> It would be great if there was a way to advise ZODB in advance that
> certain objects would be required so it could fetch multiple object
> states in a single request to the storage server.

+1

I can see this being used to process a large tree, with objects being
processed as they are loaded (loads being pipelined).

Pseudo-code interface suggestion:

class IPipelinedStorage:
  def loadMany(oid_list, callback, tid=None, before_tid=None):
  callback being along the lines of:
    def callback(oid, data_record, tid, next_tid):
      if stop_condition:
        raise ... (StopIteration ? just anything ?)
      return more_oids_to_queue_for_loading
  tid and before_tid (mutually exclusive) specify the snapshot to use, to
  implement the equivalent of loadSerial and loadBefore.
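
To make the loadMany control flow concrete, here is a minimal sketch against
an in-memory dict. Everything in it is an assumption for illustration only:
ToyPipelinedStorage and StopLoading are hypothetical names, a "data record"
is reduced to a tuple of referenced oids, and loads run synchronously, whereas
a real storage would keep several requests in flight.

```python
from collections import deque


class StopLoading(Exception):
    """Hypothetical signal: raised by a callback to end loading early."""


class ToyPipelinedStorage:
    """In-memory stand-in, not a real ZODB storage."""

    def __init__(self, records):
        # records maps oid -> (data_record, tid, next_tid)
        self._records = records

    def loadMany(self, oid_list, callback, tid=None, before_tid=None):
        if tid is not None and before_tid is not None:
            raise ValueError("tid and before_tid are mutually exclusive")
        # This toy holds a single revision per oid, so the snapshot
        # arguments are validated but otherwise ignored.
        queue = deque()
        requested = set()

        def enqueue(oids):
            # Queue each oid at most once, in the order it was requested.
            for oid in oids or ():
                if oid not in requested:
                    requested.add(oid)
                    queue.append(oid)

        enqueue(oid_list)
        while queue:
            oid = queue.popleft()
            data_record, record_tid, next_tid = self._records[oid]
            try:
                more_oids = callback(oid, data_record, record_tid, next_tid)
            except StopLoading:
                return
            enqueue(more_oids)


# Here a "data record" is simply the tuple of oids the object refers to.
records = {
    "root": (("a", "b"), 1, None),
    "a": (("c",), 1, None),
    "b": (("c",), 1, None),
    "c": ((), 1, None),
}
storage = ToyPipelinedStorage(records)

seen = []


def record_and_follow(oid, data_record, tid, next_tid):
    seen.append(oid)
    return data_record  # queue every referenced oid


storage.loadMany(["root"], record_and_follow)
print(seen)
```

The callback decides what gets loaded next, so the caller never has to know
the shape of the tree in advance; "c" is referenced twice but loaded once.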

class IPipelinedConnection:
  def walk(ob, callback):
  callback being along the lines of:
    def callback(just_loaded_object, referee_list):
      # do something on just_loaded_object
      return filtered_referee_list
  referee_list would expose at least each referee's class (name ?), and hold
  its oid for Connection.walk internal use (only ?).
  Or maybe just ghosts, but the callback would have to take care not to
  unghostify them, as that would defeat the purpose of pipelining loads.
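
As a rough illustration of how such a walk could behave, here is a sketch
over plain Python objects. Node and its children attribute are hypothetical
stand-ins for persistent objects and their references; nothing here touches a
real Connection, and no ghost handling is modelled.

```python
from collections import deque


class Node:
    """Hypothetical stand-in for a persistent object."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def walk(ob, callback):
    """Visit ob, then whatever referees each callback call returns."""
    queue = deque([ob])
    queued = {id(ob)}
    while queue:
        current = queue.popleft()
        # The callback sees the loaded object and its direct referees,
        # and returns the subset that should be loaded next.
        wanted = callback(current, list(current.children))
        for referee in wanted:
            if id(referee) not in queued:
                queued.add(id(referee))
                queue.append(referee)


root = Node("root", [Node("leaf"), Node("skip_me", [Node("hidden")])])
visited = []


def prune_skipped(just_loaded_object, referee_list):
    visited.append(just_loaded_object.name)
    return [r for r in referee_list if not r.name.startswith("skip")]


walk(root, prune_skipped)
print(visited)
```

Filtering in the callback prunes whole subtrees: "skip_me" is never visited,
so "hidden" is never requested either.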

Above ZODB (persistent containers with internal persistent objects, like 
BTree):
  Implement an iterator over subobjects ignoring intermediate internal
  structure (think BTree.*Bucket classes).

A specific iteration order could probably be requested, to be able to
implement iterkeys and such in BTree for example, but storages may then have
to reorder loads when they happen in parallel (as in NEO, and as could
probably be implemented for zeoraid and RelStorage configured with multiple
mirrored databases). That limits latency/processing parallelism and could
lead to a memory footprint explosion.
So I think it should also be possible to request no particular loading order,
to get the lowest latency the backend can provide and a somewhat constant
memory footprint.
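
The memory concern can be illustrated with a small simulation. It assumes a
simplified model: records complete in some arbitrary order, and an ordered
delivery has to hold each record until all earlier requests have been
delivered. peak_buffered is a hypothetical helper, not part of any storage.

```python
def peak_buffered(request_order, completion_order):
    """Peak number of records held at once when delivering in request order."""
    position = {oid: i for i, oid in enumerate(request_order)}
    buffered = set()   # positions loaded but not yet delivered
    next_index = 0     # position that must be delivered next
    peak = 0
    for oid in completion_order:
        buffered.add(position[oid])
        peak = max(peak, len(buffered))
        # Deliver every record that is now contiguous with what was sent.
        while next_index in buffered:
            buffered.remove(next_index)
            next_index += 1
    return peak


requested = ["a", "b", "c", "d"]
in_order = peak_buffered(requested, ["a", "b", "c", "d"])
reversed_order = peak_buffered(requested, ["d", "c", "b", "a"])
print(in_order, reversed_order)
```

When completions arrive in request order only one record is ever held; in the
worst case every record must be kept until the first one arrives. Delivering
in completion order never holds more than the record just loaded.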

Any thoughts/comments?
-- 
Vincent Pelletier