[Zope3-Users] zodb objects backup

Mon Feb 27 10:42:23 EST 2006

On Feb 26, 2006, at 4:33 PM, Shaun Cutts wrote:

> Thanks Gary!
>
> Ah -- very nice: so Data.fs *is* a transaction log. In theory an RDBMS
> with write ahead logging is still more secure because the transaction
> log is only backup, and the rest of the database is another copy of the
> current state (though not with undo capability).
>
> But with replication, this issue is taken care of. (Too bad replication
> isn't part of the core functionality....)

Some people use ZRS, some people have strategies with  
DirectoryStorage, some people use repozo as described in the first  
link I sent, some are exploring other options like PGStorage.

> Also nice is
> http://www.python.org/workshops/2000-01/proceedings/papers/fulton/zodb3.html#pgfId=294502
> Section 3.1... so ZODB is effectively doing MVCCS and with per-object
> locks to resolve conflicts.

That paper is old: the ZODB is doing MVCC now with full views of the  
database at the time of transaction start.  There's a doc in the wiki  
describing it.

> (Question: can one explicitly lock an object
> without changing it? I guess just setting _p_changed?)

That will mark it as changed whether or not it was, yes.  I'm pretty  
sure (but notice caveat) that this will "dirty" the object, as far as  
write conflicts are concerned, whether or not the object actually  
changed.

> Are there any benchmarks available?

I believe there is a ZODB bench somewhere.  I don't know much about it.

>
> We can't abandon Postgres entirely:
>   1) we have custom aggregate statistical functions in C
>   2) we have to allow third-party ODBC access to certain views
>   3) general lack of query language potentially problematic for
> datamining

Especially for third parties (non-Zope/ZODB experts).  Two  
"howevers": first, I'm led to believe that datamining generally  
happens externally from apps anyway, so the Postgres slave idea (that  
you have below) would work quite well. Second, even in a ZODB app, as  
with an SQL app, if you know where the data resides, know the  
available indexes, know how to build and populate new indexes, and  
know how to intersect and union results, you can do just about  
anything you want.  The difference is just that "everyone" knows SQL  
spelling for that stuff, and far fewer know the nitty-gritty of  
spelling that with Zope 3 indexes.
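
To make the "intersect and union" point concrete, here is a toy sketch
in plain Python.  It does not use the Zope 3 index or catalog APIs: the
FieldIndex class, the sample records, and the field names are all made
up for illustration.  Each index maps a value to the set of integer ids
of the objects holding that value, and a query is then just set
arithmetic on those ids.

```python
from collections import defaultdict


class FieldIndex:
    """Toy index (hypothetical, not a Zope 3 class): value -> set of doc ids."""

    def __init__(self):
        self._forward = defaultdict(set)

    def index_doc(self, docid, value):
        self._forward[value].add(docid)

    def apply(self, value):
        # Return a copy so callers cannot mutate the index by accident.
        return set(self._forward.get(value, ()))


# Made-up records standing in for persistent objects.
records = {
    1: {"city": "Boston", "status": "active"},
    2: {"city": "Boston", "status": "closed"},
    3: {"city": "Austin", "status": "active"},
}

city_index = FieldIndex()
status_index = FieldIndex()
for docid, record in records.items():
    city_index.index_doc(docid, record["city"])
    status_index.index_doc(docid, record["status"])

# SQL spelling: WHERE city = 'Boston' AND status = 'active'
boston_and_active = city_index.apply("Boston") & status_index.apply("active")

# SQL spelling: WHERE city = 'Austin' OR status = 'closed'
austin_or_closed = city_index.apply("Austin") | status_index.apply("closed")
```

The real work in a ZODB app is the part this toy skips: deciding which
indexes to keep, and keeping them populated as objects change.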

> But 1)-2)-3) for us are "read-only" needs, so in theory, with
> replication, we could use Postgres as a slave to ZODB master.

Yes, I've considered an architecture like that recently myself for  
some projects.  Other approaches are to use the Zope database  
adapters (which handle the transaction machinery), and then write  
simple wrappers that produce throw-away, non-persistent objects that  
persist the data in Postgres.  Another would be to monetarily support  
someone like Shane to see if a solution like the PG storage will help.
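
A rough sketch of the "throw-away wrapper" idea, using only the Python
standard library.  Big assumptions: sqlite3 stands in for Postgres, and
the connection is opened and committed by hand rather than through a
Zope database adapter, so none of the Zope transaction integration is
shown.  The Customer class, the customers table, and its columns are
made up.

```python
import sqlite3


class Customer:
    """Transient wrapper (hypothetical): holds no data of its own.

    Every attribute read or write goes straight to the relational table,
    so the object can be thrown away and rebuilt from its id at any time.
    """

    def __init__(self, conn, customer_id):
        self._conn = conn
        self._id = customer_id

    @property
    def name(self):
        row = self._conn.execute(
            "SELECT name FROM customers WHERE id = ?", (self._id,)
        ).fetchone()
        return row[0] if row else None

    @name.setter
    def name(self, value):
        self._conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (value, self._id)
        )


# sqlite3 in memory is a stand-in for a real Postgres connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.commit()

first = Customer(conn, 1)
first.name = "Grace"
conn.commit()  # in a Zope app the transaction machinery would do this

second = Customer(conn, 1)  # a fresh wrapper sees the same stored data
current_name = second.name
```

Because the wrapper never caches anything, the relational database
stays the single source of truth and third-party tools see every
change as soon as it is committed.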

I would not encourage (and, perhaps too gently, have not encouraged)  
someone without either a lot of ZODB knowledge or a lot of time and  
energy to become a very deep ZODB expert to pursue the __getstate__/
__setstate__ approach you showed.  It's an interesting idea, but you  
are really bypassing huge chunks of the ZODB machinery, probably to  
your loss.  Much safer to deal with the Zope DBA stuff (persistent  
data in transient objects) or, if you are an expert or want to be  
one, with an approach like Shane's.

> Again, benchmarks would be nice. We haven't yet spec'd out, let alone
> bought, the hardware for our production system, so I couldn't yet say
> how high the bar is.

I don't have these, and I'm not even sure exactly what you want.

Gary