[ZODB-Dev] Backing up a ZODB

Wed Oct 29 22:31:58 EST 2003

[Tim Peters]
>>>> That little status byte is first set to 'c', then the new
>>>> transaction is appended, and then it seeks back to the location of
>>>> the status byte and overwrites it with a blank.  That's the way in
>>>> which filestorage isn't changed solely by appending.
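
For readers following along, the write-then-overwrite sequence described above can be sketched roughly as follows. This is a toy illustration, not FileStorage's actual code or on-disk format: the record layout (an 8-byte length, one status byte, then the payload) and the names HEADER, write_pending, and finish are all assumptions made up for the example.

```python
import io
import os
import struct

# Hypothetical toy layout: 8-byte big-endian payload length, then 1 status byte.
HEADER = struct.Struct(">Qc")


def write_pending(f, payload):
    """Append a record whose status byte is 'c' and return that byte's offset."""
    f.seek(0, os.SEEK_END)
    start = f.tell()
    f.write(HEADER.pack(len(payload), b"c"))
    f.write(payload)
    f.flush()  # a real storage would presumably also fsync before voting "yes"
    return start + 8  # the status byte sits right after the 8-byte length


def finish(f, status_pos):
    """Seek back and overwrite the 'c' status byte with a blank."""
    f.seek(status_pos)
    f.write(b" ")
    f.flush()
    f.seek(0, os.SEEK_END)


f = io.BytesIO()
pos = write_pending(f, b"abc")   # record is on "disk" but still marked 'c'
finish(f, pos)                   # the one write that is not an append
```

The seek in finish() is the only step that touches bytes before the end of the file, which is the sense in which the file isn't changed solely by appending.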

[Neil Schemenauer]
>>> Interesting.  Is that done because the size of the transaction is
>>> not known before it is written?

[Tim]
>> We'd have to channel Jim to know why it was done this way to begin
>> with.

[Jeremy Hylton]
> We do it that way to implement two-phase commit.  The FileStorage
> can't finish the first phase until it can guarantee that it can
> commit the data.  Our implementation of "guarantee" is that all the
> bytes are written and flushed before tpc_vote() returns.  It's still
> possible for the transaction to abort at this point.  If there is a
> catastrophic failure after tpc_vote(), we don't want to leave the
> Data.fs with a valid transaction record for a transaction that never
> committed.

That doesn't explain why Neil's mental model is inappropriate:

>>>    <transaction size><transaction data><status byte>...

That is, his question was why we don't append the status byte, rather than
writing it first then seeking back to overwrite it.  If catastrophe ensues,
FileStorage would either truncate the .fs file (which it can do now too), or
leave a partial record at the end of the .fs file (which it can also do now,
in case of e.g. sudden power failure).  So what does writing the status byte
first buy us that couldn't be gotten by writing it last?
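
To make the alternative concrete, here is a minimal sketch of the status-byte-last layout, assuming a toy record of an 8-byte length, the payload, and then one trailing status byte. The names LEN, append_voted, append_status, and committed_end are hypothetical and exist only for this illustration; nothing here is taken from the FileStorage source.

```python
import struct

# Hypothetical toy layout: <8-byte length><payload><1 status byte>
LEN = struct.Struct(">Q")


def append_voted(buf, payload):
    """First phase: append the length and the data, but no status byte yet."""
    buf += LEN.pack(len(payload)) + payload


def append_status(buf, status=b" "):
    """Second phase: the commit is a pure append of the status byte."""
    buf += status


def committed_end(data):
    """Return the offset just past the last record that has its status byte."""
    pos = 0
    while len(data) - pos >= LEN.size:
        (n,) = LEN.unpack_from(data, pos)
        end = pos + LEN.size + n + 1  # +1 for the trailing status byte
        if end > len(data):
            break  # partial record at the end: truncate back to pos
        pos = end
    return pos


buf = bytearray()
append_voted(buf, b"one")
append_status(buf)
append_voted(buf, b"two")        # voted, but never finished
cut = committed_end(buf)         # recovery would truncate the file here
```

In this sketch a crash between the two phases leaves a record with no status byte, which recovery treats like any other partial record at the end of the file.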

Of course the question seems academic: it's too late to change the format,
and a safe incremental backup strategy would still need to guard against
copying an incomplete transaction at the end of the file.
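
As a rough sketch of that kind of guard, a backup tool could scan the records and copy only up to the end of the last one that is both complete and no longer marked 'c'. The layout below (an 8-byte length, a status byte, then the payload) is a simplified stand-in rather than the real Data.fs format, and HEADER and safe_copy_length are names invented for the example.

```python
import struct

# Hypothetical toy layout: <8-byte length><1 status byte><payload>
HEADER = struct.Struct(">Qc")


def safe_copy_length(data):
    """Return how many leading bytes are safe to copy into a backup."""
    pos = 0
    while len(data) - pos >= HEADER.size:
        n, status = HEADER.unpack_from(data, pos)
        end = pos + HEADER.size + n
        if end > len(data) or status == b"c":
            break  # truncated record, or one still marked as in progress
        pos = end
    return pos


done = HEADER.pack(3, b" ") + b"abc"
pending = HEADER.pack(2, b"c") + b"xy"
n_safe = safe_copy_length(done + pending)  # stops before the pending record
```

A real tool would also have to cope with the file growing while it is being read, which this sketch ignores.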