[ZODB-Dev] Changing the pickle protocol?
Hanno Schlichting
hanno at hannosch.eu
Wed Apr 28 11:43:42 EDT 2010
On Wed, Apr 28, 2010 at 5:11 PM, Jim Fulton <jim at zope.com> wrote:
> Do you know of specific benefits you expect from protocol 2? Any
> specific reasons
> you think it would be better in practice?
I've seen some ongoing work on pickles recently, for example in the
Python 2.7 "What's New" notes:
- The pickle and cPickle modules now automatically intern the strings
used for attribute names, reducing memory usage of the objects
resulting from unpickling. (Contributed by Jake McGuire; issue 5084.)
- The cPickle module now special-cases dictionaries, nearly halving
the time required to pickle them. (Contributed by Collin Winter; issue
5670.)
Unless I've misread the code, these changes only apply to protocol 2.
And then there are the old claims in PEP 307 that pickling new-style
classes is more efficient with protocol 2.
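As a rough illustration of the PEP 307 claim, the sketch below compares the pickled size of one plain instance under protocols 0, 1 and 2. It is written for Python 3 (where every class is new-style) and uses argparse.Namespace only as a stand-in for an ordinary class with no custom pickling hooks; pickle_sizes is a hypothetical helper, and none of this is ZODB code.

```python
import pickle
from argparse import Namespace  # stand-in: a plain class with no custom pickling hooks


def pickle_sizes(obj, protocols=(0, 1, 2)):
    """Return {protocol: pickled size in bytes} for obj (hypothetical helper)."""
    return {p: len(pickle.dumps(obj, protocol=p)) for p in protocols}


sample = Namespace(x=1, y=2)
sizes = pickle_sizes(sample)
for proto, size in sorted(sizes.items()):
    print("protocol %d: %d bytes" % (proto, size))
```

Protocol 2 builds the instance with a dedicated opcode instead of a call to a reconstructor function, which is where the size difference for plain instances comes from. As noted later in the thread, ZODB pickles persistent objects differently, so this says little about real database records.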
Finally, Python 3 introduces pickle protocol version 3, which deals
explicitly with the new bytes type. There are more changes in Python 3
and the pickle format, so that's a separate project. But it suggested
to me that the pickle format isn't quite as "dead" as it used to be.
> I've avoided going to protocol 2 for two reasons:
>
> - It wasn't clear we'd get a benefit without deeper changes.
> Those deeper changes might be of value, but only if we're
> careful about how we make them.
>
> In particular, we could replace class names in pickles
> if we had a registry mapping ints to class names.
> This could provide a number of benefits beyond
> smaller pickles, but it needs some thought to get right.
Right. I'm not particularly interested in the pickle class registry.
Having a hard dependency between code filling the registry and the
actual data has all sorts of implications. I don't really want to go
there myself.
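For reference, the standard library already exposes such a registry through copyreg (copy_reg in Python 2). The sketch below is only an illustration under assumptions: it registers a stdlib class rather than anything from ZODB, and the code 240 is an arbitrary pick from the range PEP 307 sets aside for private use. It shows both the smaller pickle and the dependency problem: once the registration is gone, the data can no longer be loaded.

```python
import copyreg
import pickle
from collections import OrderedDict

EXT_CODE = 240  # assumption: arbitrary code from PEP 307's private-use range (240-255)

obj = OrderedDict(a=1)

# Without a registry entry the pickle spells out the module and class name.
plain = pickle.dumps(obj, protocol=2)

copyreg.add_extension("collections", "OrderedDict", EXT_CODE)
try:
    # With protocol 2 the class reference shrinks to a short extension opcode.
    compact = pickle.dumps(obj, protocol=2)
    restored = pickle.loads(compact)
finally:
    copyreg.remove_extension("collections", "OrderedDict", EXT_CODE)

# The compact pickle is unreadable once the registry entry is missing.
try:
    pickle.loads(compact)
    failed_without_registry = False
except (ValueError, pickle.UnpicklingError):
    failed_without_registry = True

print(len(plain), len(compact), failed_without_registry)
```

The extension codes only take effect with protocol 2 or higher, which is why the registry idea and the protocol change are tied together.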
> - I want zope.xmlpickle to work with ZODB database records and
> it doesn't support protocol 2 yet. This doesn't have to block
> moving to protocol 2, but I really would like to have this work
> if possible.
Ok. I know there are some tools reading the ZODB data on their own,
without actually using the APIs. I wouldn't want to break them if
there's no clear benefit.
> I'm skeptical that there would be enough benefit for protocol 2 without
> implementing a registry to take advantage of integer pickle codes.
>
> The other benefit of protocol 2 has to do with the way instance pickles are
> constructed and, for persistent objects, ZODB takes a very different
> approach anyway.
>
> I suggest doing some realistic experiments to look at the impact of the
> change.
>
> - Convert an interesting Zope 2 database from protocol 1 to protocol 2.
> How does this affect database size?
>
> - Do some sort of write and read benchmarks using the 2 protocols to
> see if there's a meaningful benefit.
Ok, thanks. That gives me enough direction to work on some specific benchmarks.
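A first write/read micro-benchmark along those lines could look like the sketch below. It is a minimal starting point, not a realistic test: make_sample and bench are hypothetical helpers, the data is a made-up list of small dicts rather than real ZODB records, and it runs on Python 3 with the plain pickle module.

```python
import pickle
import timeit


def make_sample(n=1000):
    """Build hypothetical stand-in data: a list of small dicts."""
    return [{"id": i, "title": "doc-%d" % i, "tags": ["a", "b"]} for i in range(n)]


def bench(obj, protocol, repeat=5, number=20):
    """Return (size_bytes, best_dump_seconds, best_load_seconds) for one protocol."""
    data = pickle.dumps(obj, protocol=protocol)
    dump_t = min(timeit.repeat(lambda: pickle.dumps(obj, protocol=protocol),
                               repeat=repeat, number=number))
    load_t = min(timeit.repeat(lambda: pickle.loads(data),
                               repeat=repeat, number=number))
    return len(data), dump_t, load_t


sample = make_sample()
for proto in (1, 2):
    size, dump_t, load_t = bench(sample, proto)
    print("protocol %d: %d bytes, dump %.4fs, load %.4fs"
          % (proto, size, dump_t, load_t))
```

For a meaningful answer the sample would need to be replaced with records from an actual Zope 2 database, as suggested above, and the timings taken through ZODB's own serialization path.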
Hanno