[Zope-dev] SVN: Zope/branches/2.13/ fix `LazyMap` to avoid unnecessary function calls when not accessing items in order (fixes http://dev.plone.org/plone/ticket/9018)

Tres Seaver tseaver at palladion.com
Tue Dec 14 12:11:57 EST 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/14/2010 09:12 AM, Andreas Zeidler wrote:
> Log message for revision 118863:
> fix `LazyMap` to avoid unnecessary function calls when not accessing
> items in order (fixes http://dev.plone.org/plone/ticket/9018)

First, thanks for looking into this issue.

> Changed:
>   U   Zope/branches/2.13/doc/CHANGES.rst
>   U   Zope/branches/2.13/src/Products/ZCatalog/Lazy.py
>   U   Zope/branches/2.13/src/Products/ZCatalog/tests/test_lazy.py
> 
> -=-
> Modified: Zope/branches/2.13/doc/CHANGES.rst
> ===================================================================
> --- Zope/branches/2.13/doc/CHANGES.rst	2010-12-14 13:24:24 UTC (rev 118862)
> +++ Zope/branches/2.13/doc/CHANGES.rst	2010-12-14 14:12:11 UTC (rev 118863)
> @@ -11,6 +11,8 @@
>  Bugs Fixed
>  ++++++++++
>  
> +- Fix `LazyMap` to avoid unnecessary function calls.
> +
>  - LP 686664: WebDAV Lock Manager ZMI view wasn't accessible.
>  
>  2.13.1 (2010-12-07)
> 
> Modified: Zope/branches/2.13/src/Products/ZCatalog/Lazy.py
> ===================================================================
> --- Zope/branches/2.13/src/Products/ZCatalog/Lazy.py	2010-12-14 13:24:24 UTC (rev 118862)
> +++ Zope/branches/2.13/src/Products/ZCatalog/Lazy.py	2010-12-14 14:12:11 UTC (rev 118863)
> @@ -145,44 +145,28 @@
>      # Don't access data until necessary
>  
>      def __init__(self, func, seq, length=None):
> -        self._seq = seq
> -        self._data = []
> -        self._func = func
> +        self._seq=seq
> +        self._func=func
>          if length is not None:
> -            self._len = length
> +            self._len=length

Please don't un-PEP8 a cleaned-up module:  leave spaces around
operators.  If that was unintentional (maybe you pasted from a copy of a
file made before the module was cleaned up?), then you still need to
review the diff and edit out unintended changes first.

>          else:
>              self._len = len(seq)
> +        self._marker = object()
> +        self._data = [self._marker] * self._len

Have you measured the impact of pre-allocating here for very large
sequences?  You are trading off space in return for speed for the
(relatively) rare case of random-access indexing off-the end.

I'm *sure* that is a not a good trade-off for all cases:  the catalog is
often used for queries which return result sets in the order of 10^5 or
more objects, and *very* rarely is it accessed randomly (I had never
seen it before reading the bug entry you cite).  The normal case is to
take the first few entries (batch_size * batch_number) from the huge set.

I think you would be better off finding a way to inject your style of
LazyMap into the catalog for the random-access case, e.g., by passing
'_lazy_map' into the methods you are using.

BTW, another interesting alternative impplementation might be to use a
dictionary instead of a list for 'data'.  You could then avoid any
pre-allocation at all.

> -    def __getitem__(self, index):
> -        data = self._data
> +    def __getitem__(self,index):
> +        data=self._data
>          try:
> -            s = self._seq
> +            s=self._seq
>          except AttributeError:
>              return data[index]

While we're at it, the 'try: ... except:'  here is wasteful and slow.
Lazy objects aren't persistent, and the '_seq' attribute is added
unconditionally in '__init__'.

> -        i = index
> -        if i < 0:
> -            i = len(self) + i
> -        if i < 0:
> -            raise IndexError(index)
> +        value = data[index]
> +        if value is self._marker:
> +            value = data[index] = self._func(s[index])
> +        return value
> -        ind = len(data)
> -        if i < ind:
> -            return data[i]
> -        ind = ind - 1
>  
> -        func = self._func
> -        while i > ind:
> -            try:
> -                ind = ind + 1
> -                data.append(func(s[ind]))
> -            except IndexError:
> -                del self._func
> -                del self._seq
> -                raise IndexError(index)
> -        return data[i]
> -
> -

Finally, whatever cleanup we settle on also needs to be forward-ported
to the trunk.


Tres.
- -- 
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk0HpV0ACgkQ+gerLs4ltQ4FsQCg2zpfEn+Hih6lNBqboQJDDBl5
7cIAn1rnqKtXsBv7j/s+PlcK0Nals7m1
=b/uG
-----END PGP SIGNATURE-----



More information about the Zope-Dev mailing list