On Thu, Jul 14, 2011 at 7:03 PM, Tres Seaver <tseaver@palladion.com> wrote:
A further micro-optimization is to pre-allocate a big list in a thread local. Running the attached script produces::

    $ python perftest.py
    Via insert_0: 24.8401839733
    Via append: 15.6596238613 speedup: 37.0%
    Via tuple_add: 21.9555268288 speedup: 11.6%
    Via prealloc: 10.5278339386 speedup: 57.6%

Thanks for that test! I've run it with a much more reasonable test data set from one of our live databases, for example:

    letters = ['', 'nordic', 'en', 'nordic-council',
               'organisation-and-structure', 'committees',
               'welfare-committee', 'meetings-and-meeting-documents',
               'meeting-of-the-welfare-committee-31-march-2011-stockholm']

which produces quite a different result:

    Via insert_0: 4.35216093063
    Via append: 4.80827713013 speedup: -10.5%
    Via tuple_add: 1.73557782173 speedup: 60.1%
    Via prealloc: 2.17882084846 speedup: 49.9%

If I run it with a smaller segment, like:

    letters = ['', 'nordic', 'en', 'nordic-council']

the effect is even clearer:

    Via insert_0: 1.88715004921
    Via append: 2.91810202599 speedup: -54.6%
    Via tuple_add: 0.809303998947 speedup: 57.1%
    Via prealloc: 1.56584882736 speedup: 17.0%

Looks like the tuple add is faster until you hit very long sequences. But even if you had such a nested ZODB structure, the majority of lookups would still happen for the much shorter sequences. So I think the tuple add is the winner for this case.

Hanno
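
For readers without the attachment, here is a rough sketch of what a comparison like this could look like. The original perftest.py is not reproduced in this thread, so everything below is an assumption: the function names (via_insert_0 and friends), the way each strategy builds the path from leaf to root, the buffer size _MAX_DEPTH and the iteration count are all hypothetical. The speedup is computed as 1 - t / t_insert_0, which is consistent with the percentages quoted above.

```python
import threading
import timeit

# Path segments from the shorter example in the message above.
SEGMENTS = ['', 'nordic', 'en', 'nordic-council']

# Hypothetical size of the pre-allocated buffer (an assumption).
_MAX_DEPTH = 100
_local = threading.local()


def via_insert_0(segments):
    """Walk from leaf to root, inserting each id at the front of a list."""
    path = []
    for seg in reversed(segments):
        path.insert(0, seg)
    return tuple(path)


def via_append(segments):
    """Walk from leaf to root, append each id, then reverse once."""
    path = []
    for seg in reversed(segments):
        path.append(seg)
    path.reverse()
    return tuple(path)


def via_tuple_add(segments):
    """Walk from leaf to root, prepending by tuple concatenation."""
    path = ()
    for seg in reversed(segments):
        path = (seg,) + path
    return path


def via_prealloc(segments):
    """Fill a per-thread pre-allocated list from the end, then slice it."""
    if len(segments) > _MAX_DEPTH:
        raise ValueError('path deeper than the pre-allocated buffer')
    buf = getattr(_local, 'buf', None)
    if buf is None:
        buf = _local.buf = [None] * _MAX_DEPTH
    idx = _MAX_DEPTH
    for seg in reversed(segments):
        idx -= 1
        buf[idx] = seg
    return tuple(buf[idx:])


def compare(segments, number=100000):
    """Time each strategy and print its speedup relative to insert_0."""
    funcs = [via_insert_0, via_append, via_tuple_add, via_prealloc]
    timings = {}
    for func in funcs:
        timings[func.__name__] = timeit.timeit(
            lambda: func(segments), number=number)
    base = timings['via_insert_0']
    for name, elapsed in timings.items():
        line = '%s: %s' % (name, elapsed)
        if name != 'via_insert_0':
            line += ' speedup: %.1f%%' % (100.0 * (1 - elapsed / base))
        print(line)
    return timings


if __name__ == '__main__':
    compare(SEGMENTS)
```

Absolute timings from a sketch like this will differ from the numbers in this thread, which came from a different script and interpreter, so only the relative ordering for a given segment count is worth comparing.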