[Zope-dev] The Mysterious Key Error and Unrelated Search Result Bug

Michel Pelletier michel@digicool.com
Wed, 05 Apr 2000 01:19:45 -0700


Wow, what a mystery, you ready for this one?

In lib/python/SearchIndex/Lexicon.py, in the set method of the Lexicon
class, change:

        else:
            self._lexicon[intern(word)] = self.counter
            self.counter = self.counter + 1
            return self.counter

to

        else:
            self.counter = self.counter + 1
            self._lexicon[intern(word)] = self.counter
            return self.counter

<groans from the audience>

Yes, simple programmer error, folks: the old code stored a new word
under one id and then returned that id plus one, so the lexicon and its
callers disagreed about every new word.  Funny thing is, someone pointed
this out to me months ago and it's been fixed in the CVS the whole time,
I just never put two and two together on the key error.  So good news is
CVS users are vaccinated against this bug!  And yes, the rest of you
have suffered senselessly this whole time.

It was a bugger to track down, though.  When new documents are defined,
new words come in groups (and at first every word is new), so the
integers they map to are close enough to each other to make the
results *look* valid.  Only after a lot of unindexing and reindexing
did the false word ids drift far enough apart for sporadic 'key' errors
to pop up, depending on what search you did.
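
A rough sketch of why the ordering matters, not the actual Zope code:
plain dicts stand in for OIBTree, intern() is left out, the class names
BuggyLexicon and FixedLexicon are made up for illustration, and the
"word already known" branch is an assumption since only the else branch
is quoted above.

```python
# Toy stand-ins for the Lexicon class. Plain dicts replace OIBTree and
# the class names are hypothetical; only the else-branch ordering is
# taken from the post.

class BuggyLexicon:
    def __init__(self):
        self._lexicon = {}
        self.counter = 0

    def set(self, word):
        if word in self._lexicon:
            # Assumed behaviour for a known word: return the stored id.
            return self._lexicon[word]
        # Original order: store, then increment, then return.
        self._lexicon[word] = self.counter
        self.counter = self.counter + 1
        return self.counter


class FixedLexicon(BuggyLexicon):
    def set(self, word):
        if word in self._lexicon:
            return self._lexicon[word]
        # Fixed order: increment first so stored and returned ids agree.
        self.counter = self.counter + 1
        self._lexicon[word] = self.counter
        return self.counter


buggy = BuggyLexicon()
first_call = buggy.set("zope")    # id handed back when the word is new
second_call = buggy.set("zope")   # id found in the lexicon afterwards
print(first_call, second_call)    # the two ids differ by one

fixed = FixedLexicon()
fixed_first = fixed.set("zope")
fixed_second = fixed.set("zope")
print(fixed_first, fixed_second)  # the two ids are the same
```

With the buggy ordering, whatever gets filed under the returned id can
no longer be found through the id stored in the lexicon.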

Unfortunately, patching your 2.1 is more elaborate than just switching
those two lines: you also have to repair the damaged lexicon.  Bad thing
is there is no part of the Lexicon interface that lets you clear it, so
you have to add that.  This involves adding one method and patching
another.  First, make the first change above, and *in that same file*
add this method to the Lexicon class:

    def clear(self):
        self._lexicon = OIBTree()

Then, in lib/python/SearchIndex/UnTextIndex.py, in the clear() method of
the UnTextIndex class, add a call to the lexicon's clear(), like this:

    def clear(self):
        self._lexicon.clear()
        self._index = IOBTree()
        self._unindex = IOBTree()

Now update your catalog.  You'll get a brand new spanking catalog that
will NEVER key error on you again.
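
For the curious, here is a toy model of what the rebuild buys you.  It
is only a sketch under loud assumptions: dicts replace OIBTree and
IOBTree, and ToyLexicon, ToyTextIndex, index_document and search are
invented names, not the real SearchIndex API.  Only the two clear()
methods mirror the patch above.

```python
# Hypothetical miniature of a lexicon plus text index. Only the shape
# of the two clear() methods comes from the post; the rest is invented.

class ToyLexicon:
    def __init__(self):
        self._lexicon = {}          # word -> word id
        self.counter = 0

    def set(self, word):
        if word in self._lexicon:
            return self._lexicon[word]
        self.counter = self.counter + 1      # fixed ordering
        self._lexicon[word] = self.counter
        return self.counter

    def get(self, word):
        return self._lexicon[word]

    def clear(self):
        self._lexicon = {}


class ToyTextIndex:
    def __init__(self, lexicon):
        self._lexicon = lexicon
        self._index = {}            # word id -> set of document ids
        self._unindex = {}          # document id -> list of word ids

    def clear(self):
        self._lexicon.clear()       # the call the patch adds
        self._index = {}
        self._unindex = {}

    def index_document(self, doc_id, text):
        word_ids = [self._lexicon.set(w) for w in text.split()]
        self._unindex[doc_id] = word_ids
        for word_id in word_ids:
            self._index.setdefault(word_id, set()).add(doc_id)

    def search(self, word):
        # Raises KeyError when lexicon and index disagree on the id.
        return self._index[self._lexicon.get(word)]


lex = ToyLexicon()
idx = ToyTextIndex(lex)

# Simulate damage left by the old bug: lexicon says 0, index filed under 1.
lex._lexicon = {"zope": 0}
lex.counter = 1
idx._index = {1: {7}}
idx._unindex = {7: [1]}

try:
    idx.search("zope")
    before_failed = False
except KeyError:
    before_failed = True

idx.clear()                          # wipe lexicon and index together
idx.index_document(7, "zope rocks")  # stands in for "update your catalog"
after = idx.search("zope")
print(before_failed, after)
```

Note that clear() as given does not reset the counter; new ids simply
carry on from wherever it was, which is harmless as long as the lexicon
and the index are rebuilt together.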

If you don't know Python and you don't know what you're doing, don't do
this.  Yes, I'll release a patch for 2.1.6, and the fix is already in the
CVS.  I need some confirmation on the fix first.  There are at least a
few people out there who won't be driven crazy by this bug anymore.

Kudos go to Darran Edmundson for reproducing the bug in one step for
me.  Someone else made the connection that this also causes the
Mysterious Unrelated Search Result bug, but I can't open my mailbox right
now to see who should get that credit.

-Michel