[also reported to collector, but this breaks textindex Catalogs totally, so I thought it was also worth cc'ing to the list] The current CVS version of ZCatalog has a couple of nasty bugs in it. First off, the standard nonglobbing Lexicon uses a variable 'self.counter', without ever initialising it. Oops. Secondly, Catalog handles the default Lexicon case inconsistently. In the case of a lexicon being provided, it stores the name as self.lexicon, otherwise it stores a Lexicon _object_ as self.lexicon. Later on, it tries to do a getattr with the lexicon object as the second arg. The first bug utterly breaks non-globbing Lexicons. The second bug breaks textindexes that don't provide a lexicon at creation time. A patch for both follows. Note that the patch tries to handle gracefully existing Catalogs and Lexicons that might be broken. The complete fix would also handle the case of the Lexicon being created in a saner way - maybe always store the name... ? If the email system munges this patch, it's also at http://www.zope.org/Members/anthony/patches/zcatalog-lexicon.diff Thanks, Anthony Index: Lexicon.py =================================================================== RCS file: /cvs-repository/Zope2/lib/python/SearchIndex/Lexicon.py,v retrieving revision 1.8 diff -u -r1.8 Lexicon.py --- Lexicon.py 2000/02/01 18:40:20 1.8 +++ Lexicon.py 2000/02/19 08:55:16 @@ -115,6 +115,7 @@ def __init__(self): self._lexicon = OIBTree() + self.counter = 0 def set(self, word): """ return the word id of 'word' """ @@ -123,6 +124,8 @@ return self._lexicon[word] else: + if not hasattr(self, 'counter'): + self.counter = 0 self._lexicon[intern(word)] = self.counter self.counter = self.counter + 1 return self.counter Index: UnTextIndex.py =================================================================== RCS file: /cvs-repository/Zope2/lib/python/SearchIndex/UnTextIndex.py,v retrieving revision 1.19 diff -u -r1.19 UnTextIndex.py --- UnTextIndex.py 2000/01/31 19:25:35 1.19 +++ UnTextIndex.py 2000/02/19 08:55:16 @@ -172,7 +172,10 @@ Zope. I don't think indexes were ever intended to participate in this way, but I don't see too much of a problem with it. """ - vocab = getattr(self, vocab_id) + if type(vocab_id) is not type(""): + vocab = vocab_id + else: + vocab = getattr(self, vocab_id) return vocab.lexicon -- Anthony Baxter <anthony@interlink.com.au> It's never too late to have a happy childhood.
Anthony Baxter wrote:
[also reported to collector, but this breaks textindex Catalogs totally, so I thought it was also worth cc'ing to the list]
The current CVS version of ZCatalog has a couple of nasty bugs in it.
First off, the standard nonglobbing Lexicon uses a variable 'self.counter', without ever initialising it. Oops.
Aye Carumba.
Secondly, Catalog handles the default Lexicon case inconsistently. In the case of a lexicon being provided, it stores the name as self.lexicon, otherwise it stores a Lexicon _object_ as self.lexicon. Later on, it tries to do a getattr with the lexicon object as the second arg.
Yep, the intention is to be stored by name (so it can be acquired) the old was was to have the object stored in the text index, which is no longer the case for a fresh checkout but will break old catalogs that upgrade to the new code. Your patch fixes the symptom, but not the problem which was me not providing a __setstate__ method to convert old catalogs to the new way.
The first bug utterly breaks non-globbing Lexicons. The second bug breaks textindexes that don't provide a lexicon at creation time.
Yep.
A patch for both follows. Note that the patch tries to handle gracefully existing Catalogs and Lexicons that might be broken.
I've added counter as a class attribute to Lexicon, this should fix bug #1 and take care of lexicons that might not have a counter attribute.
The complete fix would also handle the case of the Lexicon being created in a saner way - maybe always store the name... ?
Yep, I'm currently scratching my head over a setstate that will fix this, it's a bit tricky because now lexicon objects are contained in Vocabularies which are Zope managable objects that are acquired by indexes. This means I have to take the old object and wedge it somewhere in Zope to where it can be acquired (probably in the catalog that contained the index that contained the lexicon). I'm sort of adverse to doing such a complex operation in __setstate__, Jim, do you this this would be a bad idea? (to add an object to Zope from another object's __setstate__) Thanks Anthony, -Michel
participants (2)
-
Anthony Baxter -
Michel Pelletier