Re: [Zope-dev] TreeVocabulary in zope.schema.vocabulary

29 Jan 2012

      Hiya,

Am 27.01.2012, 12:56 Uhr, schrieb Jan-Carel Brand <lists@opkode.com>:
...
...
Hang on a minute! While I'm not 100 % convinced of the need in the core  
I
think a separate package just for TreeVocabulary would be splitting  
hairs.
If z3c.form can use it then I think that is justification enough.
...
Justification enough to put it in zope.schema?
Yes.
...
From the zope.schema 4.0.0 release notes:
"Port to Python 3. This adds a dependency on six and removes support for
Python 2.5"
Shouldn't that also mean a preference of
"for key in _dict" over "for key in _dict.keys()" ?

Though you might prefer .items() in _getPathToTreeNode()

i.e
    def _getPathToTreeNode(self, tree, node)

         path = []
         for parent, child in tree.items():
             if node == parent.value:
                 return [node]
             path = self._getPathToTreeNode(child, node)
             if path:
                 path.insert(0, parent.value)
                 break
         return path
...
So if we want to use OrderedDict (which is from Python >= 2.7), we just
need to bridge the Python 2.6 version with the ordereddict package:
http://pypi.python.org/pypi/ordereddict/1.1
This would introduce a new dependency for zope.schema on Python 2.6, I
don't know if that's acceptable or not.
I think it's perfectly justified in this case and similar to what has  
happened with other libraries like ElementTree in the past that have made  
life easier and subsequently been adopted by the standard library.
...
In setup.py one could specify the extra dependency only for Python <
2.7:
...
import sys
REQUIRES = ['setuptools',
      'zope.interface >= 3.6.0',
      'zope.event',
      'six',
     ]
if sys.version_info < (2 , 7):
    REQUIRES += ['ordereddict'],
setup(
# [...]
    install_requires=REQUIRES,
# [...]
Yep.
...
...
Back to bike-shedding. As I was intrigued by the whole thing I've spent
some time looking at the code. I'm not too happy on the use of nested
functions as I find they obscure code, particularly when used  
recursively.
I think that "createTree" and "recurse" should be written as separately
testable generators.
...
Ok, I've refactored the nested methods out into class or instance
methods.
I however don't see how one could use a generator for the recursive
methods that return dictionaries.
With regards to the "recurse" method (now called _getPathToTreeNode), I
don't see how one could use a generator in a more efficient manner than
the normal recursion that's being used currently.
I played around with it and the best I could come up with is this:
       def generator(_dict, value, path=[]):
            if value in _dict.keys():
                yield path+[value]
           for key in _dict.keys():
                for path in recurse(_dict[key], value, path+[key]):
                    if value in path:
                        yield path
You still have to recurse through the different branches until you find
the node that you are looking for and you still need to store the path
in a list. So what would be the added value?
What's more, the generator returns a list within a list, like so:
[['Regions', 'Germany', 'Bavaria']], which I find clunky.
You're probably right. I think I was getting ahead of myself wondering  
about possible issues with deeply nested vocabularies. Any real  
improvement would probably involve a node index. I notice that the  
examples do not allow for departments in regions to have the same name as  
the region (common enough in some countries) so you could simply add the  
index keyed by node value with the path as the value when the class is  
created. This should be possible by calling _getPathToTreeNode during one  
of the passes through _flattenTree. getTermPath would then just need to do  
a lookup on this.
...
...
I also don't see a need for createTerm in this
particular class and the subsequent awkward call from createTree. As it
stands it is copy & paste both in method and test.
...
The reason for the "createTerm" method in the SimpleVocabulary is to
allow people to override it to provide support for other term types.
This also applies to the TreeVocabulary.
I think this is an example of excessive pluggability. createTerm isn't in  
an interface and I've never come across the need to overwrite it. I would  
just drop it from the implementation. Problem solved. fromDict() seems to  
be the only class method required.
...
...
If you must have it with the same implementation
e
createTree = SimpleVocabulary.createTree
...
does the job just fine
I guess you mean "createTerm" ?
I'm not convinced that this is better.
This creates a dependency on a method from another class that's not
being subclassed.
...
What if createTerm is changed in the SimpleVocabulary in such a way that
doesn't take the TreeVocabulary into account?
That would be the case with any subclass. In terms of maintenance it would  
be easier to spot in case the changed behaviour was desirable.
...
...
but I don't see the advantage of
cls.createTerm(*args) over SimpleTerm(*args)
See above. "createTerm" is there to let developers override it and
provide their own term objects.
Do you have a concrete use case for this? Remember that createTerm is a  
convenience method only. Frankly, I don't see the need for it in what is a  
fairly specialised class.
...
...
As I said this is bike-shedding but I think our source code should be
written with a view to being read and probably copied verbatim.
With that in mind I prefer readability and testability over integration.
So why then cannot I copy "createTerm" from SimpleVocabulary?
For exactly that reason: just because someone writing application might  
copy & paste your code is should be reason enough to make the code as  
clean as possible and that does mean DRY.
...
...
In the end it tends to make things easier to use. The exceptions where  
refactoring to
produce slightly uglier code but with significant performance hopefully
prove the rule.
...
Well, your suggestions concerning the nested methods had me thinking and
in the end resulted in significant refactoring. I think it was worth it
though, so thanks.
I've enclosed a diff with proposed changes.

Charlie
-- 
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226