Hiya,

On 27.01.2012 at 12:56, Jan-Carel Brand <lists@opkode.com> wrote:
Hang on a minute! While I'm not 100 % convinced of the need in the core I think a separate package just for TreeVocabulary would be splitting hairs. If z3c.form can use it then I think that is justification enough.
Justification enough to put it in zope.schema?
Yes.
From the zope.schema 4.0.0 release notes: "Port to Python 3. This adds a dependency on six and removes support for Python 2.5"
Shouldn't that also mean a preference for "for key in _dict" over "for key in _dict.keys()"? Though you might prefer .items() in _getPathToTreeNode(), i.e.

    def _getPathToTreeNode(self, tree, node):
        path = []
        for parent, child in tree.items():
            if node == parent.value:
                return [node]
            path = self._getPathToTreeNode(child, node)
            if path:
                path.insert(0, parent.value)
                break
        return path

So if we want to use OrderedDict (which is in the standard library from Python 2.7 onwards), we just need to bridge the gap on Python 2.6 with the ordereddict package: http://pypi.python.org/pypi/ordereddict/1.1

This would introduce a new dependency for zope.schema on Python 2.6. I don't know whether that's acceptable or not.
I think it's perfectly justified in this case and similar to what has happened with other libraries like ElementTree in the past that have made life easier and subsequently been adopted by the standard library.
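On the import side, the bridging could look roughly like the sketch below. It assumes the ordereddict backport is installed on Python 2.6 and exposes an OrderedDict class under that module name; the tree contents are made-up sample data, not part of zope.schema.

```python
# Prefer the standard-library class (Python >= 2.7) and fall back to the
# ordereddict backport from PyPI on Python 2.6.
try:
    from collections import OrderedDict
except ImportError:
    from ordereddict import OrderedDict

# Sample data only: a small nested tree whose insertion order matters.
tree = OrderedDict()
tree['Regions'] = OrderedDict()
tree['Regions']['Germany'] = OrderedDict()
tree['Regions']['Austria'] = OrderedDict()

# Iteration follows insertion order, which is the reason for using
# OrderedDict rather than a plain dict here.
region_names = list(tree['Regions'])
```

The rest of the code can then use OrderedDict without caring which Python version it runs on.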
In setup.py one could specify the extra dependency only for Python < 2.7:
    import sys
    from setuptools import setup

    REQUIRES = [
        'setuptools',
        'zope.interface >= 3.6.0',
        'zope.event',
        'six',
    ]

    # Only Python 2.6 needs the ordereddict backport.
    if sys.version_info < (2, 7):
        REQUIRES += ['ordereddict']

    setup(
        # [...]
        install_requires=REQUIRES,
        # [...]
    )
Yep.
Back to bike-shedding. As I was intrigued by the whole thing I've spent some time looking at the code. I'm not too happy about the use of nested functions as I find they obscure code, particularly when used recursively. I think that "createTree" and "recurse" should be written as separately testable generators.

Ok, I've refactored the nested methods out into class or instance methods. However, I don't see how one could use a generator for the recursive methods that return dictionaries. With regard to the "recurse" method (now called _getPathToTreeNode), I don't see how one could use a generator in a more efficient manner than the normal recursion that's being used currently. I played around with it and the best I could come up with is this:

    def generator(_dict, value, path=[]):
        if value in _dict.keys():
            yield path + [value]
        for key in _dict.keys():
            for subpath in generator(_dict[key], value, path + [key]):
                if value in subpath:
                    yield subpath

You still have to recurse through the different branches until you find the node that you are looking for, and you still need to store the path in a list. So what would be the added value? What's more, the generator returns a list within a list, like so: [['Regions', 'Germany', 'Bavaria']], which I find clunky.
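For what it's worth, the list-within-a-list only appears when the whole generator is consumed with list(). A minimal sketch, assuming a plain nested dict keyed by strings rather than term objects (iter_paths and the sample tree are hypothetical names, not the code in the diff):

```python
def iter_paths(tree, value, path=()):
    """Yield each path (as a list) from the root to a key equal to value."""
    for key, subtree in tree.items():
        if key == value:
            yield list(path) + [key]
        # Descend into the branch, extending the path by the current key.
        for found in iter_paths(subtree, value, path + (key,)):
            yield found


# Sample data only.
tree = {'Regions': {'Germany': {'Bavaria': {}}, 'Austria': {'Tyrol': {}}}}

# Consuming everything gives a list of paths ...
all_paths = list(iter_paths(tree, 'Bavaria'))
# ... whereas next() stops at the first match and returns a flat path.
first_path = next(iter_paths(tree, 'Bavaria'), None)
# A value that is not in the tree gives the default instead of raising.
missing = next(iter_paths(tree, 'Saxony'), None)
```

This still walks the branches one by one, so it does not change the cost of the search; it only changes the shape of the result.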
You're probably right. I think I was getting ahead of myself wondering about possible issues with deeply nested vocabularies. Any real improvement would probably involve a node index. I notice that the examples do not allow for departments in regions to have the same name as the region (common enough in some countries) so you could simply add the index keyed by node value with the path as the value when the class is created. This should be possible by calling _getPathToTreeNode during one of the passes through _flattenTree. getTermPath would then just need to do a lookup on this.
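A rough sketch of such an index, assuming a plain nested dict keyed by unique string values (build_path_index is a hypothetical helper, not an existing zope.schema function, and raising ValueError on duplicates is my own assumption):

```python
def build_path_index(tree, path=(), index=None):
    """Map every node value in a nested dict to its path from the root."""
    if index is None:
        index = {}
    for key, subtree in tree.items():
        node_path = path + (key,)
        if key in index:
            # Assumption: node values must be unique across the whole tree.
            raise ValueError('duplicate node value: %r' % (key,))
        index[key] = list(node_path)
        build_path_index(subtree, node_path, index)
    return index


# Sample data only.
tree = {'Regions': {'Germany': {'Bavaria': {}}, 'Austria': {'Tyrol': {}}}}
index = build_path_index(tree)

# Looking up a path is now a single dict access instead of a tree walk.
bavaria_path = index['Bavaria']
```

The index is built once in a single pass, so each later path lookup no longer depends on the depth or width of the tree.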
I also don't see a need for createTerm in this particular class and the subsequent awkward call from createTree. As it stands it is copy & paste both in method and test.
The reason for the "createTerm" method in the SimpleVocabulary is to allow people to override it to provide support for other term types. This also applies to the TreeVocabulary.
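To illustrate the kind of override meant here, a minimal sketch with stand-in classes (Term, TitledTerm, Vocabulary and TitledVocabulary are hypothetical and deliberately not the zope.schema classes; only the createTerm hook pattern is the point):

```python
class Term(object):
    """Stand-in for a simple term with value, token and title."""

    def __init__(self, value, token=None, title=None):
        self.value = value
        self.token = str(value) if token is None else token
        self.title = title


class TitledTerm(Term):
    """A custom term type that derives a title when none is given."""

    def __init__(self, value, token=None, title=None):
        Term.__init__(self, value, token, title)
        if self.title is None:
            self.title = str(value).capitalize()


class Vocabulary(object):
    def __init__(self, terms):
        self.terms = list(terms)

    @classmethod
    def createTerm(cls, *args):
        # The hook: subclasses override this to supply other term types.
        return Term(*args)

    @classmethod
    def fromValues(cls, values):
        # Going through cls.createTerm lets the override take effect.
        return cls([cls.createTerm(value) for value in values])


class TitledVocabulary(Vocabulary):
    @classmethod
    def createTerm(cls, *args):
        return TitledTerm(*args)


plain = Vocabulary.fromValues(['bavaria'])
titled = TitledVocabulary.fromValues(['bavaria'])
```

Calling the term class directly inside fromValues would be shorter, but then a subclass would have to override the factory method as well to change the term type.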
I think this is an example of excessive pluggability. createTerm isn't in an interface and I've never come across the need to overwrite it. I would just drop it from the implementation. Problem solved. fromDict() seems to be the only class method required.
If you must have it with the same implementation,

    createTree = SimpleVocabulary.createTree

does the job just fine.

I guess you mean "createTerm"? I'm not convinced that this is better. This creates a dependency on a method from another class that's not being subclassed.
What if createTerm is changed in the SimpleVocabulary in such a way that doesn't take the TreeVocabulary into account?
That would be the case with any subclass. In terms of maintenance it would be easier to spot in case the changed behaviour was desirable.
But I don't see the advantage of cls.createTerm(*args) over SimpleTerm(*args).

See above. "createTerm" is there to let developers override it and provide their own term objects.
Do you have a concrete use case for this? Remember that createTerm is a convenience method only. Frankly, I don't see the need for it in what is a fairly specialised class.
As I said, this is bike-shedding, but I think our source code should be written with a view to being read and probably copied verbatim. With that in mind I prefer readability and testability over integration.

So why then can't I copy "createTerm" from SimpleVocabulary?

For exactly that reason: the fact that someone writing an application might copy & paste your code should be reason enough to make the code as clean as possible, and that does mean DRY.

In the end it tends to make things easier to use. The exceptions, where refactoring produces slightly uglier code but with significantly better performance, hopefully prove the rule.
Well, your suggestions concerning the nested methods had me thinking and in the end resulted in significant refactoring. I think it was worth it though, so thanks.
I've enclosed a diff with proposed changes.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226