[Zope3-Users] Re: ZCML, practicality, purity (was "Excellent perspective...")

Thu Dec 22 18:48:56 EST 2005

On 12/22/05, Shane Hathaway <shane at hathawaymix.org> wrote:
> [Shane]
> >>Are you sure ZCML is The Right Way?  I know its purpose (since I helped
> >>invent Zope 3): to combine configurations by multiple developers without
> >>imposing a particular workflow.  However, I maintain that Python code could
> >>do the job better.  The Python code I have in mind is not the same as
> >>Jeffrey's examples.  I'll elaborate if there's interest.
>
> [Pete Taylor]
>  > *ping* for interest in elaboration on the code you have in mind...
>
> Ok.  Here are two snippets that express the same thing.  First in ZCML:
>
>    <browser:skin
>        name="Rotterdam"
>        interface="zope.app.rotterdam.Rotterdam" />
>
>    <browser:resource
>        name="zope3.css"
>        file="zope3.css"
>        layer="zope.app.rotterdam.rotterdam" />
>
> Now in Python (hypothetical):
>
>      from zope.app import rotterdam
>
>      def configure(context):
>          context.browser.skin(
>              name='Rotterdam',
>              interface=rotterdam.Rotterdam)
>          context.browser.resource(
>              name='zope3.css',
>              file='zope3.css',
>              layer=rotterdam.rotterdam)
>
> All functionality and capabilities of ZCML are retained, but there are
> important, subtle differences.
>
> - I could conceivably type configuration directives at the interactive
> Python prompt.  I could use the standard dir() and help() functions to
> find out what directives exist and how to use them.

This reminds me a lot of the initialize(context) in Zope 2 products. I
still have very mixed feelings about that system, most of it due to
lack of knowledge or understanding on my part of how to write a good
'initialize()' function, and what should take place in there
(configuration) versus elsewhere in that __init__.py module.

One thing that I like about ZCML is control over order: I decide what
order things are configured and initialized in, regardless of
alphabetical order. So if I have a lot of packages that
depend on 'hurry.file' being configured, I can ensure that it happens
before my files. This was a problem that I had with Formulator in Zope
2. If I had my own widgets defined in my own product, the behavior of
registration was a little bit different depending on whether my
Product's name came before or after Formulator alphabetically.
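
A rough sketch of what I mean by explicit ordering, in plain Python. The
include() function, the package table and the 'aaa_widgets' name are all
made up for illustration; this is not the ZCML machinery, just the idea
that a declared requirement wins over alphabetical order:

```python
def include(name, packages, done, log):
    """Configure `name` after everything it requires, at most once."""
    if name in done:
        return
    done.add(name)  # mark first so circular requirements cannot recurse forever
    requires, configure = packages[name]
    for dependency in requires:
        include(dependency, packages, done, log)
    configure(log)


# Hypothetical table: package name -> (required packages, configure callable).
packages = {
    "aaa_widgets": (["hurry.file"], lambda log: log.append("aaa_widgets")),
    "hurry.file": ([], lambda log: log.append("hurry.file")),
}

log = []
done = set()
for name in sorted(packages):  # alphabetical walk, as in the Formulator story
    include(name, packages, done, log)
```

Even though 'aaa_widgets' sorts first, 'hurry.file' ends up configured
before it because the requirement is stated explicitly.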

I also _really_ like how well documented ZCML is - especially in
comparison to the rest of Zope 3. APIDOC captures a lot of things, but
the ZCML menu is by far the most useful and usable. There are still
too many unknown interfaces and APIs inside of Zope. It's no fault of
APIDOC. There are a lot of interfaces out there, and sorting and
presenting them in a cogent way is difficult.

ZCML, on the other hand, is a limited vocabulary, and in a tool like
APIDOC I've found it to be more discoverable. That is EXTREMELY
important to me. dir() and help() at the Python prompt are nice. But
so is schema-driven apidoc reference material - especially if that
schema is the same as (or close to) what's used to validate and convert
incoming data. People don't always spell out parameters in doc strings
clearly. Interface schema fields, on the other hand, give a great view
of what's required, what's not, and what format it should be in.
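
To illustrate why a schema helps, here is a minimal sketch in which one
declaration drives both the reference documentation and the validation.
Field, RESOURCE_SCHEMA, validate() and describe() are hypothetical names
for this example only; they are not the zope.schema API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    kind: type
    required: bool = True
    doc: str = ""


# Hypothetical schema for a resource-like directive.
RESOURCE_SCHEMA = (
    Field("name", str, doc="Name the resource is published under."),
    Field("file", str, doc="Path of the file to serve."),
    Field("layer", str, required=False, doc="Layer the resource belongs to."),
)


def validate(schema, values):
    """Check `values` against the schema and return the accepted items."""
    unknown = set(values) - {field.name for field in schema}
    if unknown:
        raise TypeError(f"unknown arguments: {sorted(unknown)}")
    accepted = {}
    for field in schema:
        if field.name not in values:
            if field.required:
                raise TypeError(f"missing required argument: {field.name}")
            continue
        if not isinstance(values[field.name], field.kind):
            raise TypeError(f"{field.name} must be {field.kind.__name__}")
        accepted[field.name] = values[field.name]
    return accepted


def describe(schema):
    """Render the same schema as reference documentation."""
    return "\n".join(
        f"{f.name} ({f.kind.__name__}, "
        f"{'required' if f.required else 'optional'}): {f.doc}"
        for f in schema
    )
```

The point is that the documentation cannot drift away from what is
actually enforced, because both come from the same field definitions.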

Other problems that I've had with Python based configuration in the
past have involved not knowing when to do the configuration. Do I want
to register the class I just defined in the module code?

class Foo(...):
    pass
registerAsFooable(Foo)

Well, that tends to lead to problems, since nobody can use Foo
without also carrying along its registration, which may not be wanted
or required by a client of Foo.
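
A minimal sketch of the alternative, keeping the registration out of the
module body. The configure() hook and the dictionary registry are
stand-ins invented for this example, not an existing API:

```python
class Foo:
    """Plain component code: defining or importing it registers nothing."""


def configure(registry):
    """Hypothetical hook that only the configuration machinery would call."""
    registry.setdefault("fooable", []).append(Foo)


registry = {}                 # stand-in for a real component registry
instance = Foo()              # clients can use Foo with no registration at all
untouched = dict(registry)    # still empty at this point
configure(registry)           # registration happens only when asked for
```

A client that just wants Foo never has to trigger configure().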

These are the sorts of things that one has to define and restrict
early on, because once a bad example gets out there in the wild,
other people start copying it.

Webware had 'configure.py' files and I *HATED* seeing them. They
looked sloppy and weird, stuffing things into dictionaries whose
structures I didn't know or understand. But if that were a route one
decided to use, one would have to lay down VERY strict rules.
Otherwise we lose all the benefits of the Component Architecture and
start heading back into a free-for-all mess.

Such rules might be: no other package or module code should import a
configuration module. Only the configuration machinery should ever do
it. With Zope 2, I never was able to write and keep a good personal
rule about how to deal with a Product's __init__.py module, and it
would end up doing things besides configuration. Modules and
objects that were imported in the top of the __init__.py module
probably should have been imported inside initialize(context).

Only and only and only and only 'configuration' logic should be in a
configuration module.

The module HAS to provide and only provide an
IComponentArchitectureConfiguration interface, and should export only
the names in that interface. Alarms and warnings should go off
otherwise.
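
As a sketch of how such a rule could be enforced, assume the contract is
simply "a configuration module exports exactly one name, configure". The
check below uses __all__ for that; the function name and the contract
are assumptions for illustration, and a real system would more likely
verify a declared interface:

```python
import types

ALLOWED_EXPORTS = {"configure"}  # hypothetical contract for configuration modules


def check_configuration_module(module):
    """Reject a configuration module that exports more than the contract allows."""
    exported = set(getattr(module, "__all__", ()))
    if exported != ALLOWED_EXPORTS:
        raise ValueError(
            f"{module.__name__} must export exactly {sorted(ALLOWED_EXPORTS)}, "
            f"got {sorted(exported)}"
        )
    if not callable(getattr(module, "configure", None)):
        raise ValueError(f"{module.__name__}.configure must be callable")
    return module.configure


# Two throwaway in-memory modules to exercise the check.
good = types.ModuleType("good_configure")
good.__all__ = ["configure"]
good.configure = lambda context: context.append("configured")

bad = types.ModuleType("bad_configure")
bad.__all__ = ["configure", "SomeHelper"]
bad.configure = lambda context: None
```

Here check_configuration_module(good) hands back the configure callable,
while check_configuration_module(bad) raises ValueError because of the
extra export.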

I just believe - heavily - after many of my Zope 2 experiences that
configuration as done by ZCML should be as separate from the code
itself as possible. If it's going to be in the same programming
language, it needs to be made clear what it is, what can be done, and
what can NOT be done.

I'm all for ZCML doing less. There's too much magic, in my opinion,
that goes on in the browser package. I remember trying to figure out
why I couldn't easily supply a replacement template for my editforms
(I had a template in a common package that I wanted many other
components to use - yet despite a long-ago wish for a way to refer to
such templates relative to their package paths, I don't think it's
ever gone in). Even if I supplied a special 'class' in the editform
directive and had that class override the template attribute used in
the base editform class, the ZCML directive handling code still
overrode it. It took me a long time to even understand what the
directive code was doing and how it was doing it.

But I still like the configuration being very separate from the
'component code'. I know, without a doubt, that just importing a
Python package or module is not going to suddenly register or
re-register components. ZCML based configuration keeps things
separate, and it keeps the likelihood of unwanted side effects down to
zero or near zero.

> - If I want to register a lot of similar things, in ZCML I have to
> either repeat myself, leading to poor maintainability, or create new
> directives, leading to directive proliferation.  In Python I can use
> variables, loops, functions, etc., reusing skills I already know.

Programmatic configuration isn't always the most readable thing
either. I understand very little about Makefiles anymore. It's been a
long time since I've used them heavily. Yet I still find many of them
easier to understand than many setup.py distutils files (most setup.py
files are fine. But some start pulling off too many crazy tricks and
even though I'm extremely comfortable with Python, I walk away from
those in fear).

> - If I want to debug a registration, I can use pdb or any other Python
> debugging tool.
>
> - Code snippets can include both the code and the default configuration
> (yet users are not forced to use the configuration), making code samples
> clearer.

And that's where the danger lies: unwanted configuration can
proliferate, by intent or by accident. I've lived through that
problem too many times in the past when trying to write lightweight
sortof-component-architecture-like systems for Zope 2.

I admit that I was also having to invent everything on the fly and
under a deadline, so the whole 'register a thing right after declaring
it' option often seemed easiest at the moment... Until I added a new
module to the system and it got loaded automatically before the others
and I couldn't figure out why certain registrations were disappearing
and so on.
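
One way to avoid silently disappearing registrations is to collect them
first and refuse to apply anything if two of them claim the same slot.
The sketch below is a simplified illustration of that idea with
invented names (ConfigContext, ConflictError); it is not the
zope.configuration implementation:

```python
class ConflictError(Exception):
    """Two registrations claimed the same slot."""


class ConfigContext:
    """Collects registrations first, applies them only if none collide."""

    def __init__(self):
        self._actions = []

    def action(self, discriminator, callback, *args):
        # The discriminator identifies the slot being claimed.
        self._actions.append((discriminator, callback, args))

    def execute(self):
        seen = set()
        for discriminator, _callback, _args in self._actions:
            if discriminator in seen:
                raise ConflictError(f"duplicate registration: {discriminator!r}")
            seen.add(discriminator)
        for _discriminator, callback, args in self._actions:
            callback(*args)


widgets = {}
context = ConfigContext()
context.action(("widget", "date"), widgets.__setitem__, "date", "FormulatorDateWidget")
context.action(("widget", "date"), widgets.__setitem__, "date", "MyDateWidget")
```

Calling context.execute() here raises ConflictError and leaves widgets
empty, instead of letting whichever module loaded last win.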

Side effects. I don't want them.

> Those are the technical arguments.  There is also the marketing argument
> that a lot of the target audience has been burned by XML, but I don't
> think that's the right basis for making a decision.  I sincerely believe
> Python code would be better than XML for the technical reasons I listed
> above.

I think that is a valid marketing argument. And I say, again, that I'm
not the biggest ZCML fan. But I've been burned _way_ too many times by
Python based configuration. For a small system (i.e., one not running
the full zope.app package), ZCML is overkill. But for even moderately
sized systems, it's been a blessing having them separate. How our
components get registered and configured is reliable and predictable.

I think you could achieve this with Python, but you'd need to document
the hell out of it and put in a smart and restrictive system that
would ensure that sloppy configuration didn't happen, that the
'configure.py' module was unused by anything but the configuration
system, and that the package it was in could be loaded as a Python
package/module with the 'configure.py' module (which would have the
most requirements on the configuration of the outside system) being
ignored, or at least not "blowing up" when trying to register against
something not defined.

At that point, you're putting a lot of restrictions on Python that may
also be uncomfortable to people. But I think (just based on personal
experience) that that's the only way to make it work.

Also, I'd like to point out that Python is not the best language for
housing a lot of complex configuration data. Perl and Ruby tend to be
a lot easier to use here, as those languages kind of make it easier to
write fake little mini-languages since they're a lot more free-form
syntactically. I'm sure there are others here that remember
maintaining nested tuples of tuples of tuples for Zope things like
__ac_permissions__. Those were not always easy to read or maintain.

ZCML has the benefits of using zope.schema to define and format the
fields used in configuration and turn them into meaningful objects.
The ability to use local dotted names: <foo for=".interfaces.IPony">
<far class="..zoo.Butterstick">. The ability to see those things
_clearly_ defined in APIDOC because of their schema-ness. If I need to
know whether I can pass only one interface reference in to a
particular ZCML attribute or if I can pass in multiple, I can actually
get that information and it's based on the schema, but still pass it
in as just a string in XML. How would that work in Python? Would it be
the responsibility of the 'configure.py' maintainer to use tools to
expand things? Would the configuration machinery used try to be smart
about what's passed in to allow for both objects and dotted names:

from interfaces import IThis, IThat
configure.registerView(for_=IThis, adapts=(IThat, '..interfaces.IOther'))

Would that be allowed?
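
For what it's worth, accepting both forms is not hard to sketch. The
resolve() and register_view() functions below are hypothetical and use
only importlib from the standard library; the leading-dot rule (one dot
for the current package, one more for each level up) is my reading of
how the ZCML examples above behave:

```python
import importlib


def resolve(ref, package=None):
    """Return `ref` itself if it is an object, else look up the dotted name."""
    if not isinstance(ref, str):
        return ref
    level = len(ref) - len(ref.lstrip("."))  # number of leading dots
    module_path, _, attr = ref[level:].rpartition(".")
    if level:
        if package is None:
            raise ValueError(f"relative name {ref!r} needs a package")
        parts = package.split(".")
        if level > len(parts):
            raise ValueError(f"{ref!r} reaches above the top of {package!r}")
        # One dot means "this package"; each extra dot climbs one level.
        base = parts[: len(parts) - (level - 1)]
        module_name = ".".join(base + ([module_path] if module_path else []))
    else:
        if not module_path:
            raise ValueError(f"{ref!r} is not a dotted name")
        module_name = module_path
    return getattr(importlib.import_module(module_name), attr)


def register_view(for_, adapts, package=None):
    """Hypothetical directive accepting objects and dotted names alike."""
    return resolve(for_, package), tuple(resolve(a, package) for a in adapts)
```

With the standard library standing in for real packages,
resolve('.decoder.JSONDecoder', package='json') gives the JSONDecoder
class, and an object passed in directly comes back unchanged.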

Configuration's not easy when wanting to allow a big system to all be
used together via the sort of loose coupling that Zope 3 allows.
Staying within the native programming language has benefits, but also
may be a dangerous path to shortcuts that then cut down on the sort of
re-use and extendible / replaceable connection points that I've been
able to benefit from over the past few months of Zope 3.x based
development.

I'm not saying this is impossible to do in Python. These are just the
risks and issues that I've been thinking of, and have thought of for
the past couple of years. We've developed a custom framework for
dealing with data flow and transformation for a couple of 'enterprise'
level customers, and XML is used heavily to wire all of these
transforms together. The XML files can get HUGE and are not always easy to
maintain. But the job that they perform is so wildly different than
what the Python application and component code is doing that, well, it
made sense to separate them. (Plus we had the special requirement to
be able to reload that configuration file and all that it specifies.
Clearing out a big registry and then filling it up again based on
those configuration settings is a lot easier than refreshing a big
system of Python code).

Let people have their Rails. Let them grow their applications. Let
them see what happens when it gets big and the whole "programming by
convention" thing starts to break down because conventions get
forgotten or you put a text file in one of their magic directories
that causes the whole system to blow up because it shares the name
with some forgotten convention that Rails (or its clones) looks for.
We've been through that. I think that the ZCML "situation" could be
improved with:

* simpler use - let Python code say what it adapts and implements. Let
Python code subclass from BrowserView. Use ZCML to just register and
name the object. Promote this in documentation, advocacy articles, and
so on.

* alternate syntax? Not Python, but maybe something python-"ish" but
geared towards entering the kind of data references that one has to
type a lot in configuration.

* cut down on the magic, like dynamic class creation. This was a
frustrating surprise when I first encountered it a couple of months
ago.

* for many of the core ZCML configuration directives, explain their
Python alternative. Not to promote its use when writing large systems,
shared toolkits or frameworks, but to show how to test or just to use
adapters and utilities in small applications that don't require the
full Zope toolkit.

I mentioned that I liked the ZCML documentation. It's great for
finding out what directives are available and what their options are.
But it's still pretty poor at explaining what is really going on
behind the scenes. That may be more advanced documentation for some
cases - but it could cut down on some users' frustration and surprises.

OK. This has been long and rambling. I blame the Christmas lunch cocktails. :)

--
Jeff Shell