[Zope-dev] why external version indexes don't fulfill all use cases for development

Stephan Richter srichter at cosmos.phy.tufts.edu
Sun Nov 11 18:02:53 EST 2007


On Sunday 11 November 2007, Martijn Faassen wrote:
> What KGS solves is that it allows the ongoing development and testing of
> an integrated Zope 3.  That is, there's a Zope 3 'trunk' of versions
> that keeps being updated as there are bugfix releases. I'm not sure what
> happens as soon as someone wants to make a new feature release of any
> package. Presumably they end up in KGS too.

Absolutely not! Like Linux distributions, there will be a KGS for every Zope 3 
release. I have already requested a new directory called "zope-dev" where new 
feature releases can be tested.

> After all, we won't have a 
> single Zope 3.4 and then a single Zope 3.5 for which we can create a new
> KGS.

Yes, we will. Why do you think the current KGS is called "zope3.4"? If you 
want to have a different working set, then you are free to create one, but 
don't expect much support from the community when things are not working as 
expected.

> We intend to let packages move at different feature-release speeds, 
> and we can't have a KGS for each package.

You do not need to have a single KGS for every package. But believing that we 
can just randomly make new feature releases that work with the rest of the 
world is naive at best. We have already seen what happens if everyone uses
their own set of versions and packages.

A development KGS will be used to test new feature releases.

> What KGS doesn't have is history.

Yes, it does. Why do you think I manage the "controlled-packages.cfg" file in 
SVN? And in SVN, I do not create branches and tags without a reason.

> When I release an application or 
> framework and I used KGS to make sure that all my versions were correct,
> it will work on the day of release. As soon as enough bugfix (or
> feature) releases make it to KGS, something will inevitably break. We've
> seen innocuous changes breaking code a lot of times, so we can't pretend
> that never happens. It *will* happen.

I agree. Have you read the discussion we had yesterday on the zope-dev mailing 
list? We discussed the problem and possible solutions already.

Here are a couple of choices we have to avoid the problem:

1. During development I would recommend using the index of the latest stable
release; or if you are brave, you can use the development KGS. (Of course, 
you can also use the versions block of a particular release, though you will 
miss out on bug fixes, which I think is less optimal.)

2. Once release/deployment time comes around, you lock the versions. There is 
a wide range of possibilities:

  (a) You download the "versions.cfg" of the KGS at this time and maintain it
      in your deployment code.

  (b) You point to a particular release's "versions.cfg", for  
      example "versions-3.4.0.cfg". (I will start producing those starting 
      with the next Zope 3.4 release. Maybe I should create a versioned file 
      for "controlled-packages.cfg" as well?!)

  (c) You use a particular SVN revision, download `zope.release` yourself and 
      generate the "versions.cfg" file, which is trivial. I already create  
      tags for releases there.

  I probably would prefer option (b).
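
Generating a pinned "versions.cfg" from the controlled list, as in option (c),
could look roughly like the sketch below. It assumes each package is a section
of "controlled-packages.cfg" with a whitespace-separated `versions` option
whose last entry is the one to pin. That layout, the sample data, and the
helper name `make_versions_section` are assumptions for illustration, not the
actual `zope.release` code.

```python
import configparser

# Hypothetical sample; the real controlled-packages.cfg layout may differ.
SAMPLE = """\
[zope.interface]
versions = 3.4.0
    3.4.1

[zope.component]
versions = 3.4.0
"""


def make_versions_section(controlled_text):
    """Return a buildout-style [versions] block pinning one version per package."""
    parser = configparser.ConfigParser()
    parser.read_string(controlled_text)
    lines = ["[versions]"]
    for package in sorted(parser.sections()):
        listed = parser.get(package, "versions").split()
        # Assumption: the last listed version is the newest approved one.
        lines.append(f"{package} = {listed[-1]}")
    return "\n".join(lines) + "\n"


print(make_versions_section(SAMPLE))
```

The resulting text can be stored next to the deployment code, which is what
makes the locked set reproducible later.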

> This breaks a fundamental assumption for releases. When I release
> something, I expect it to work tomorrow, next month, and next year.

I agree. The KGS should be seen as a branch. Particular versions 
of "versions.cfg" and maybe "controlled-packages.cfg" should be considered 
releases.

> With code, we know that history, and branches, and so on, are important.
> We use Subversion. With KGS we only have an ongoing trunk.

No, as I said before, the KGS specification, which 
is "controlled-packages.cfg", is maintained in SVN as well.

> With Grok, we use an external versions list. We can use this to solve
> the above problem. We basically take snapshots of what is in KGS. This
> allows us to maintain some history, though it isn't ideal either, as
> it's quite a bit of overhead.

How is this overhead?

> If I build an application or framework on top of Grok, I will need to
> maintain yet another external list for the extra packages of this
> application, fixing those versions.

Why? I don't follow that.

> We could probably even use the 
> extends feature of buildout to have this list point at Grok's list so we
> have to repeat ourselves less should we want to build something on top
> of *that* application or framework again.

I don't understand what you are saying. However, I'll note that the KGS is 
also extendable. For example, Grok can maintain its 
own "controlled-packages.cfg" that extends a particular Zope 
3 "controlled-packages.cfg". Extending also means that you have the choice of 
overwriting a particular version requirement. (I have implemented this after 
the discussion yesterday.) Having a "controlled-packages.cfg" does *not* mean 
you need an index. This file can be used to generate a "versions.cfg" file or 
just a `[versions]` section for buildout.
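
Extending a KGS with overrides can be pictured as a layered dictionary merge.
This is only a sketch, assuming the version pins are already loaded into plain
dictionaries; the function `extend_working_set`, the package names, and the
version numbers are hypothetical, and this is not how `zope.release` itself
implements extension.

```python
def extend_working_set(base, overrides):
    """Return a new working set: base pins, with overrides winning on conflict."""
    merged = dict(base)        # copy, so the base set stays untouched
    merged.update(overrides)   # the extending list may overwrite a pin
    return merged


# Hypothetical pins, for illustration only.
zope34 = {"zope.interface": "3.4.1", "zope.publication": "3.4.0"}
grok = extend_working_set(zope34, {"zope.publication": "3.4.2", "grok": "0.11"})
my_app = extend_working_set(grok, {"my.extension": "1.0"})

for name in sorted(my_app):
    print(f"{name} = {my_app[name]}")
```

Each layer only states what it adds or changes, so an application built on
Grok does not have to repeat the Zope 3 pins.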

> So, while annoying, that is somewhat manageable. Now imagine I want to
> use a completely separate Python library with my Grok application. This
> python library has dependencies itself again. This means I will need to
> know about versions of those dependencies as well, and fix them into my
> application's list.

Yes. I see this as an advantage. Version specifications in `setup.py` usually 
contain ranges of allowed versions. What happens if one release in the range 
does not work? Then you make false promises. The only way to avoid this would 
be by specifying all allowed versions exactly, which makes no sense.

> There are some fundamental problems with external lists or indexes:
>
> * we need to know about the dependency of dependencies, even if we never
> use them directly. Information hiding is broken.

This is a requirement, not a problem statement. I don't understand "Information 
hiding is broken." I cannot see how "information hiding" could be a good 
thing here.

> * a single list will never do it. We intend to have many different
> applications that may depend on different versions of packages. Grok may
> need a newer zope.publication than your application does. A Grok
> extension may need an even newer version than Grok does. We'll be baking
> endless amounts of lists this way.

I never claimed that we will have just one list.

Your assumption is that there will be an endless number of working sets and
that they are easy to find. While theoretically possible, this will not
happen. I think the events of the past two months have shown the opposite:
it is actually hard to find working sets once development diverges too
much.

It is pretty easy to maintain a list for a particular application, especially
now that base lists such as zope3.4 and grok are already available.

> If this information is inside the packages itself, the history will be
> automatically maintained with Subversion and existing releases. History
> therefore works: if I install Grok 0.11, I would get all dependencies of
> Grok 0.11 automatically without having to worry about external indexes.

This is the overly simplistic world view that we had about two months ago. 
Because we are all developing on top of different stacks and people expect to 
pick and choose, having a common foundation is the only possible way.

Let me give you a dire scenario.

Let's say you have a package A-1.0.0. You also have a package B-1.0.0 that 
depends on A. You suggest fixing versions, so you would write in the dep list 
of B: 'A >= 1.0.0'. You now release a new feature version of A, A-1.1.0, that 
is incompatible with B-1.0.0. So package B-1.0.0 will be broken until you 
release B-1.0.1 that states 'A >= 1.0.0 and A <= 1.0.99'. 
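
The scenario can be replayed in a few lines. This is a sketch under stated
assumptions: the first requirement is read as 'A >= 1.0.0', the installer is
modeled as "pick the newest release that satisfies the requirement", and the
helpers `parse` and `pick` are hypothetical stand-ins, not setuptools APIs.

```python
def parse(version):
    """Turn '1.0.99' into (1, 0, 99) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick(available, minimum, maximum=None):
    """Pick the newest available version within [minimum, maximum]."""
    candidates = [
        v for v in available
        if parse(v) >= parse(minimum)
        and (maximum is None or parse(v) <= parse(maximum))
    ]
    return max(candidates, key=parse) if candidates else None


releases_of_a = ["1.0.0"]
# B-1.0.0 declares 'A >= 1.0.0'; this is fine on release day.
before = pick(releases_of_a, "1.0.0")

releases_of_a.append("1.1.0")  # the incompatible feature release of A
# Same B-1.0.0, same requirement, but it now resolves to the breaking release.
after = pick(releases_of_a, "1.0.0")
# B-1.0.1 adds the upper bound 'A <= 1.0.99'.
fixed = pick(releases_of_a, "1.0.0", "1.0.99")

print(before, after, fixed)
```

Nothing in B-1.0.0 changed between the first and second call; only the set of
available releases did.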

The problem here is that you have to re-release B only for the sake of
changing the version requirement. This is not so bad if you have one known
package to do this with. But in Zope's case we often have 20, 30, 60, or even 80
packages that now have to be re-released. All this work for one update. I can 
tell you that releasing this many packages is a very tedious job and very 
error-prone. I have just spent 3-4 man-weeks releasing about 120 packages, 
some multiple times.

But this is not your biggest problem. You cannot even assume that you can know 
the full list of packages that need updating, because they are not public or 
you are not aware of them.

And the worst part of it all is that everyone will be blocked until the new
releases are all out _[1], unless they do some internal version nailing for
their application. Which brings us back to a KGS. Now, you could have simply
skipped specifying versions in 'setup.py' and gotten the same effect.

Specifying versions in the release makes you believe you are safe. But what
you are really doing is imposing one underspecified working set onto
everyone. But this working set is not universal; in fact it is very small,
since it does not consider a larger dependency network. The Zope 3 KGS
attempts to be a universal set for Zope 3, which other projects can build
upon.

.. [1] This actually happened during the FoilageSprint. Remember all the
outrage?

> Information hiding works: if I use foo 1.3 and foo 1.3 knows it needs
> bar 1.7, it'll simply get that and I don't have to know about it. I
> don't even need to worry about the *existence* of bar.

With KGS, you do not need to know about it either. One assumption you make is 
that bar 1.7 can be found in the default index (PyPI) or in any of the 
dependency links locations. In the KGS we simply change this to use a 
different index with the promise that only versions of bar are available that 
actually work, which might be bar 1.6, bar 1.7, or bar 1.8.
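
The idea of a restricted index can be sketched as a filter over a full release
listing. This is only an illustration: the dictionaries, the sample versions,
and `restrict_index` are hypothetical, and dropping packages that are not on
the controlled list is an assumption about how such an index would behave.

```python
def restrict_index(all_releases, controlled):
    """Keep only the packages and versions that the controlled list approves."""
    return {
        package: [v for v in versions if v in controlled[package]]
        for package, versions in all_releases.items()
        if package in controlled
    }


# Hypothetical data: everything published vs. what is known to work together.
full_index = {
    "bar": ["1.5", "1.6", "1.7", "1.8", "2.0"],
    "foo": ["1.2", "1.3"],
    "baz": ["0.1"],
}
controlled = {"bar": {"1.6", "1.7", "1.8"}, "foo": {"1.3"}}

kgs_index = restrict_index(full_index, controlled)
print(kgs_index)
```

An installer pointed at the restricted listing can still pick "the newest
bar", but only from releases that are part of the working set.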

> People have been saying that since Linux distributions use external
> indexes, we should too, as we are dealing with the same problem as Linux
> distributions.

It is true that our problem is very much like that of a Linux distribution,
at least in my mind.

However, Jim and I did not choose the index approach because of Linux. We 
simply thought about the easiest way to create a known-working-set that would 
require the least amount of software to be written. So Jim had the idea of
simply limiting the number of available versions of a package and noted that
it would be fairly simple (now about 50 lines of code) to extend his ppix
tool to do that. Actually, the KGS is really only 
the "controlled-packages.cfg" file and the `http://download.zope.org/zope3.4` 
index is just one way to use the KGS. Other usages include the "versions" 
section generator, the test buildout generator, and the Zope 3 tree updater.

> While the problem is similar, I think the nature of 
> development makes our problems, and therefore our solutions, quite
> different from the way distributions do it.
>
> How are we different?

That's a good question. I think you forgot the most important one.

Manpower! In a Linux distribution, there are many contributors to just the 
release process. In the Python/Zope world, the developer is the release 
person too. Thus Linux distributions have the luxury that they can create a 
new release for every package for every Linux distribution release.

One promise Jim and others made to me about eggs was that once we had a stable 
set (which I worked on for the current KGS), releasing would become much 
simpler, because many packages would not need new releases for very long 
stretches at a time. With your suggestion, packages will need to be released 
all the time.

> We have many, many different small distributions (package +
> dependencies) that can be combined. We have such a small distribution
> for each application. We have such a small distribution for each
> extension. Not just that. We have such a small distribution for each
> *release* of an application. We have such a small distribution for each
> *release* of an extension.

I think your proposal makes this a problem, whereas the KGS provides a
solution; of course, in combination with the versions and find-links options.

> I therefore still believe that version dependency information should
> move out of external indexes and into packages.

- 1 googol

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

