[Zope-dev] why external version indexes don't fulfill all use
cases for development
Stephan Richter
srichter at cosmos.phy.tufts.edu
Sun Nov 11 18:02:53 EST 2007
On Sunday 11 November 2007, Martijn Faassen wrote:
> What KGS solves is that it allows the ongoing development and testing of
> an integrated Zope 3. That is, there's a Zope 3 'trunk' of versions
> that keeps being updated as there are bugfix releases. I'm not sure what
> happens as soon as someone wants to make a new feature release of any
> package. Presumably they end up in KGS too.
Absolutely not! Like Linux distributions, there will be a KGS for every Zope 3
release. I have already requested a new directory called "zope-dev" where new
feature releases can be tested.
> After all, we won't have a
> single Zope 3.4 and then a single Zope 3.5 for which we can create a new
> KGS.
Yes, we will. Why do you think the current KGS is called "zope3.4"? If you
want to have a different working set, then you are free to create one, but
don't expect much support from the community when things are not working as
expected.
> We intend to let packages move at different feature-release speeds,
> and we can't have a KGS for each package.
You do not need to have a single KGS for every package. But believing that we
can just randomly make new feature releases that work with the rest of the
world is naive at best. We have already seen what happens if everyone uses
their own set of versions and packages.
A development KGS will be used to test new feature releases.
> What KGS doesn't have is history.
Yes, it does. Why do you think I manage the "controlled-packages.cfg" file in
SVN? And in SVN, I do not create branches and tags without a reason.
> When I release an application or
> framework and I used KGS to make sure that all my versions were correct,
> it will work on the day of release. As soon as enough bugfix (or
> feature) releases make it to KGS, something will inevitably break. We've
> seen innocuous changes breaking code a lot of times, so we can't pretend
> that never happens. It *will* happen.
I agree. Have you read the discussion we had yesterday on the zope-dev mailing
list? We discussed the problem and possible solutions already.
Here are a couple of choices we have to avoid the problem:
1. During development I would recommend using the index of the latest stable
release; or if you are brave, you can use the development KGS. (Of course,
you can also use the versions block of a particular release, though you will
miss out on bug fixes, which I think is less optimal.)
2. Once release/deployment time comes around, you lock the versions. There is
a wide range of possibilities:
(a) You download the "versions.cfg" of the KGS at this time and maintain it
in your deployment code.
(b) You point to a particular release's "versions.cfg", for
example "versions-3.4.0.cfg". (I will start producing those starting
with the next Zope 3.4 release. Maybe I should create a versioned file
for "controlled-packages.cfg" as well?!)
(c) You use a particular SVN revision, download `zope.release` yourself and
generate the "versions.cfg" file, which is trivial. I already create
tags for releases there.
I probably would prefer option (b).
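Options (a) and (b) both come down to keeping an exact, frozen `[versions]` section next to your deployment code. Here is a minimal sketch of that step, assuming a snapshot in buildout's `[versions]` format; the package versions shown are purely illustrative and the helper names are hypothetical, not part of `zope.release`:

```python
import configparser

# Illustrative snapshot of a KGS "versions.cfg"; the version numbers
# are made up for the example.
SNAPSHOT = """
[versions]
zope.interface = 3.4.0
zope.component = 3.4.0
ZODB3 = 3.8.0b4
"""


def load_pins(text):
    """Return {package: version} from the [versions] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep package names case-sensitive
    parser.read_string(text)
    return dict(parser["versions"])


def buildout_fragment(pins):
    """Render a buildout fragment that nails every version exactly."""
    lines = ["[buildout]", "versions = versions", "", "[versions]"]
    lines += ["%s = %s" % (name, pins[name]) for name in sorted(pins)]
    return "\n".join(lines)


pins = load_pins(SNAPSHOT)
print(buildout_fragment(pins))
```

Once this fragment is checked in with the application, later changes to the KGS no longer affect the deployment.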
> This breaks a fundamental assumption for releases. When I release
> something, I expect it to work tomorrow, next month, and next year.
I agree. The KGS should be seen as a branch. Particular versions
of "versions.cfg" and maybe "controlled-packages.cfg" should be considered
releases.
> With code, we know that history, and branches, and so on, are important.
> We use Subversion. With KGS we only have an ongoing trunk.
No, as I said before, the KGS specification, which
is "controlled-packages.cfg", is maintained in SVN as well.
> With Grok, we use an external versions list. We can use this to solve
> the above problem. We basically take snapshots of what is in KGS. This
> allows us to maintain some history, though it isn't ideal either, as
> it's quite a bit of overhead.
How is this overhead?
> If I build an application or framework on top of Grok, I will need to
> maintain yet another external list for the extra packages of this
> application, fixing those versions.
Why? I don't follow that.
> We could probably even use the
> extends feature of buildout to have this list point at Grok's list so we
> have to repeat ourselves less should we want to build something on top
> of *that* application or framework again.
I don't understand what you are saying. However, I'll note that the KGS is
also extendable. For example, Grok can maintain its
own "controlled-packages.cfg" that extends a particular Zope
3 "controlled-packages.cfg". Extending also means that you have the choice of
overwriting a particular version requirement. (I have implemented this after
the discussion yesterday.) Having a "controlled-packages.cfg" does *not* mean
you need an index. This file can be used to generate a "versions.cfg" file or
just a `[versions]` section for buildout.
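To illustrate extending and overriding, here is a rough sketch. It assumes a file layout with one section per package and a "versions" option listing the allowed releases, and it assumes the generator picks the last listed release; both are assumptions for the example, and the function names are hypothetical rather than the actual `zope.release` code:

```python
import configparser

# Assumed layout: one section per package, "versions" lists allowed releases.
# The package versions below are illustrative only.
ZOPE_BASE = """
[zope.publication]
versions = 3.4.0 3.4.1

[zope.interface]
versions = 3.4.0
"""

GROK_EXTENSION = """
[zope.publication]
versions = 3.4.2

[grok]
versions = 0.11
"""


def parse(text):
    """Return {package: [allowed versions]} for one controlled-packages text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep package names case-sensitive
    parser.read_string(text)
    return {name: parser[name]["versions"].split() for name in parser.sections()}


def extend(base, extension):
    """Entries in the extension replace the base entry for the same package."""
    merged = dict(base)
    merged.update(extension)
    return merged


def versions_section(controlled):
    """Render a [versions] section, pinning the last listed release (assumed)."""
    lines = ["[versions]"]
    for name in sorted(controlled):
        lines.append("%s = %s" % (name, controlled[name][-1]))
    return "\n".join(lines)


merged = extend(parse(ZOPE_BASE), parse(GROK_EXTENSION))
print(versions_section(merged))
```

In this sketch the Grok list only states what differs from the Zope 3 list; everything else is inherited.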
> So, while annoying, that is somewhat manageable. Now imagine I want to
> use a completely separate Python library with my Grok application. This
> python library has dependencies itself again. This means I will need to
> know about versions of those dependencies as well, and fix them into my
> application's list.
Yes. I see this as an advantage. Version specifications in `setup.py` usually
contain ranges of allowed versions. What happens if one release in the range
does not work? Then you make false promises. The only way to avoid this would
be by specifying all allowed versions exactly, which makes no sense.
> There are some fundamental problems with external lists or indexes:
>
> * we need to know about the dependency of dependencies, even if we never
> use them directly. Information hiding is broken.
This is a requirement not a problem statement. I don't understand "Information
hiding is broken." I cannot see how "information hiding" could be a good
thing here.
> * a single list will never do it. We intend to have many different
> applications that may depend on different versions of packages. Grok may
> need a newer zope.publication than your application does. A Grok
> extension may need an even newer version than Grok does. We'll be baking
> endless amounts of lists this way.
I never claimed that we will have just one list.
Your assumption is that there will be an endless number of working sets and
that they are easy to find. While theoretically possible, this will not
happen. I think the events of the past two months have shown that the
opposite happens: it is actually hard to find working sets once
development diverges too much.
It is pretty easy to maintain a list for a particular application, especially
with base lists like zope3.4 and grok already being available.
> If this information is inside the packages itself, the history will be
> automatically maintained with Subversion and existing releases. History
> therefore works: if I install Grok 0.11, I would get all dependencies of
> Grok 0.11 automatically without having to worry about external indexes.
This is the overly simplistic world view that we had about two months ago.
Because we are all developing on top of different stacks and people expect to
pick and choose, having a common foundation is the only possible way.
Let me give you a dire scenario.
Let's say you have a package A-1.0.0. You also have a package B-1.0.0 that
depends on A. You suggest fixing versions, so you would write in the dep list
of B: 'A >= 1.0.0'. You now release a new feature version of A, A-1.1.0, that
is incompatible with B-1.0.0. So package B-1.0.0 will be broken until you
release B-1.0.1 that states 'A >= 1.0.0 and A <= 1.0.99'.
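The scenario can be checked with a few lines of code. This is only a sketch: it reads the first requirement as a plain lower bound on A, compares purely numeric dotted versions as tuples, and does not reproduce how setuptools actually parses requirements. The helper names are hypothetical:

```python
def parse_version(text):
    """Turn '1.0.99' into (1, 0, 99); numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, minimum, maximum=None):
    """Check minimum <= version (and version <= maximum, if a cap is given)."""
    candidate = parse_version(version)
    if candidate < parse_version(minimum):
        return False
    if maximum is not None and candidate > parse_version(maximum):
        return False
    return True


# B-1.0.0 only declares a lower bound, so the incompatible A-1.1.0 is accepted.
accepted_by_b_1_0_0 = satisfies("1.1.0", "1.0.0")

# B-1.0.1 adds the cap 'A <= 1.0.99', which finally excludes A-1.1.0 ...
accepted_by_b_1_0_1 = satisfies("1.1.0", "1.0.0", "1.0.99")

# ... while bugfix releases of A in the 1.0 line are still allowed.
bugfix_still_allowed = satisfies("1.0.5", "1.0.0", "1.0.99")

print(accepted_by_b_1_0_0, accepted_by_b_1_0_1, bugfix_still_allowed)
```

The cap can only be written after A-1.1.0 is known to break B, which is why B has to be re-released.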
The problem here is that you have to re-release B only for the sake of
changing the version requirement. This is not so bad if you have one known
package to do this with. But in Zope's case we often have 20, 30, 60, or even 80
packages that now have to be re-released. All this work for one update. I can
tell you that releasing this many packages is a very tedious job and very
error-prone. I have just spent 3-4 man-weeks releasing about 120 packages,
some multiple times.
But this is not your biggest problem. You cannot even assume that you can know
the full list of packages that need updating, because they are not public or
you are not aware of them.
And the worst part of it all is that everyone will be blocked until the new
releases are all out _[1], unless they do some internal version nailing for
their application. Which brings us back to a KGS. Now, you could have simply
skipped specifying versions in 'setup.py' and gotten the same effect.
Specifying versions in the release makes you believe you are safe. But what
you are really doing is to impose one underspecified working set onto
everyone. But this working set is not universal; in fact it is very small,
since it does not consider a larger dependency network. The Zope 3 KGS
attempts to be a universal set for Zope 3, which other projects can build
upon.
.. [1] This actually happened during the FoilageSprint. Remember all the
outrage?
> Information hiding works: if I use foo 1.3 and foo 1.3 knows it needs
> bar 1.7, it'll simply get that and I don't have to know about it. I
> don't even need to worry about the *existence* of bar.
With KGS, you do not need to know about it either. One assumption you make is
that bar 1.7 can be found in the default index (PyPI) or in any of the
dependency links locations. In the KGS we simply change this to use a
different index with the promise that only versions of bar are available that
actually work, which might be bar 1.6, bar 1.7, or bar 1.8.
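Conceptually, the controlled index is just the full index with every release outside the known-good set removed. A toy sketch of that idea follows; the data and function names are hypothetical, and this is not the actual ppix extension:

```python
# Everything published for each package (illustrative data).
AVAILABLE = {
    "bar": ["1.5", "1.6", "1.7", "1.8", "1.9"],
    "foo": ["1.2", "1.3"],
}

# Releases known to work together (illustrative data).
CONTROLLED = {
    "bar": ["1.6", "1.7", "1.8"],
    "foo": ["1.3"],
}


def controlled_index(available, controlled):
    """Keep only the releases that the controlled set allows."""
    index = {}
    for name, allowed in controlled.items():
        index[name] = [v for v in available.get(name, []) if v in allowed]
    return index


def newest(index, name):
    """Pick the highest remaining release, comparing numeric dotted versions."""
    return max(index[name], key=lambda v: tuple(int(p) for p in v.split(".")))


index = controlled_index(AVAILABLE, CONTROLLED)
print(index["bar"], newest(index, "bar"))
```

An installer pointed at such an index never sees bar 1.9, so a requirement like "bar" or "bar >= 1.6" can only resolve to a release that is known to work.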
> People have been saying that since Linux distributions use external
> indexes, we should too, as we are dealing with the same problem as Linux
> distributions.
It is true that our problem is very much like a Linux distribution, at least
in my mind.
However, Jim and I did not choose the index approach because of Linux. We
simply thought about the easiest way to create a known-working-set that would
require the least amount of software to be written. So Jim had the idea of
simply limiting the number of available versions of a package and noted that
it would be fairly simple (now about 50 lines of code) to extend his ppix
tool to do that. Actually, the KGS is really only
the "controlled-packages.cfg" file and the `http://download.zope.org/zope3.4`
index is just one way to use the KGS. Other usages include the "versions"
section generator, the test buildout generator, and the Zope 3 tree updater.
> While the problem is similar, I think the nature of
> development makes our problems, and therefore our solutions, quite
> different from the way distributions do it.
>
> How are we different?
That's a good question. I think you forgot the most important one.
Manpower! In a Linux distribution, there are many contributors to just the
release process. In the Python/Zope world, the developer is the release
person too. Thus Linux distributions have the luxury that they can create a
new release for every package for every Linux distribution release.
One promise Jim and others made to me about eggs was that once we had a stable
set (which I worked on for the current KGS), releasing would become much
simpler, because many packages would not need new releases for very long
stretches at a time. With your suggestion, packages will need to be released
all the time.
> We have many, many different small distributions (package +
> dependencies) that can be combined. We have such a small distribution
> for each application. We have such a small distribution for each
> extension. Not just that. We have such a small distribution for each
> *release* of an application. We have such a small distribution for each
> *release* of an extension.
I think your proposal makes this a problem, whereas the KGS provides a
solution; of course, in combination with the versions and find-links options.
> I therefore still believe that version dependency information should
> move out of external indexes and into packages.
-1 googol
Regards,
Stephan
--
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training