why external version indexes don't fulfill all use cases for development
Hi there,
I've been doing some more thinking about external version indexes (like Grok's versions.cfg on a URL, and like KGS) and why they won't solve all our problems. I have a new way to express it, so let me try it out on you all.
What KGS solves is that it allows the ongoing development and testing of an integrated Zope 3. That is, there's a Zope 3 'trunk' of versions that keeps being updated as there are bugfix releases. I'm not sure what happens as soon as someone wants to make a new feature release of any package. Presumably they end up in KGS too. After all, we won't have a single Zope 3.4 and then a single Zope 3.5 for which we can create a new KGS. We intend to let packages move at different feature-release speeds, and we can't have a KGS for each package.
Another problem KGS can solve is to add some release hygiene to the cheeseshop: do not remove old releases or overwrite them.
What KGS doesn't have is history. When I release an application or framework and I used KGS to make sure that all my versions were correct, it will work on the day of release. As soon as enough bugfix (or feature) releases make it to KGS, something will inevitably break. We've seen innocuous changes breaking code a lot of times, so we can't pretend that never happens. It *will* happen.
This breaks a fundamental assumption for releases. When I release something, I expect it to work tomorrow, next month, and next year.
With code, we know that history, and branches, and so on, are important. We use Subversion. With KGS we only have an ongoing trunk.
With Grok, we use an external versions list. We can use this to solve the above problem. We basically take snapshots of what is in KGS. This allows us to maintain some history, though it isn't ideal either, as it's quite a bit of overhead.
If I build an application or framework on top of Grok, I will need to maintain yet another external list for the extra packages of this application, fixing those versions. We could probably even use the extends feature of buildout to have this list point at Grok's list so we have to repeat ourselves less should we want to build something on top of *that* application or framework again.
So, while annoying, that is somewhat manageable. Now imagine I want to use a completely separate Python library with my Grok application. This python library has dependencies itself again. This means I will need to know about versions of those dependencies as well, and fix them into my application's list.
There are some fundamental problems with external lists or indexes:
* we need to know about the dependency of dependencies, even if we never use them directly. Information hiding is broken.
* a single list will never do it. We intend to have many different applications that may depend on different versions of packages. Grok may need a newer zope.publication than your application does. A Grok extension may need an even newer version than Grok does. We'll be baking endless amounts of lists this way.
If this information is inside the packages itself, the history will be automatically maintained with Subversion and existing releases. History therefore works: if I install Grok 0.11, I would get all dependencies of Grok 0.11 automatically without having to worry about external indexes.
Information hiding works: if I use foo 1.3 and foo 1.3 knows it needs bar 1.7, it'll simply get that and I don't have to know about it. I don't even need to worry about the *existence* of bar.
People have been saying that since Linux distributions use external indexes, we should too, as we are dealing with the same problem as Linux distributions. While the problem is similar, I think the nature of development makes our problems, and therefore our solutions, quite different from the way distributions do it.
How are we different?
We have many, many different small distributions (package + dependencies) that can be combined. We have such a small distribution for each application. We have such a small distribution for each extension. Not just that. We have such a small distribution for each *release* of an application. We have such a small distribution for each *release* of an extension.
I therefore still believe that version dependency information should move out of external indexes and into packages.
See also my earlier discussion of these problems and possible solutions: http://faassen.n--tree.net/blog/view/weblog/2007/09/26/0
Regards,
Martijn
On Nov 11, 2007 8:06 AM, Martijn Faassen <faassen@startifact.com> wrote:
I therefore still believe that version dependency information should move out of external indexes and into packages.
This is at least the intuitive place for this information. My application requires Grok 0.11, which requires Zope 3.4.0b2, which would then be a package that doesn't contain any code, just requirements for eggs that in turn have requirements of their own. I'm not even sure this *is* different from how the Unices do it, but it just seems the obvious way of doing it. I would be interested in knowing if this has drawbacks. -- Lennart Regebro: Zope and Plone consulting. http://www.colliberty.com/ +33 661 58 14 64
On Sunday 11 November 2007, Lennart Regebro wrote:
This is at least the intuitive place for this information. My application requires Grok 0.11, which requires Zope 3.4.0b2, which would then be a package that doesn't contain any code, just requirements for eggs that in turn have requirements of their own. I'm not even sure this *is* different from how the Unices do it, but it just seems the obvious way of doing it. I would be interested in knowing if this has drawbacks.
Meta-eggs are considered a bad idea in the Python world. I originally wanted to create a meta-egg, but Jim convinced me to use a different approach; hence the index. Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student) Web2k - Web Software Design, Development and Training
On Nov 11, 2007, at 6:11 PM, Stephan Richter wrote:
Meta-eggs are considered a bad idea in the Python world. I originally wanted to create a meta-egg, but Jim convinced me to use a different approach; hence the index.
Meta eggs aren't a bad or a good idea by themselves. They are a good solution to some problems and a bad (or less good) solution to others. IMO, meta eggs are a good way to fix versions in *applications*. (I think buildout's version-specification mechanism is another good approach, with certain advantages and disadvantages). I think a package repository, of which a KGS is an example, is a good way to provide access to a collection of packages known to work together -- especially as it provides a nice way to manage bug fixes. I think "Zope 3" is better served by a well-managed repository, because Zope 3 is a platform, not an application. IMO, a well-managed KGS (set of KGS releases) will serve the community of developers who use Zope better than a rigid version specification. Jim -- Jim Fulton Zope Corporation
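For readers unfamiliar with the term: a meta egg, as discussed above, is a distribution that carries no code of its own and only lists pinned requirements. The following is a minimal sketch in Python, not anyone's actual packaging. The project name `myapp-meta` and every pinned version other than Grok 0.11 are invented for illustration, and the `setup()` call appears only as a comment so the snippet runs without setuptools installed.

```python
# Hypothetical pins for an application; only "grok 0.11" comes from the thread.
PINS = {
    "grok": "0.11",
    "zope.publication": "3.4.0",   # assumed version, for illustration
    "ZODB3": "3.8.0",              # assumed version, for illustration
}


def pinned_requirements(pins):
    """Turn {name: version} into exact 'name==version' requirement strings."""
    return sorted("%s==%s" % (name, version) for name, version in pins.items())


install_requires = pinned_requirements(PINS)

# In a real meta egg, setup.py would pass this list to setuptools, e.g.:
#
#   from setuptools import setup
#   setup(name="myapp-meta", version="1.0", packages=[],
#         install_requires=install_requires)

if __name__ == "__main__":
    for requirement in install_requires:
        print(requirement)
```

Because every requirement is an exact pin, installing a given release of the meta egg always asks for the same versions, which is the property that makes it attractive for applications and rigid for platforms.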
Previously Martijn Faassen wrote:
People have been saying that since Linux distributions use external indexes, we should too, as we are dealing with the same problem as Linux distributions. While the problem is similar, I think the nature of development makes our problems, and therefore our solutions, quite different from the way distributions do it.
How are we different?
We have many, many different small distributions (package + dependencies) that can be combined. We have such a small distribution for each application. We have such a small distribution for each extension. Not just that. We have such a small distribution for each *release* of an application. We have such a small distribution for each *release* of an extension.
I therefore still believe that version dependency information should move out of external indexes and into packages.
Unless I'm missing something, that is exactly what Linux distributions are doing. Each package has its own list of dependencies and conflicts (just as important). When a package is uploaded to a distribution archive, that information is copied out of the package and included in the distribution index. That is important since it allows you to grab the index and calculate the whole dependency graph without having to download packages. You can know in advance if something is installable without having to download dozens of packages and only then discovering that it will never work.
Linux package managers can also handle multiple distributions. If you look at apt, for example, it can handle as many distributions as you want. You can set priorities for them at distribution scale (i.e. always prefer packages from distribution X), at release scale (i.e. always prefer packages from release Y even if release Z has a newer version) or at package scale (package A has to come from distribution X). This is extremely common. If you install a Debian or Ubuntu machine you will always use two distributions: the one for the release, which will never change, and one with security fixes for just that release. Often you will also configure distributions with specific backports (needed because Debian releases are far apart) or for specific products (for your Enlightenment 17 snapshot, for example, which Debian does not have).
The terminology is slightly different (archive versus index, package versus egg, depends versus requires, enlightenment versus grok, etc.) but the problem is still the same.
Wichert.
-- Wichert Akkerman <wichert@wiggy.net> It is simple to make things. http://www.wiggy.net/ It is hard to make things simple.
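The point about computing the dependency graph from the index alone can be illustrated with a short Python sketch. It assumes a toy index that maps each package name to its direct dependencies; all package names are hypothetical, and versions and conflicts are left out to keep the idea visible.

```python
# A toy index: package name -> names of its direct dependencies.
# All names here are hypothetical.
INDEX = {
    "myapp": ["framework", "imaging"],
    "framework": ["publisher", "storage"],
    "publisher": ["interfaces"],
    "storage": ["interfaces"],
    "imaging": [],
    "interfaces": [],
}


def dependency_closure(index, root):
    """Return every package needed to install `root`, using only the index.

    Raises KeyError if a dependency is missing from the index, which is the
    "known in advance to be uninstallable" case.
    """
    needed = set()
    pending = [root]
    while pending:
        name = pending.pop()
        if name in needed:
            continue
        needed.add(name)
        pending.extend(index[name])  # KeyError: dependency not in the index
    return needed


if __name__ == "__main__":
    print(sorted(dependency_closure(INDEX, "myapp")))
```

Nothing is downloaded here: the whole answer comes from metadata that was copied out of the packages into the index.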
Hi Martijn
Subject: [Zope-dev] why external version indexes don't fulfill all use cases for development
Hi there,
I've been doing some more thinking about external version indexes (like Grok's versions.cfg on a URL, and like KGS) and why they won't solve all our problems. I have a new way to express it, so let me try it out on you all.
What KGS solves is that it allows the ongoing development and testing of an integrated Zope 3. That is, there's a Zope 3 'trunk' of versions that keeps being updated as there are bugfix releases. I'm not sure what happens as soon as someone wants to make a new feature release of any package. Presumably they end up in KGS too. After all, we won't have a single Zope 3.4 and then a single Zope 3.5 for which we can create a new KGS. We intend to let packages move at different feature-release speeds, and we can't have a KGS for each package.
[...]
I hope I can show you another point of view, though I'm not sure whether what I'm trying to explain will be understandable ;-)
Yes, a KGS is a policy which makes sure that we can reproduce the dependency list and build a base for custom development. We can also use it as a base for reproducible bugfixes. A KGS is also comparable to a (daily, monthly or whatever) snapshot.
And yes, there will be more than one KGS. There will be a development KGS that allows us to develop as a community, because someone will probably want to develop 3.6.1 while others still work on 3.5.9. KGS 3.4 corresponds to the tags folder in Subversion, and the KGS 3.5 dev will reflect ongoing development, like the Subversion trunk.
Anyway, a KGS is only a definition of what works with what. It doesn't matter whether we call it KGS or something else: if you need to build Grok or a custom set of eggs for your project, you need to know which egg versions your project will use. That's the part a KGS can solve.
Every egg version which is fixed in a package can break what you are trying to assemble, because versions fixed in eggs depend on the overall snapshot at the time and don't know about future versions of other eggs. The KGS can solve the problem because a KGS is a snapshot view of what you are trying to assemble. Eggs can't do that by themselves.
I'm 100% sure that we are not able to solve the dependencies at the package level, or at least not without restricting and locking down packages. You would lock down versions in a zope package because Grok would otherwise break, even though other projects would not.
Let me give you an example: The package zope.subscriber (3.5) defines a new subscriber and zope.catalog (3.5) uses this subscriber. And we have the package zope.folder (3.5), which fires a notify for this subscriber. If you want to use the new subscriber and its automatic handling, you will define that all versions must be 3.5. But it's also possible to use version 3.5 of zope.subscriber and implement the new subscriber pattern from zope.subscriber (3.5) in your own custom container implementation. The packages zope.catalog and zope.folder can then still be at version 3.4.
Probably the example above is not so good. But think about small zope.* package based distributions and their dependency on the ZODB package. I'm sure it is possible to combine many different versions of the ZODB egg with different versions of other zope.* packages. If any of them defines a version for ZODB, you will very quickly get into trouble. (You can still apply patches if you would like to use an older ZODB and something doesn't fit.)
If we need to define versions, then a KGS is the concept which allows you to define this set. And this means that the versions defined in eggs are obsolete. I guess Stephan implemented this feature yesterday.
In conclusion: if we would like to see different Zope 3 based distributions like Zope 3 itself, Grok or Z3Ext etc., it must be possible to assemble all the packages with different versions of zope.* packages. And then it doesn't make sense to fix versions in packages, right?
Stephan, do you know what I mean? Was this understandable, or can you give additional hints?
Regards
Roger Ineichen
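The concern about versions fixed inside packages can be made concrete with a small Python sketch. It assumes each package carries exact pins for its dependencies and simply reports where two packages disagree; the package names are taken from the example in the message, while the ZODB version numbers are invented.

```python
def find_pin_conflicts(package_pins):
    """Collect dependencies that different packages pin to different versions.

    `package_pins` maps a package name to {dependency: exact_version}.
    Returns {dependency: {version: [packages pinning it]}} for conflicts only.
    """
    seen = {}
    for package, pins in package_pins.items():
        for dependency, version in pins.items():
            seen.setdefault(dependency, {}).setdefault(version, []).append(package)
    return {dep: versions for dep, versions in seen.items() if len(versions) > 1}


# Hypothetical pins: the package names come from the example in the message,
# the ZODB version numbers are invented.
PINS_IN_PACKAGES = {
    "zope.catalog": {"ZODB": "3.8.0"},
    "zope.folder": {"ZODB": "3.7.2"},
    "zope.subscriber": {},
}

if __name__ == "__main__":
    print(find_pin_conflicts(PINS_IN_PACKAGES))
```

With exact pins in two packages, no single ZODB version satisfies both, whereas one external set that names a single ZODB version cannot contradict itself.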
On Nov 11, 2007, at 2:06 AM, Martijn Faassen wrote:
Hi there,
I've been doing some more thinking about external version indexes (like Grok's versions.cfg on a URL, and like KGS) and why they won't solve all our problems. I have a new way to express it, so let me try it out on you all.
What KGS solves is that it allows the ongoing development and testing of an integrated Zope 3.
I see it addressing a more general problem of having a known good combination of components that work together. There's nothing Zope 3 specific about this.
That is, there's a Zope 3 'trunk' of versions that keeps being updated as there are bugfix releases.
That's not how I see it. As I've said before, I would model this on Linux distributions, where each feature release has a repository of packages for that release, including bug fixes.
I'm not sure what happens as soon as someone wants to make a new feature release of any package.
They make a new release. At some point, someone will make a new KGS that incorporates this.
Presumably they end up in KGS too. After all, we won't have a single Zope 3.4 and then a single Zope 3.5 for which we can create a new KGS.
Why not? I would expect that there would be Zope 3.4 and Zope 3.5 KGSs. There might be additional KGSs that include some of the same components. Anyone can assemble a KGS if they think that in doing so, they can add value.
We intend to let packages move at different feature-release speeds, and we can't have a KGS for each package.
Of course not.
Another problem KGS can solve is to add some release hygiene to the cheeseshop: do not remove old releases or overwrite them.
I don't really understand this. Maybe you mean that a KGS can be a better alternative to the cheeseshop. I can certainly see that.
What KGS doesn't have is history. When I release an application or framework and I used KGS to make sure that all my versions were correct, it will work on the day of release. As soon as enough bugfix (or feature) releases make it to KGS, something will inevitably break. We've seen innocuous changes breaking code a lot of times, so we can't pretend that never happens. It *will* happen.
Yup. Which is why you should record versions you use.
This breaks a fundamental assumption for releases. When I release something, I expect it to work tomorrow, next month, and next year.
If you want this, then you can't rely on the KGS. When releasing our applications, we don't rely on a KGS. We fix all of the versions we're using. IMO, the KGS shouldn't try to solve this problem. A KGS should be helpful for developers and development frameworks. A KGS will be more useful if the quality remains high. A KGS is similar to a traditional monolithic release. After all, bug fix Zope releases have been known to break applications too.
With code, we know that history, and branches, and so on, are important. We use Subversion. With KGS we only have an ongoing trunk.
I'm not sure why you keep saying "trunk". I'm not sure if you are being imprecise, or if I'm missing something. There's no reason a KGS couldn't be managed with a revision control system. That might be a very good idea.
With Grok, we use an external versions list. We can use this to solve the above problem. We basically take snapshots of what is in KGS. This allows us to maintain some history, though it isn't ideal either, as it's quite a bit of overhead.
Yup. I think both KGSs and version lists are valid approaches. Each has different strengths and weaknesses.
If I build an application or framework on top of Grok, I will need to maintain yet another external list for the extra packages of this application, fixing those versions. We could probably even use the extends feature of buildout to have this list point at Grok's list so we have to repeat ourselves less should we want to build something on top of *that* application or framework again.
Yup.
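The layering described here, where an application's list points at Grok's list and only adds or overrides entries, can be sketched in Python as an ordered merge. This mirrors the general idea of one list extending another and is not buildout's implementation; the package names and versions other than Grok 0.11 are hypothetical.

```python
def layer_versions(*version_lists):
    """Merge version lists in order; later lists override earlier ones."""
    merged = {}
    for versions in version_lists:
        merged.update(versions)
    return merged


# Hypothetical lists: a framework's pins, then an application's additions.
GROK_VERSIONS = {"grok": "0.11", "zope.publication": "3.4.0"}
APP_VERSIONS = {"myapp.extras": "1.2", "zope.publication": "3.4.1"}

if __name__ == "__main__":
    merged = layer_versions(GROK_VERSIONS, APP_VERSIONS)
    for name, version in sorted(merged.items()):
        print("%s = %s" % (name, version))
```

Each further layer only has to state what it adds or changes, which is what keeps the repetition manageable.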
So, while annoying, that is somewhat manageable. Now imagine I want to use a completely separate Python library with my Grok application. This python library has dependencies itself again. This means I will need to know about versions of those dependencies as well, and fix them into my application's list.
Yes
There are some fundamental problems with external lists or indexes:
* we need to know about the dependency of dependencies, even if we never use them directly. Information hiding is broken.
I'm not sure how this is a problem with version lists (external or otherwise) or indexes.
* a single list will never do it. We intend to have many different applications that may depend on different versions of packages. Grok may need a newer zope.publication than your application does. A Grok extension may need an even newer version than Grok does. We'll be baking endless amounts of lists this way.
I think each application will need to come up with a version list for each of its releases. In development, an application can use an index or external version list as a starting point. For example, I see a KGS being useful as a (fairly) stable baseline for development. When an application is ready for release, it should fix its versions. I've tried to make this easy to do with buildout. When you're preparing to make a release, run buildout in verbose mode (-v). It will print out the versions it picked in a format that is easily turned into a version list.
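Turning the verbose output into a version list could look roughly like the Python sketch below. It assumes the output contains lines of the form `Picked: name = version`; that format and the sample log are assumptions made for illustration, and real buildout output may differ between versions.

```python
import re

# Assumed line format; real buildout output may differ between versions.
PICKED = re.compile(r"^Picked:\s*(?P<name>\S+)\s*=\s*(?P<version>\S+)\s*$")


def versions_section(log_text):
    """Build a '[versions]' section from 'Picked: name = version' lines."""
    picked = {}
    for line in log_text.splitlines():
        match = PICKED.match(line.strip())
        if match:
            picked[match.group("name")] = match.group("version")
    lines = ["[versions]"]
    lines += ["%s = %s" % item for item in sorted(picked.items())]
    return "\n".join(lines)


# Invented sample output, for illustration only.
SAMPLE_LOG = """\
Installing app.
Picked: zope.interface = 3.4.0
Picked: grok = 0.11
"""

if __name__ == "__main__":
    print(versions_section(SAMPLE_LOG))
```

The resulting section can be kept with the release so that the same versions are chosen again later.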
If this information is inside the packages itself, the history will be automatically maintained with Subversion and existing releases. History therefore works: if I install Grok 0.11, I would get all dependencies of Grok 0.11 automatically without having to worry about external indexes.
Applications should maintain version lists. Frameworks are another matter. If you fix the versions in Grok, then it will be harder for people to get bug fixes to packages you depend on. There's a trade-off here between stability and flexibility.
Information hiding works: if I use foo 1.3 and foo 1.3 knows it needs bar 1.7, it'll simply get that and I don't have to know about it. I don't even need to worry about the *existence* of bar.
Yup.
People have been saying that since Linux distributions use external indexes, we should too, as we are dealing with the same problem as Linux distributions. While the problem is similar, I think the nature of development makes our problems, and therefore our solutions, quite different from the way distributions do it.
How are we different?
We have many, many different small distributions (package + dependencies) that can be combined. We have such a small distribution for each application. We have such a small distribution for each extension. Not just that. We have such a small distribution for each *release* of an application. We have such a small distribution for each *release* of an extension.
If this dependency information is in an easily override-able form, then I think this is a sane strategy. No one is advocating a KGS for each application. A KGS is merely an attempt to provide a stable ecosystem of components that work together. Just like Linux is a platform for delivering applications, a KGS is a platform for building applications.
I therefore still believe that version dependency information should move out of external indexes and into packages.
IMO, a KGS can provide a useful baseline for developers. Applications should also fix versions used for specific releases. Jim -- Jim Fulton Zope Corporation
On Sunday 11 November 2007, Jim Fulton wrote:
This breaks a fundamental assumption for releases. When I release something, I expect it to work tomorrow, next month, and next year.
If you want this, then you can't rely on the KGS. When releasing our applications, we don't rely on a KGS. We fix all of the versions we're using. IMO, the KGS shouldn't try to solve this problem. A KGS should be helpful for developers and development frameworks. A KGS will be more useful if the quality remains high. A KGS is similar to a traditional monolithic release. After all, bug fix Zope releases have been known to break applications too.
I really hope that at some point you will use the KGS as a starting point for your internal applications as well. :-) (Note that you can now override versions using the new "extends" feature that I shamelessly copied from buildout.)
And I am not saying this to promote the KGS. I have a concrete example.
Probably as part of a project, Benji did some development on zope.testbrowser. He fixed the versions of all dependencies in buildout.cfg. However, those versions were a version sub-graph of a ZC internal dependency graph that I do not have access to. It was also already pretty outdated, referring to "dev" and "alpha" releases.
So while testbrowser might be working with those dependency versions, it might still be broken for me, because I have a totally different dependency graph. The worst scenario, which luckily has not happened yet, is that we fix things back and forth because of different dependency graphs.
I thus propose that all packages in svn.zope.org should use a KGS for testing, because it is a fully public dependency graph. I am not sure whether it should be the latest stable KGS or the development KGS or whatever. Time will provide an answer.
BTW, Benji wanted me to bring this issue up on the mailing list already, so I fulfilled my commitment now. :-)
Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student) Web2k - Web Software Design, Development and Training
On Nov 11, 2007, at 6:34 PM, Stephan Richter wrote:
I really hope that at some point you will use the KGS as a starting point for your internal applications as well. :-) (Note that you can now override versions using the new "extends" feature that I shamelessly copied from buildout.)
And I am not saying this to promote the KGS. I have a concrete example.
Probably as part of a project, Benji did some development on zope.testbrowser. He fixed the versions of all dependencies in buildout.cfg. However, those versions were a version sub-graph of a ZC internal dependency graph that I do not have access to. It was also already pretty outdated referring to "dev" and "alpha" releases.
So while testbrowser might be working with those dependency versions, it might still be broken for me, because I have a totally different dependency graph. The worst scenario, which luckily has not happened yet, is that we fix things back and forth because of different dependency graphs.
I thus propose that all packages in svn.zope.org should use a KGS for testing, because it is a fully public dependency graph. I am not sure whether it should be the latest stable KGS or the development KGS or whatever. Time will provide an answer.
I think you make a good point. +1 on using *some* KGS. Jim -- Jim Fulton Zope Corporation
On Monday 12 November 2007, Jim Fulton wrote:
I thus propose that all packages in svn.zope.org should use a KGS for testing, because it is a fully public dependency graph. I am not sure whether it should be the latest stable KGS or the development KGS or whatever. Time will provide an answer.
I think you make a good point.
+1 on using *some* KGS.
Since we only have the Zope 3.4 KGS now, I think it would be the best one to use now. :-)
The easiest way to do this is to add the following line to the "buildout" section of the package's `buildout.cfg` file:
index = http://download.zope.org/zope3.4
(I know you know that Jim; it is for the benefit of people reading this mail. ;-)
Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student) Web2k - Web Software Design, Development and Training
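To show where that line sits in a configuration file, here is a small Python sketch that embeds a minimal `buildout.cfg` and reads the index back with the standard library's `configparser`. The `parts` line and the `[test]` part are hypothetical additions for illustration, and buildout uses its own configuration parser, so this only demonstrates the file layout.

```python
import configparser

# A minimal buildout.cfg; the 'parts' line and the [test] part are hypothetical.
BUILDOUT_CFG = """\
[buildout]
index = http://download.zope.org/zope3.4
parts = test

[test]
recipe = zc.recipe.testrunner
eggs = zope.testbrowser
"""


def configured_index(cfg_text):
    """Return the package index named in the [buildout] section, or None."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("buildout", "index", fallback=None)


if __name__ == "__main__":
    print(configured_index(BUILDOUT_CFG))
```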
Stephan Richter wrote:
The easiest way to do this is to add the following line to the "buildout" section of the package's `buildout.cfg` file:
index = http://download.zope.org/zope3.4
(I know you know that Jim; it is for the benefit of people reading this mail. ;-)
I've been trying to follow this whole thread but it's been pretty high volume so apologies if I've missed something... If I specify index as above, how do I get other packages which may not appear in that index? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Chris Withers wrote:
Stephan Richter wrote:
The easiest way to do this is to add the following line to the "buildout" section of the package's `buildout.cfg` file:
index = http://download.zope.org/zope3.4
(I know you know that Jim; it is for the benefit of people reading this mail. ;-)
I've been trying to follow this whole thread but it's been pretty high volume so apologies if I've missed something...
If I specify index as above, how do I get other packages which may not appear in that index?
You install them in a separate transaction: the 'index_url' setting in a package setup.py only governs where setuptools goes to find packages which are dependencies of that package. Tres. -- Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com
Tres Seaver wrote:
If I specify index as above, how do I get other packages which may not appear in that index?
You install them in a separate transaction: the 'index_url' setting in a package setup.py only governs where setuptools goes to find packages which are dependencies of that package.
Is this a buildout thing? I've never known of transactions in anything to do with setuptools... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Chris Withers wrote:
Tres Seaver wrote:
If I specify index as above, how do I get other packages which may not appear in that index?
You install them in a separate transaction: the 'index_url' setting in a package setup.py only governs where setuptools goes to find packages which are dependencies of that package.
Is this a buildout thing?
I've never known of transactions in anything to do with setuptools...
I wasn't literally referring to a "transaction" in the ZODB sense -- I meant "install using a separate run of easy_install". Tres. -- Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com
Tres Seaver wrote:
I've never known of transactions in anything to do with setuptools...
I wasn't literally referring to a "transaction" in the ZODB sense -- I meant "install using a separate run of easy_install".
Unless I'm missing something, that seems... sub-optimal. So I have to do easy_install package x, edit a config file, then easy_install package y, rinse and repeat?! cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Chris Withers wrote:
Tres Seaver wrote:
I've never known of transactions in anything to do with setuptools...
I wasn't literally referring to a "transaction" in the ZODB sense -- I meant "install using a separate run of easy_install".
Unless I'm missing something, that seems... sub-optimal.
So I have to do easy_install package x, edit a config file, then easy_install package y, rinse and repeat?!
No, you override the index on the command line:
$ bin/easy_install --index-url=<index for package x> x
$ bin/easy_install --index-url=<index for package y> y
Tres. -- Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com
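The same "one run per index" idea can be written down as a Python sketch that builds one easy_install command per package, using easy_install's `--index-url` option. The package-to-index mapping is hypothetical, the second URL is a placeholder, and the commands are only printed here rather than executed.

```python
def easy_install_command(package, index_url, executable="bin/easy_install"):
    """Build the argument list for one easy_install run against one index."""
    return [executable, "--index-url", index_url, package]


# Hypothetical package-to-index mapping; the second URL is a placeholder.
PACKAGES = [
    ("zope.testbrowser", "http://download.zope.org/zope3.4"),
    ("otherlib", "http://example.com/simple"),
]

if __name__ == "__main__":
    for package, index_url in PACKAGES:
        print(" ".join(easy_install_command(package, index_url)))
        # To actually run it: subprocess.check_call(easy_install_command(...))
```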
Chris Withers wrote at 2007-11-14 09:14 +0000:
So I have to do easy_install package x, edit a config file, then easy_install package y, rinse and repeat?!
A call with different options is sufficient.... -- Dieter
Dieter Maurer wrote:
Chris Withers wrote at 2007-11-14 09:14 +0000:
Tres Seaver wrote:
I've never known of transactions in anything to do with setuptools... I wasn't literally referring to a "transaction", in the ZODB sense -- I meant, "install using a separate run of easy_install". Unless I'm missing something, that seems... sub-optimal.
So I have to do easy_install package x, edit a config file, then easy_install package y, rinse and repeat?!
A call with different options is sufficient....
That still seems a bit ropey... Have all the things that have led to buildout/kgs/etc been brought to the attention of the distutils sig? This *must* be a problem that all decent sized python frameworks are facing. How are other communities solving this problem? What's the "standard" way of solving this problem in the python world? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
On Thursday 15 November 2007, Chris Withers wrote:
Have all the things that have led to buildout/kgs/etc been brought to the attention of the distutils sig? This *must* be a problem that all decent sized python frameworks are facing.
We are still experimenting. I think once we have tried our solutions out a little bit more, we can talk to the distutils-sig group.
How are other communities solving this problem?
I bet you they don't have them.
What's the "standard" way of solving this problem in the python world?
I don't think there is one. Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student) Web2k - Web Software Design, Development and Training
Stephan Richter wrote:
On Thursday 15 November 2007, Chris Withers wrote:
Have all the things that have led to buildout/kgs/etc been brought to the attention of the distutils sig? This *must* be a problem that all decent sized python frameworks are facing.
We are still experimenting. I think once we have tried our solutions out a little bit more, we can talk to the distutils-sig group.
Why not engage with them at this stage? There are plenty of bright people there I'd bet, and the more people thinking about this the better, right? From my perspective, it would be great if whatever turns out to be the right solution gets built in at the python library level, not the zope level.
How are other communities solving this problem?
I bet you they don't have them.
I'd be surprised if twisted, paste and the like haven't bumped into this problem ;-)
What's the "standard" way of solving this problem in the python world?
I don't think there is one.
Then surely distutils-sig is the place to come up with one? (thinking of Martijn's recent comments about "drastically improv[ing] both the quality and quantity of our communication with the rest of the world") cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Hey, On Nov 11, 2007 10:34 PM, Jim Fulton <jim@zope.com> wrote:
On Nov 11, 2007, at 2:06 AM, Martijn Faassen wrote:
[snip]
This breaks a fundamental assumption for releases. When I release something, I expect it to work tomorrow, next month, and next year.
If you want this, then you can't rely on the KGS. When releasing our applications, we don't rely on a KGS. We fix all of the versions we're using. IMO, the KGS shouldn't try to solve this problem. A KGS should be helpful for developers and development frameworks. A KGS will be more useful if the quality remains high. A KGS is similar to a traditional monolithic release. After all, bug fix Zope releases have been known to break applications too.
I got completely confused by the answers you gave previously: you were talking about feature releases, but of what? Basically here you say that KGS replaces a monolithic release of Zope 3. I see KGS as useful for the developers of Zope 3 classic. I see KGS as a useful source of tested lists of versions where they are related to Zope 3.
With code, we know that history, and branches, and so on, are important. We use Subversion. With KGS we only have an ongoing trunk.
I'm not sure why you keep saying "trunk". I'm not sure if you are being imprecise, or if I'm missing something. There's no reason a KGS couldn't be managed with a revision control system. That might be a very good idea.
I say this as this is the impression I get from it. Saying that a KGS could be managed with a revision control system is nice, but can it now? How complicated would that make it? Does it make sense to maintain this information externally to the packages? [snip]
There are some fundamental problems with external lists or indexes:
* we need to know about the dependency of dependencies, even if we never use them directly. Information hiding is broken.
I'm not sure how this is a problem with version lists (external or otherwise) or indexes.
Dependencies of dependencies aren't a problem in themselves, as this information is still in the packages themselves. The versions of those dependencies aren't, and this is a problem, because I don't *want* to know about dependencies of dependencies, or their versions. I don't want to have to care. The packages themselves should know this. The story for beginners wouldn't be good enough, as they'd need to know too. With ZCML we're finally resolving this by putting this dependency structure in ZCML. Not ideal, but at least when you include package X which needs Y which needs Z, you don't need to manually include the ZCML of Y and Z anymore. Now with package dependencies and versioning, I need to make decisions on the versions of Y and Z, while I just care about using X. I don't want to know this stuff. A beginner can of course, if they're lucky, interact with an index like KGS that makes decisions for them. That works until they need a different version or different package than what is maintained in the index. In that case, they don't want anything external to make the decisions. The basic thing I'd like is just to ask the package: give me the versions *you* think you can work with. If I'm tracking this package in subversion or upgrade to a new release of the package, this list of best versions might change, too. I don't want to have to know, just like I don't want to have to know about the implementation details of a package. Dependencies and their versions are an implementation detail. One I might on occasion like to override, just like I sometimes need to override the implementation details of a class by subclassing, but that should be further along the curve, not immediate. I think it would be reasonable for packages higher up in the dependency tree to have the ability to override version decisions made by packages lower down (as long as there aren't any conflicts within the structure). In these cases the overriding package explicitly decides to take over responsibility.
* a single list will never do it. We intend to have many different applications that may depend on different versions of packages. Grok may need a newer zope.publication than your application does. A Grok extension may need an even newer version than Grok does. We'll be baking endless amounts of lists this way.
I think each application will need to come up with a version list for each of its releases. In development, an application can use an index or external version list as a starting point. For example, I see a KGS being useful as a (fairly) stable baseline for development. When an application is ready for release, it should fix its versions. I've tried to make this easy to do with buildout. When you're preparing to make a release, run buildout in verbose mode (-v). It will print out the versions it picked in a format that is easily turned into a version list.
Sure, I know about all this. I just am saying that this doesn't do enough. During development I decide I want to rely on X, which needs Y which needs Z. Moreover, this is not in KGS, or I need newer or older versions than those in KGS. I rely on these purely in my own application. I will now need to figure out that there is indeed a dependency structure for X (I don't care about Y and Z), and I need to figure out which versions for these would be best, and make some kind of random guess (as the information is nowhere to be found in the packages; if I'm lucky it's in human-readable documentation, that's it), perhaps puzzle them out from the buildout run, and bake them into my versions list. I think this procedure is fundamentally wrong as it puts responsibilities with the developer that the developer shouldn't have. The responsibilities about the best versions that fit with X are with the developers of X, unless I should explicitly decide to override their decision. The only thing the developer should have to choose is the best version of X they want to use. If I want to use an older version of X, I should be able to do this, too. If I have to maintain my own lists for this, chances are that these will eventually be the wrong lists, the out-of-date lists, the lists with the "I changed the version for X in my list, but oops, I was also supposed to update Y and Z" mistakes, and so on. As far as I can see, if this information is inside the packages themselves, I can pick any version of the package and have the right list, automatically. I don't need to worry about external indexes or historical versions.
No one is advocating a KGS for each application. A KGS is merely an attempt to provide a stable ecosystem of components that work together. Just like Linux is a platform for delivering applications, a KGS is a platform for building applications.
Yes, and I'm not debating *against* a KGS. It obviously works better than anything else we have right now for what it does. I'm just debating that it isn't *sufficient*. In addition, I'm debating that what applications and frameworks are starting to do now (external lists of versions for themselves) is also sub-optimal. Finally, I am also wondering whether externally is really the right place to maintain this information in the first place. That's not to say I want KGS to disappear today, but I do want to avoid getting locked into a strategy forever without further consideration of the alternatives. Regards, Martijn
On Sunday 11 November 2007, Martijn Faassen wrote:
What KGS solves is that it allows the ongoing development and testing of an integrated Zope 3. That is, there's a Zope 3 'trunk' of versions that keeps being updated as there are bugfix releases. I'm not sure what happens as soon as someone wants to make a new feature release of any package. Presumably they end up in KGS too.
Absolutely not! Like Linux distributions, there will be a KGS for every Zope 3 release. I have already requested a new directory called "zope-dev" where new feature releases can be tested.
After all, we won't have a single Zope 3.4 and then a single Zope 3.5 for which we can create a new KGS.
Yes, we will. Why do you think the current KGS is called "zope3.4"? If you want to have a different working set, then you are free to create one, but don't expect much support from the community when things are not working as expected.
We intend to let packages move at different feature-release speeds, and we can't have a KGS for each package.
You do not need to have a single KGS for every package. But believing that we can just randomly make new feature releases that work with the rest of the world is naive at best. We have seen already what happens, if everyone uses their own set of versions and packages. A development KGS will be used to test new feature releases.
What KGS doesn't have is history.
Yes, it does. Why do you think I manage the "controlled-packages.cfg" file in SVN? And in SVN, I do not create branches and tags without a reason.
When I release an application or framework and I used KGS to make sure that all my versions were correct, it will work on the day of release. As soon as enough bugfix (or feature) releases make it to KGS, something will inevitably break. We've seen innocuous changes breaking code a lot of times, so we can't pretend that never happens. It *will* happen.
I agree. Have you read the discussion we had yesterday on the zope-dev mailing list? We discussed the problem and possible solutions already. Here are a couple of choices we have to avoid the problem: 1. During development I would recommend to use the index of the latest stable release; or if you are brave, you can use the development KGS. (Of course, you can also use the versions block of a particular release, though you will miss out on bug fixes, which I think is less optimal.) 2. Once release/deployment time comes around, you lock the versions. There is a wide range of possibilities: (a) You download the "versions.cfg" of the KGS at this time and maintain it in your deployment code. (b) You point to a particular release's "versions.cfg", for example "versions-3.4.0.cfg". (I will start producing those starting with the next Zope 3.4 release. Maybe I should create a versioned file for "controlled-packages.cfg" as well?!) (c) You use a particular SVN revision, download `zope.release` yourself and generate the "versions.cfg" file, which is trivial. I already create tags for releases there. I probably would prefer option (b).
This breaks a fundamental assumption for releases. When I release something, I expect it to work tomorrow, next month, and next year.
I agree. The KGS should be seen as a branch. Particular versions of "versions.cfg" and maybe "controlled-packages.cfg" should be considered releases.
With code, we know that history, and branches, and so on, are important. We use Subversion. With KGS we only have an ongoing trunk.
No, as I said before, the KGS specification, which is "controlled-packages.cfg", is maintained in SVN as well.
With Grok, we use an external versions list. We can use this to solve the above problem. We basically take snapshots of what is in KGS. This allows us to maintain some history, though it isn't ideal either, as it's quite a bit of overhead.
How is this overhead?
If I build an application or framework on top of Grok, I will need to maintain yet another external list for the extra packages of this application, fixing those versions.
Why? I don't follow that?
We could probably even use the extends feature of buildout to have this list point at Grok's list so we have to repeat ourselves less should we want to build something on top of *that* application or framework again.
I don't understand what you are saying. However, I'll note that the KGS is also extendable. For example, Grok can maintain its own "controlled-packages.cfg" that extends a particular Zope 3 "controlled-packages.cfg". Extending also means that you have the choice of overwriting a particular version requirement. (I have implemented this after the discussion yesterday.) Having a "controlled-packages.cfg" does *not* mean you need an index. This file can be used to generate a "versions.cfg" file or just a `[versions]` section for buildout.
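To make the "extend a base list and override a pin" idea concrete, here is a small sketch that merges two sets of pins and renders the result as a buildout-style `[versions]` section. This is not the `zope.release` generator and does not read the real "controlled-packages.cfg" format (which is not shown in this thread); the helper names and all version numbers are made up for illustration.

```python
import configparser
import io


def merge_pins(base, overrides):
    """Return the base pins with the extending list's pins layered on top."""
    merged = dict(base)
    merged.update(overrides)  # the extending list wins on any shared package
    return merged


def render_versions_section(pins):
    """Render a {package: version} mapping as an INI-style [versions] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep package names exactly as written
    parser["versions"] = {name: pins[name] for name in sorted(pins)}
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Made-up pins: a base list and a list that extends it.
zope_pins = {"zope.interface": "3.4.0", "zope.component": "3.4.0"}
grok_pins = {"zope.component": "3.4.1", "grok": "0.11"}

merged = merge_pins(zope_pins, grok_pins)
text = render_versions_section(merged)
print(text)
```

The override is a plain dictionary update, so whichever list extends the other decides the final pin for a shared package.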
So, while annoying, that is somewhat manageable. Now imagine I want to use a completely separate Python library with my Grok application. This python library has dependencies itself again. This means I will need to know about versions of those dependencies as well, and fix them into my application's list.
Yes. I see this as an advantage. Version specifications in `setup.py` usually contain ranges of allowed versions. What happens if one release in the range does not work? Then you make false promises. The only way to avoid this would be by specifying all allowed versions exactly, which makes no sense.
There are some fundamental problems with external lists or indexes:
* we need to know about the dependency of dependencies, even if we never use them directly. Information hiding is broken.
This is a requirement not a problem statement. I don't understand "Information hiding is broken." I cannot see how "information hiding" could be a good thing here.
* a single list will never do it. We intend to have many different applications that may depend on different versions of packages. Grok may need a newer zope.publication than your application does. A Grok extension may need an even newer version than Grok does. We'll be baking endless amounts of lists this way.
I never claimed that we will have just one list. Your assumption is that there will be an endless amount of working sets and that they are easy to find. While theoretically possible, this will not happen. I think the events of the past two months have shown that the opposite happens and it is actually hard to find working sets, once development digresses too much. It is pretty easy to maintain a list for a particular application, especially with having already base lists, like zope3.4 and grok, being available.
If this information is inside the packages itself, the history will be automatically maintained with Subversion and existing releases. History therefore works: if I install Grok 0.11, I would get all dependencies of Grok 0.11 automatically without having to worry about external indexes.
This is the overly simplistic world view that we had about two months ago. Because we are all developing on top of different stacks and people expect to pick and choose, having a common foundation is the only possible way. Let me give you a dire scenario. Let's say you have a package A-1.0.0. You also have a package B-1.0.0 that depends on A. You suggest fixing versions, so you would write in the dep list of B: 'A >= 1.0.0'. You now release a new feature version of A, A-1.1.0, that is incompatible with B-1.0.0. So package B-1.0.0 will be broken until you release B-1.0.1 that states 'A >= 1.0.0 and A <= 1.0.99'. The problem here is that you have to re-release B only for the sake of changing the version requirement. This is not so bad if you have one known package to do this with. But in Zope's case we often have 20, 30, 60, or even 80 packages that now have to be re-released. All this work for one update. I can tell you that releasing this many packages is a very tedious job and very error-prone. I have just spent 3-4 man-weeks releasing about 120 packages, some multiple times. But this is not your biggest problem. You cannot even assume that you can know the full list of packages that need updating, because they are not public or you are not aware of them. And the worst about it all is that everyone will be blocked until the new releases are all out _[1], unless they do some internal version nailing for their application. Which brings us back to a KGS. Now, you could have simply skipped specifying versions in 'setup.py' and gotten the same effect. Specifying versions in the release makes you believe you are safe. But what you are really doing is imposing one underspecified working set onto everyone. But this working set is not universal; in fact it is very small, since it does not consider a larger dependency network. The Zope 3 KGS attempts to be a universal set for Zope 3, which other projects can build upon. .. [1] This actually happened during the Foliage Sprint. Remember all the outrage?
Information hiding works: if I use foo 1.3 and foo 1.3 knows it needs bar 1.7, it'll simply get that and I don't have to know about it. I don't even need to worry about the *existence* of bar.
With KGS, you do not need to know about it either. One assumption you make is that bar 1.7 can be found in the default index (PyPI) or in any of the dependency links locations. In the KGS we simply change this to use a different index with the promise that only versions of bar are available that actually work, which might be bar 1.6, bar 1.7, or bar 1.8.
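The effect of a restricted index can be sketched in a few lines: the installer still picks "the newest available", but the index only offers versions that are known to work. This is an illustration of the idea, not the ppix-based implementation; `pick_version` and the release numbers are hypothetical, and versions are assumed to be plain dotted integers.

```python
def parse_version(text):
    """Turn '1.10' into (1, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def pick_version(available, allowed=None):
    """Pick the newest available version, optionally limited to an allowed set.

    With allowed=None this mimics an unrestricted index; with a set it
    mimics an index that only exposes the controlled versions.
    """
    candidates = [v for v in available if allowed is None or v in allowed]
    if not candidates:
        return None
    return max(candidates, key=parse_version)


# Made-up release history of 'bar' and a made-up controlled list.
all_bar_releases = ["1.6", "1.7", "1.8", "1.9"]
controlled = {"1.6", "1.7", "1.8"}

unrestricted = pick_version(all_bar_releases)
restricted = pick_version(all_bar_releases, controlled)
print(unrestricted, restricted)
```

In this sketch the untested 1.9 release is simply invisible to anyone installing through the restricted index.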
People have been saying that since Linux distributions use external indexes, we should too, as we are dealing with the same problem as Linux distributions.
It is true that our problem is very much like a Linux distribution, at least in my mind. However, Jim and I did not choose the index approach because of Linux. We simply thought about the easiest way to create a known-working-set that would require the least amount of software to be written. So Jim had the idea of simply limiting the amount of available versions of a package and noted that it would be fairly simple (now about 50 lines of code) to extend his ppix tool to do that. Actually, the KGS is really only the "controlled-packages.cfg" file and the `http://download.zope.org/zope3.4` index is just one way to use the KGS. Other usages include the "versions" section generator, the test buildout generator, and the Zope 3 tree updater.
While the problem is similar, I think the nature of development makes our problems, and therefore our solutions, quite different from the way distributions do it.
How are we different?
That's a good question. I think you forgot the most important one. Manpower! In a Linux distribution, there are many contributors to just the release process. In the Python/Zope world, the developer is the release person too. Thus Linux distributions have the luxury that they can create a new release for every package for every Linux distribution release. One promise Jim and others made to me about eggs was that once we had a stable set (which I worked on for the current KGS), releasing would become much simpler, because many packages would not need new releases for very long stretches at a time. With your suggestion, packages will need to be released all the time.
We have many, many different small distributions (package + dependencies) that can be combined. We have such a small distribution for each application. We have such a small distribution for each extension. Not just that. We have such a small distribution for each *release* of an application. We have such a small distribution for each *release* of an extension.
I think your proposal makes this a problem, whereby the KGS provides a solution; of course, in combination with the versions and find-links options.
I therefore still believe that version dependency information should move out of external indexes and into packages.
-1 googol. Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student) Web2k - Web Software Design, Development and Training
Hey, On Nov 12, 2007 12:02 AM, Stephan Richter <srichter@cosmos.phy.tufts.edu> wrote: [snip]
Like Linux distributions, there will be a KGS for every Zope 3 release. I have already requested a new directory called "zope-dev" where new feature releases can be tested.
Okay, I didn't understand that KGS is replacing the monolithic release story for Zope 3. That's fine as far as it goes. I was focused on the ability that eggs give us for packages to move at different speeds of evolution, and the desire to pick those eggs that we prefer in our applications. If you don't need that, then KGS is basically Zope 3 release + a few features. [snip]
We intend to let packages move at different feature-release speeds, and we can't have a KGS for each package.
You do not need to have a single KGS for every package. But believing that we can just randomly make new feature releases that work with the rest of the world is naive at best. We have seen already what happens, if everyone uses their own set of versions and packages.
Clearly things didn't work in the past. You can't just throw random versions of eggs together. We couldn't do anything better, as the information about what worked together was missing. KGS adds that information back, and that's great. But the information is external to the actual packages. This has drawbacks. I'm saying that if we add this information in the packages internally, we'll be better off, as historical versions and future versions can work. The reuse story is improved. You can make your own selection of versions and have a decent chance it will work together.
A development KGS will be used to test new feature releases.
What KGS doesn't have is history.
Yes, it does. Why do you think I manage the "controlled-packages.cfg" file in SVN? And in SVN, I do not create branches and tags without a reason.
Okay, that's a history. It's a history external to the packages, while the packages have their own history. It's also a global history, while packages can evolve independently. Development decisions of a package's development can change dependency information. It therefore seems natural to me that this information is maintained next to the package. If it's not, and zope.component starts to rely on a newer version of zope.interface, I'd need to maintain this centrally with KGS. We introduce a new monolithic structure where we just removed it. We add back explicitly what was there implicitly: an SVN trunk of Zope 3 maintaining versions that all work together. That's fine to retain the features Zope 3 development had, but I thought the point of splitting Zope 3 up was to be able to forget about the SVN trunk of Zope 3 and just worry about what's right for zope.component. [snip]
With Grok, we use an external versions list. We can use this to solve the above problem. We basically take snapshots of what is in KGS. This allows us to maintain some history, though it isn't ideal either, as it's quite a bit of overhead.
How is this overhead?
Besides releasing Grok, we also need to maintain snapshots of what is in KGS, make such changes as are needed, and publish them. Previously we just released new versions of Grok. That's increased maintenance and release overhead I'd like to get rid of again.
If I build an application or framework on top of Grok, I will need to maintain yet another external list for the extra packages of this application, fixing those versions.
Why? I don't follow that?
Because these packages may be of different versions than in KGS, or may not be managed by KGS altogether.
So, while annoying, that is somewhat manageable. Now imagine I want to use a completely separate Python library with my Grok application. This python library has dependencies itself again. This means I will need to know about versions of those dependencies as well, and fix them into my application's list.
Yes. I see this as an advantage. Version specifications in `setup.py` usually contain ranges of allowed versions. What happens if one release in the range does not work? Then you make false promises. The only way to avoid this would be by specifying all allowed versions exactly, which makes no sense.
That's true. Where I'd like to specify this is as near as possible to where I make the decision to fix these versions. In case of an application, that may be the application. Often that's not the case though: in the case of a library that uses these packages, I'd like it to be the library, and in case of a framework, I'd like it to be the framework. When I develop an application at most I'd like to get the warning: these packages still don't have fixed versions. I'd prefer that list to be empty.
There are some fundamental problems with external lists or indexes:
* we need to know about the dependency of dependencies, even if we never use them directly. Information hiding is broken.
This is a requirement not a problem statement. I don't understand "Information hiding is broken." I cannot see how "information hiding" could be a good thing here.
Sure, it's my requirement. I think it's important if you want beginners to be able to figure out what they're supposed to be doing. I think it's important for agile development, too. Feel free to ignore it entirely.
* a single list will never do it. We intend to have many different applications that may depend on different versions of packages. Grok may need a newer zope.publication than your application does. A Grok extension may need an even newer version than Grok does. We'll be baking endless amounts of lists this way.
I never claimed that we will have just one list.
Your assumption is that there will be an endless amount of working sets and that they are easy to find. While theoretically possible, this will not happen. I think the events of the past two months have shown that the opposite happens and it is actually hard to find working sets, once development digresses too much.
The events since July (which is when I started to have to deal with this problem) show that you need this information *somewhere*. The information was entirely missing! You can't use this as evidence that the only solution is to maintain lists external to the packages. [snip]
If this information is inside the packages itself, the history will be automatically maintained with Subversion and existing releases. History therefore works: if I install Grok 0.11, I would get all dependencies of Grok 0.11 automatically without having to worry about external indexes.
This is the overly simplistic world view that we had about two months ago.
Stephan, that's an overly simplistic description of history. In fact I have been thinking about this problem for a while now. A month and a half ago I wrote this: http://faassen.n--tree.net/blog/view/weblog/2007/09/26/0
Because we are all developing on top of different stacks and people expect to pick and choose, having a common foundation is the only possible way.
Let me give you a dire scenario. [snip dire scenario]
* eggs should state the minimum requirement (I'm C. I depend on B, or I depend on B, newer than version 3) and a suggestion (I depend on version 3.1. I know that, at least, works) * eggs should be able to override suggestions down below in the hierarchy. I depend on C, but I want B version 3.2. Forget what C suggested. * eggs shouldn't be able to say incompatible things. So you can't say, I depend on C, but I want B earlier than version 3, if C already says it needs version 3 or newer.
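The three rules above can be sketched as a tiny resolver. This is only an illustration of the proposal, not anything setuptools does: the `resolve` function, the `(declarer, dependency, minimum, suggestion)` tuple layout, and the rule that the declaration highest in the tree supplies the suggestion are all assumptions made for the example.

```python
class ConflictError(Exception):
    """Raised when a chosen version violates another package's minimum."""


def parse_version(text):
    """Turn '3.2' into (3, 2) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def resolve(declarations):
    """Choose one version per dependency.

    `declarations` is a list of (declarer, dependency, minimum, suggestion)
    tuples ordered from the top of the dependency tree downwards. The first
    suggestion seen for a dependency wins (higher packages override lower
    ones), and the winner must satisfy every declared minimum. Dependencies
    that nobody suggests a version for are left out of the result.
    """
    chosen = {}
    minimums = {}
    for declarer, dep, minimum, suggestion in declarations:
        if suggestion is not None and dep not in chosen:
            chosen[dep] = (suggestion, declarer)
        if minimum is not None:
            minimums.setdefault(dep, []).append((minimum, declarer))

    for dep, (version, declarer) in chosen.items():
        for minimum, who in minimums.get(dep, []):
            if parse_version(version) < parse_version(minimum):
                raise ConflictError(
                    f"{declarer} wants {dep} {version}, "
                    f"but {who} needs {dep} >= {minimum}"
                )
    return {dep: version for dep, (version, _) in chosen.items()}


# C needs B >= 3 and suggests 3.1; D sits above C and prefers B 3.2.
only_c = resolve([("C", "B", "3", "3.1")])
d_overrides = resolve([("D", "B", None, "3.2"), ("C", "B", "3", "3.1")])
print(only_c, d_overrides)

# D asking for a B older than C's minimum is rejected.
try:
    resolve([("D", "B", None, "2.9"), ("C", "B", "3", "3.1")])
    conflict_detected = False
except ConflictError as error:
    conflict_detected = True
    print(error)
```

The sketch ignores upper bounds and transitive lookups; it only shows how a minimum, a suggestion, and an override from higher up could interact.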
Let's say you have a package A-1.0.0. You also have a package B-1.0.0 that depends on A. You suggest fixing versions, so you would write in the dep list of B: 'A >= 1.0.0'.
You now release a new feature version of A, A-1.1.0, that is incompatible with B-1.0.0. So package B-1.0.0 will be broken until you release B-1.0.1 that states 'A >= 1.0.0 and A <= 1.0.99'.
The problem here is that you have to re-release B only for the sake of changing the version requirement.
It doesn't exist if you do what I suggested above. It requires some changes to setuptools, which I have proposed a month and a half ago. [snip]
One promise Jim and others made to me about eggs was that once we had a stable set (which I worked on for the current KGS), releasing would become much simpler, because many packages would not need new releases for very long stretches at a time. With your suggestion, packages will need to be released all the time.
Luckily that was never my suggestion. Regards, Martijn
participants (9)
- Chris Withers
- Dieter Maurer
- Jim Fulton
- Lennart Regebro
- Martijn Faassen
- Roger Ineichen
- Stephan Richter
- Tres Seaver
- Wichert Akkerman