Re: ploneout - Or how using zc.buildout for a common Zope2 project might look like
Whit pointed me to this thread. I won't reply to specifics, but maybe just describe what we're doing (and planning to do), and how workingenv differs from zc.buildout.

So... recently we (The Open Planning Project -- more specifically Rob Miller) decided we need a better way to deploy our specific stack. It was hard, we couldn't tell other people how to do it, and we spent a significant amount of time debugging each other's installations, or chasing down bugs that we ultimately realized were just because of slightly different installations. I'm sure you all understand the motivation; it's similar to what buildout is for, though we were more focused on repeatability as a development tool than as a deployment tool.

I actually tried to do this once before with zc.buildout, but I didn't get far -- probably a result of lack of effort and lack of familiarity with the overall stack. But I also recognize lots of the questions about stuff like the zope.conf file and Data.fs that still seem unresolved. The thing that frustrated me with zc.buildout is that I knew how to do what I wanted to do, but I still felt like I was a long way from being able to do it. And little things, like unhelpful error messages and frustratingly slow performance, really killed my motivation.

After setting that project aside, someone else at TOPP (Luke Tucker) did a buildout for Deliverance, because we needed to build some non-Python libraries and that was a feature of buildout. That did end up working eventually (after considerable effort), but it was not a very satisfying experience, and *using* the buildout was itself a real challenge. Since Deliverance is just a library, to do anything useful I also had to install another package that *uses* that library, and that was surprisingly difficult. Actually developing those libraries was even more frustrating.

So then Rob decided to devote some time to deployment. And because Rob just wants to get this finished, he wrote the whole thing from scratch.
You can see the result of that here: https://svn.openplans.org/svn/topp.deploy/trunk -- it's not a generic setup tool, just what works for us. The end result is something that does everything we want, that we understand, and that we'll understand how to extend. An important difference from zc.buildout is that all the logic and work is in *our* script, not in a framework driven by a config file.

In the future Rob plans to use buildit (http://cvs.plope.com/viewcvs/Packages/buildit/), and one of the motivations there is that it has handy routines to do what topp.deploy already does, but better. It's more like a library and less like a framework. I could make some generalization here but I'll restrain myself.

The deployment script also uses workingenv, partly for a similar reason: it is more library-like. Well, also because I work at TOPP and can easily support their use, which is certainly a nontrivial reason for the choice. Workingenv in this case is basically a tool that provides the isolation that we need to create something repeatable. This is the main feature overlap with buildout. Workingenv is not the framework in which topp.deploy is written, and workingenv is not intended as a framework.

Note also that topp.deploy does not have the full set of features we'll ultimately need. You can't tell it to install another egg, or set up another script, or whatever. And we don't *have* to add those features, because workingenv is compatible with all the other tools. Where "all the other" is mostly easy_install. But someday there will be more, even if the progress is slow.

buildout is basically incompatible with easy_install (the script). And frankly I like easy_install. It's probably 10x faster than buildout. easy_install is what people use in documentation, and its conventions are the ones people know (why does buildout not use "Pkg==version", for instance?).
As for the technical reasons they don't work together:

* workingenv allows and leaves it to setuptools to maintain the package installation database (basically easy-install.pth). This is not a very good database, but eh. buildout doesn't really have a database, but instead just enforces what buildout.cfg indicates.

* workingenv relies on that database to give default versions and to set up the Python path. The fixup it does of installed scripts is fairly minimal, just setting up sys.path enough to force its site.py to get called. buildout enumerates all the activated packages, and ignores easy-install.pth. This is basically what makes it easy_install-incompatible. Plus buildout's desire to own everything and destroy everything it does not own ;)

* As a result buildout supports multiple things in the same buildout that have conflicting version requirements, but where the packages themselves don't realize this (but the deployer does). If the packages know their requirements then setuptools' native machinery allows things to work fine. The solution with workingenv is to create multiple environments. Since the actual building happens higher up (e.g., topp.deploy), there's nothing stopping you from creating multiple environments from one deployment script. Anyway, in summary, the way scripts are generated is one of the major incompatibilities between buildout and workingenv. In effect buildout's jail is too strong for workingenv to penetrate, and buildout doesn't tell anyone else about what it is doing.

* workingenv allows you to change versions without informing or using workingenv. Once you've created the environment you mostly stop using workingenv directly.

* Some see bin/activate as a jail. Both workingenv and buildout are deliberately jail-like. Both Jim and I loathe the non-repeatability of system-wide installations (at least I think I can speak for him on that one point ;). bin/activate lets you into that jail, and lets you work there.
There is no way into a buildout. Frankly this weirds me out, and is a big part of my past frustration with it. Maybe that's because I'm in the relatively uncommon situation that I actually know what's going on under the hood of Python imports and packaging, and so it bothers me that I can't debug things directly.

Anyway, neither requires activation when using scripts generated in the environment. And bin/activate is really just something that sets PYTHONPATH and then does other non-essential things like changing the prompt and $PATH -- I should probably document that more clearly.

Neither can be entirely compatible with a system-wide Python installation, because Python's standard site.py f**ks up the environment really early in the process, and avoiding that isn't all that easy. In some ways virtual-python is the more complete solution to all of this, and sometimes I think I should just use that technique (of a separate Python interpreter, and hence separate prefix) with some of the improvements I could take from workingenv.

Anyway, this is my very long summary of why we aren't using buildout, and are using workingenv.

Cheers,
Ian
On Jan 25, 2007, at 5:09 PM, Ian Bicking wrote:
Whit pointed me to this thread.
Yeah, someone pointed me to it too. :)
I won't reply to specifics, but maybe just describe what we're doing (and planning to do), and how workingenv differs from zc.buildout.
I'll avoid responding to general qualitative statements. ...
I actually tried to do this once before with zc.buildout, but I didn't get far -- probably a result of lack of effort and lack of familiarity with the overall stack. But I also recognize lots of the questions about stuff like the zope.conf file and Data.fs that still seem unresolved.
Certainly when you tried this, buildout was very young and we hadn't written recipes to deal with these issues. We've made a lot of progress since then.
The thing that frustrated me with zc.buildout is that I knew how to do what I wanted to do, but I still felt like I was a long way from being able to do it. And little things, like unhelpful error messages
Yeah, buildout still needs to do a lot better with error messages, although it has probably made some progress since you tried it.
and frustratingly slow performance really killed my motivation.
That has improved quite a bit. ...
And frankly I like easy_install. It's probably 10x faster than buildout.
I doubt that that is true now, although that probably depends on what you are doing. Early versions of buildout did a lot of things inefficiently as I was still learning setuptools. Because of the way that buildout caches index information, I expect that creating a buildout from scratch that used a lot of eggs would be much faster than using easy_install.

One difference, though, is that buildout checks for the most recent compatible versions of all of the eggs it's using every time you run it, whereas, as I understand it, with workingenv you'd just run easy_install manually when you want a new egg. You can bypass the checks by running in offline mode. Then buildout runs very fast. Because of the ability to share eggs across buildouts, it is often possible to run a buildout using lots of eggs in offline mode.

It has been suggested that there should be a mode for buildout that only talks to the network when there isn't a local egg that satisfies a requirement. This would make buildout work more like workingenv when few if any eggs are actually needed.
easy_install is what people use in documentation, and its conventions are the ones people know (why does buildout not use "Pkg==version", for instance?).
It does. When specifying eggs, you use standard setuptools requirement syntax.
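For example, something along these lines (the part and package names here are made up for illustration) pins eggs with ordinary setuptools specifiers:

```ini
[buildout]
parts = app

[app]
recipe = zc.recipe.egg
eggs =
    SomePackage ==1.0
    AnotherPackage >=0.5
```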
As for the technical reasons they don't work together:
* workingenv allows and leaves it to setuptools to maintain the package installation database (basically easy-install.pth). This is not a very good database, but eh. buildout doesn't really have a database, but instead just enforces what buildout.cfg indicates.
buildout uses the buildout configuration file to store what you want. It uses .installed.cfg to capture what you have. These are both databases of sorts.
* workingenv relies on that database to give default versions and to set up the Python path. The fixup it does of installed scripts is fairly minimal, just setting up sys.path enough to force its site.py to get called. buildout enumerates all the activated packages, and ignores easy-install.pth. This is basically what makes it easy_install-incompatible.
Yup. I wanted something far more static and predictable for scripts generated by buildout.
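Roughly, a generated script pins the whole path statically at the top -- a simplified sketch, with placeholder egg paths rather than buildout's literal output:

```python
# Simplified sketch of a buildout-generated script header: every egg
# the script needs is listed explicitly, in a fixed order, so the
# script depends on neither easy-install.pth nor $PYTHONPATH.
import sys

egg_paths = [
    '/sample-buildout/eggs/somepackage-1.0-py2.4.egg',  # hypothetical paths
    '/sample-buildout/eggs/dependency-2.1-py2.4.egg',
]
sys.path[0:0] = egg_paths  # prepend, so these win over anything else

# A real generated script would then import and call the entry point.
```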
Plus buildout's desire to own everything and destroy everything it does not own ;)
I'm not aware that it destroys anything. Could you be more specific?
* As a result buildout supports multiple things in the same buildout that have conflicting version requirements, but where the packages themselves don't realize this (but the deployer does). If the packages know their requirements then setuptools' native machinery allows things to work fine.
Yes. I expect that usually, packages won't be very specific. The buildout configuration file provides a place to be specific.
* Some see bin/activate as a jail. Both workingenv and buildout are deliberately jail-like. Both Jim and I loathe the non-repeatability of system-wide installations (at least I think I can speak for him on that one point ;). bin/activate lets you into that jail, and lets you work there. There is no way into a buildout.
I'm not familiar with bin/activate, but it sounds like an interpreter script created with buildout.
Frankly this weirds me out, and is a big part of my past frustration with it. Maybe that's because I'm in the relatively uncommon situation that I actually know what's going on under the hood of Python imports and packaging, and so it bothers me that I can't debug things directly. Anyway, neither requires activation when using scripts generated in the environment. And bin/activate is really just something that sets PYTHONPATH and then does other non-essential things like changing the prompt and $PATH -- I should probably document that more clearly.
Sounds a lot like a buildout interpreter script.
Neither can be entirely compatible with a system-wide Python installation, because Python's standard site.py f**ks up the environment really early in the process, and avoiding that isn't all that easy.
This reminds me of a place where buildout is looser than workingenv. buildout doesn't try to disable anything in the system Python. It just augments it. I always use a clean Python, so avoiding customizations in the Python I use isn't a problem. If I wanted to take advantage of something in a system Python, as I occasionally do, I can do that with buildout.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
Jim Fulton wrote:
I actually tried to do this once before with zc.buildout, but I didn't get far -- probably a result of lack of effort and lack of familiarity with the overall stack. But I also recognize lots of the questions about stuff like the zope.conf file and Data.fs that still seem unresolved.
Certainly when you tried this, buildout was very young and we hadn't written recipes to deal with these issues. We've made a lot of progress since then.
Well, the last time I really used it was early December, and it still felt slow and awkward to me at the time, with several funny quirks.
And frankly I like easy_install. It's probably 10x faster than buildout.
I doubt that that is true now. Although that probably depends on what you are doing. Early versions of buildout did a lot of things inefficiently as I was still learning setuptools. Because of the way that buildout caches index information, I expect that creating a buildout from scratch that used a lot of eggs would be much faster than using easy_install. One difference though is that buildout checks for the most recent compatible versions of all of the eggs it's using every time you run it, whereas, as I understand it, with workingenv, you'd just run easy_install manually when you want a new egg.
Correct. The basic process with workingenv is:

1. Set it up.
2. Start installing stuff.
3. Try running stuff.
4. Realize you got it wrong, missed something, or want to do more development; return to 2.

I actually find myself doing the 2-4 loop pretty often, both in development and when first deploying something. Just the amount of time to do "bin/buildout -h" was substantial (though I don't really understand why, except that buildout seemed to be working way too hard to update itself).
You can bypass the checks by running in offline mode. Then buildout runs very fast. Because of the ability to share eggs across buildouts, it is often possible to run a buildout using lots of eggs in offline mode.
It has been suggested that there should be a mode for buildout that only talks to the network when there isn't a local egg that satisfies a requirement. This would make buildout work more like workingenv when few if any eggs are actually needed.
Yes; more like easy_install does as well, actually. Though the way easy_install works is hardly intuitive; I find myself frequently saying "yes, you installed it, but did you -U install it?"
As for the technical reasons they don't work together:
* workingenv allows and leaves it to setuptools to maintain the package installation database (basically easy-install.pth). This is not a very good database, but eh. buildout doesn't really have a database, but instead just enforces what buildout.cfg indicates.
buildout uses the buildout configuration file to store what you want. It uses .installed.cfg to capture what you have. These are both databases of sorts.
* workingenv relies on that database to give default versions and to setup the Python path. The fixup it does of installed scripts is fairly minimal, just setting up sys.path enough to force its site.py to get called. buildout enumerates all the activated packages, and ignores easy-install.pth. This is basically what makes it easy_install-incompatible.
Yup. I wanted something far more static and predictable for scripts generated by buildout.
Plus buildout's desire to own everything and destroy everything it does not own ;)
I'm not aware that it destroys anything. Could you be more specific?
Well, it owns parts, and the recipes control that. Doesn't it also delete and reinstall there? I'm unclear on how it treats each area of the buildout. Simply making the file layout a bit more conventional, and describing anything non-obvious, would make buildout feel a lot more comfortable to the new user.
* As a result buildout supports multiple things in the same buildout that have conflicting version requirements, but where the packages themselves don't realize this (but the deployer does). If the packages know their requirements then setuptools' native machinery allows things to work fine.
Yes. I expect that usually, packages won't be very specific. The buildout configuration file provides a place to be specific.
workingenv allows this, insofar as you can be specific while installing things, and with the requirements file. But it doesn't make the individual scripts very specific: if, for instance, appfoo requires libX>1.0 and appbar requires libX>1.1, but you actually want appfoo to use libX==1.0 and appbar to use libX==1.1, buildout can install them side by side in the same buildout. That's the only case where buildout seems to be able to express something workingenv can't.
* Some see bin/activate as a jail. Both workingenv and buildout are deliberately jail-like. Both Jim and I loathe the non-repeatability of system-wide installations (at least I think I can speak for him on that one point ;). bin/activate lets you into that jail, and lets you work there. There is no way into a buildout.
I'm not familiar with bin/activate, but it sounds like an interpreter script created with buildout.
It's created by workingenv, and you have to source it because basically its only function is to add the workingenv/lib/pythonX.Y to $PYTHONPATH. Adding that path to $PYTHONPATH is the only thing that really "activates" a workingenv.
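To illustrate what that amounts to (done from Python for a child process here, since activate itself is sourced by the shell; the path is made up):

```python
# "Activation" just means the environment's lib dir is on $PYTHONPATH;
# any Python started afterwards picks it up on sys.path.
import os
import subprocess
import sys

lib = "/path/to/workingenv/lib/python2.4"  # hypothetical environment path
env = dict(os.environ)
env["PYTHONPATH"] = lib + os.pathsep + env.get("PYTHONPATH", "")

# The child interpreter now sees the environment's lib dir on sys.path.
out = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(%r in sys.path)" % lib],
    env=env)
```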
Frankly this weirds me out, and is a big part of my past frustration with it. Maybe that's because I'm in the relatively uncommon situation that I actually know what's going on under the hood of Python imports and packaging, and so it bothers me that I can't debug things directly. Anyway, neither requires activation when using scripts generated in the environment. And bin/activate is really just something that sets PYTHONPATH and then does other non-essential things like changing the prompt and $PATH -- I should probably document that more clearly.
Sounds a lot like a buildout interpreter script.
Once you've changed $PYTHONPATH any Python script will notice the change. This can actually be a bit awkward if you have fully isolated the working environment, as it means a script may not see the global Python paths. But if you don't isolate the environment, the script can see the workingenv path in addition to its own.
Neither can be entirely compatible with a system-wide Python installation, because Python's standard site.py f**ks up the environment really early in the process, and avoiding that isn't all that easy.
This reminds me of a place where buildout is looser than workingenv. buildout doesn't try to disable anything in the system Python. It just augments it. I always use a clean Python, so avoiding customizations in the Python I use isn't a problem. If I wanted to take advantage of something in a system Python, as I occasionally do, I can do that with buildout.
I find the isolation useful when testing things for release; I can be sure that I haven't been using any packages that I don't explicitly include in the egg requirements or instructions. But it can be annoying in other cases, like when there's a library that doesn't install cleanly (of which there are still quite a few).

Anyway, if you do want to include the global packages, --site-packages will change your workingenv to do so. It could be argued that workingenv's default should be to include site-packages. Another option would be to have a tool that allows you to easily include something from the system Python (probably just a tool to manage a custom .pth file, which works even when setuptools' fairly heroic attempts to fix broken setup.py's don't work).

--
Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org
Ian Bicking wrote:
Jim Fulton wrote:
I actually tried to do this once before with zc.buildout, but I didn't get far -- probably a result of lack of effort and lack of familiarity with the overall stack. But I also recognize lots of the questions about stuff like the zope.conf file and Data.fs that still seem unresolved.
Certainly when you tried this, buildout was very young and we hadn't written recipes to deal with these issues. We've made a lot of progress since then.
Well, the last time I really used it was early December, and it still felt slow and awkward to me at the time, with several funny quirks.
Hm, it's a bit hard to respond to "awkward" and "quirks". I'll respond to the performance issues a bit below.
And frankly I like easy_install. It's probably 10x faster than buildout.
I doubt that that is true now. Although that probably depends on what you are doing. Early versions of buildout did a lot of things inefficiently as I was still learning setuptools. Because of the way that buildout caches index information, I expect that creating a buildout from scratch that used a lot of eggs would be much faster than using easy_install. One difference though is that buildout checks for the most recent compatible versions of all of the eggs it's using every time you run it, whereas, as I understand it, with workingenv, you'd just run easy_install manually when you want a new egg.
Correct. The basic process with workingenv is:
1. Set it up.
2. Start installing stuff.
3. Try running stuff.
4. Realize you got it wrong, missed something, or want to do more development; return to 2.
I actually find myself doing the 2-4 loop pretty often, both in development and when first deploying something. Just the amount of time to do "bin/buildout -h" was substantial (though I don't really understand why, except that buildout seemed to be working way too hard to update itself).
Ah yes. This is a good point. By default, buildout checks for newer versions of distributions for which there are open-ended requirements. This can take frustratingly long -- especially because pypi is so darn slow.

One advantage of buildout over easy_install (and I assume workingenv) is that the eggs you get are deterministic by default. They are always the newest versions that satisfy your requirements. With easy_install, you get the most recently installed eggs that satisfy your requirements. This means that the eggs you have depend a lot on when you installed them. To achieve this, buildout looks for newer distributions when a requirement doesn't have an upper bound or when the upper bound isn't satisfied by an already-installed egg.

I really should add a quick mode that skips looking for newer versions when requirements are met by what's already installed. This would make the iterative style you describe go much faster. I would certainly appreciate this myself. I will do this soon.
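As a toy sketch of that decision rule (not buildout's actual code, just the policy described above):

```python
# Hit the index when a requirement is open-ended, or when nothing
# already installed satisfies it; otherwise stay local and fast.
def needs_index_lookup(has_upper_bound, installed_satisfies):
    if not has_upper_bound:
        # Open-ended requirement: always look for something newer.
        return True
    # Bounded requirement: only go to the network if it's unmet locally.
    return not installed_satisfies
```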
You can bypass the checks by running in offline mode. Then buildout runs very fast. Because of the ability to share eggs across buildouts, it is often possible to run a buildout using lots of eggs in offline mode.
It has been suggested that there should be a mode for buildout that only talks to the network when there isn't a local egg that satisfies a requirement. This would make buildout work more like workingenv when few if any eggs are actually needed.
Yes; more like easy_install does as well, actually. Though the way easy_install works is hardly intuitive; I find myself frequently saying "yes, you installed it, but did you -U install it?"
In particular, upgrading a distribution doesn't upgrade its dependencies. This makes it harder to control which distributions are used in an environment. With easy_install, even though distributions are automatically included by virtue of being dependencies, they aren't automatically updated. There's no way to say "I want the most recent version of everything". I wanted to make it easier to get the most recent version of the distributions used, which is why buildout has a different policy for looking up distributions. ...
Plus buildout's desire to own everything and destroy everything it does not own ;)
I'm not aware that it destroys anything. Could you be more specific?
Well, it owns parts, and the recipes control that. Doesn't it also delete and reinstall there?
Yes. Buildout tries to make a buildout reflect its specification. This is an important feature. It uninstalls as well as installs. But it isn't controlling anything it wasn't asked to control.
How it treats each area of the buildout I'm unclear.
I can't help that. I've documented how this works in great detail.
Simply making the file layout a bit more conventional, and describing anything non-obvious, would make buildout feel a lot more comfortable to the new user.
What is conventional? Python uses different layouts on different systems. The Unix layout and Windows layout are quite different. When I came up with the layout for Zope installations, I tried to mimic the layout that Python used on Unix systems at the time, and then that layout changed. We were stuck with lib/python even though we never had anything else in lib. I chose a shallow layout in buildout following "flat is better than nested".
* As a result buildout supports multiple things in the same buildout that have conflicting version requirements, but where the packages themselves don't realize this (but the deployer does). If the packages know their requirements then setuptools' native machinery allows things to work fine.
Yes. I expect that usually, packages won't be very specific. The buildout configuration file provides a place to be specific.
workingenv allows this, insofar as you can be specific while installing things, and with the requirements file. But it doesn't make the individual scripts very specific: if, for instance, appfoo requires libX>1.0 and appbar requires libX>1.1, but you actually want appfoo to use libX==1.0 and appbar to use libX==1.1, buildout can install them side by side in the same buildout. That's the only case where buildout seems to be able to express something workingenv can't.
In practice, this can be very important. At least for us at ZC.
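For instance, a configuration along these lines (part and package names invented for illustration) pins each part's versions independently, even when they conflict:

```ini
[buildout]
parts = appfoo appbar

[appfoo]
recipe = zc.recipe.egg
eggs =
    appfoo
    libX ==1.0

[appbar]
recipe = zc.recipe.egg
eggs =
    appbar
    libX ==1.1
```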
* Some see bin/activate as a jail. Both workingenv and buildout are deliberately jail-like. Both Jim and I loathe the non-repeatability of system-wide installations (at least I think I can speak for him on that one point ;). bin/activate lets you into that jail, and lets you work there. There is no way into a buildout.
I'm not familiar with bin/activate, but it sounds like an interpreter script created with buildout.
It's created by workingenv, and you have to source it because basically its only function is to add the workingenv/lib/pythonX.Y to $PYTHONPATH. Adding that path to $PYTHONPATH is the only thing that really "activates" a workingenv.
Ah, so it modifies the user's environment. I think that's a reasonable approach, although not one that I care for myself. To each his own. The important thing here, IMO, is that both activate and buildout interpreter scripts let you get a Python interactive session or run scripts with a controlled path. ...
Neither can be entirely compatible with a system-wide Python installation, because Python's standard site.py f**ks up the environment really early in the process, and avoiding that isn't all that easy.
This reminds me of a place where buildout is looser than workingenv. buildout doesn't try to disable anything in the system Python. It just augments it. I always use a clean Python, so avoiding customizations in the Python I use isn't a problem. If I wanted to take advantage of something in a system Python, as I occasionally do, I can do that with buildout.
I find the isolation useful when testing things for release;
Yup. I do the same things by always using clean Python installs. I *never* use a system Python for development. I always use a Python that I build myself from sources. Your approach is certainly a valid alternative.
I can be sure that I haven't been using any packages that I don't explicitly include in the egg requirements or instructions.
Yup
But it can be annoying in other cases, like when there's a library that doesn't install cleanly (of which there's still quite a few).
Yes
Anyway, if you do want to include the global packages, --site-packages will change your workingenv to do so.
Cool.
It could be argued that workingenv's default should be to include site-packages. Another option would be to have a tool that allows you to easily include something from the system Python (probably just a tool to manage a custom .pth file, which works even when setuptools' fairly heroic attempts to fix broken setup.py's don't work).
<shrug> My remark was not meant to criticize or to suggest a different policy.

Jim