[Zope] Zope needs this (and Dynamo has it)

Mon, 6 Mar 2000 03:02:10 +0100

And here follows part two of my follow-up. (I'm all thumbs today -- why
is Alt+S so damn accessible on a PC keyboard? ;-)

> From: Thomas Stenhaug [mailto:thomas@src.no]
> Sent: Sunday, March 05, 2000 2:34 PM
> To: Alexander Staubo
> Cc: Zope Mailing List (E-mail)
> Subject: Re: [Zope] Zope needs this (and Dynamo has it)
>
> | This situation has nothing to do with open vs. closed, either. Just
> | like Zope we're talking about proprietary (as opposed to standarized
> | -- DTML may be open, but it's still proprietary)
> 
> I don't quite grok what "the situation" is referring to, 

Read the preceding paragraph, about Java and the "cost of training a
person in the arts of Zope".

> but I think
> you're wrong here, or at least mistaken regarding the meaning of
> "proprietary".  Proprietary, according to Merriam-Webster, means this:
> 
> [cut from Merriam-Webster]
> Function: adjective
> Etymology: Late Latin proprietarius, from Latin proprietas property --
>            more at PROPERTY
> Date: 1589
> 1 : of, relating to, or characteristic of a proprietor 
> <proprietary rights>
> 2 : used, made, or marketed by one having the exclusive legal right 
>     <a proprietary process>
> 3 : privately owned and managed and run as a profit-making
>     organization  <a proprietary clinic> 
> [/]

I mean proprietary as being controlled by a private party -- owned, in a
very liberal sense -- as opposed to being standardized. XML is not
proprietary. DTML and Java are.

I'm not as knowledgeable about IP law as I would like to be, so I can't
really defend my use of this word or say whether DTML is the
intellectual property of Digital Creations. Perhaps somebody else more
suited to the task could clarify.

> [...]
> | and Zope, despite its many powerful features, is not big enough for
> | truly big things.
> |
> | This is a two-part problem. The first part is that Zope's bullshit
> | factor is too low; 
> [...]
> | We tech people know better, we know a shiny surface tells us nothing
> | about the engine underneath, but we tech people are not holding the
> | crucial component -- those big brown bags of money.
> 
> I'm not sure if I agree it's a problem, then, that Zope lacks in the
> big-department. :-)

If the suits in management (assuming you work in such a place) tell you
to go learn Dynamo because that's what they want to use, wouldn't you
say it's a problem?

> | [under Zope's surface] lies a powerful engine with scalability
> | problems.
> 
> What part of Zope's engine are you referring to here?  Do you mean
> Python in general, are some specific part?  Like ZODB?

I definitely believe that Python is Zope's largest problem right now.
It's also, from a technical point of view, one of its biggest assets --
Zope is becoming the killer application of Python, and might eventually
become the driving force to evolve Python, which is, I believe,
suffering from a drought in the contributor department. Skilled
developers aren't flocking to Python to make it faster. Not right now,
anyway.

My hope is that Zope will change this, because Python has possibly one
of the slowest interpreters on earth. Python zealots -- and there is
such a beast -- will tell you that Python's performance is less of a
matter because what you sacrifice in speed you gain in power, i.e.,
computational expressiveness through a high-level, dynamic language.

For an example of a good, Python-like, dynamic language that is also
very fast, take a look at Dylan. In current implementations, Dylan is
not interpreted but rather compiled to native code, but in theory there
is nothing stopping anyone from writing a highly efficient Dylan
interpreter. Dylan's magic comes from its use of type hints, a system
which I had recognized as likely pivotal to improving Python's
interpreter long before I knew of Dylan. In Dylan, specifying a
variable's type is optional. When omitted, the Dylan compiler will
usually be able to infer the type from what you do with the variable
(i.e., it sees you're assigning an integer to it; hence it can assume
that, for at least a stretch of execution, the variable remains an
integer). When you explicitly declare a variable's type, Dylan applies
the necessary code optimizations (such as using tight integer maths or
string copying) as well as static type checking. In Python, variables
can hold virtually anything, and this incurs a huge number of runtime
type checks.
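
To make the point about runtime checks concrete, here is a minimal
sketch for a current Python 3 interpreter. The function name `add` is
purely illustrative. It shows that one compiled function serves
integers, strings and lists alike, so the interpreter has to work out
the operand types every time the addition runs. The exact opcode name
varies between interpreter versions, so the sketch only looks for a
generic "BINARY" opcode.

```python
import dis


def add(a, b):
    # Nothing here says what a and b are; the interpreter has to
    # inspect both operands each time this line executes.
    return a + b


# The same compiled code object handles three unrelated types.
results = [add(1, 2), add("a", "b"), add([1], [2])]

# List the opcodes of the compiled function. The addition shows up as a
# single generic binary opcode (its exact name depends on the version).
opnames = [ins.opname for ins in dis.get_instructions(add)]
generic_ops = [name for name in opnames if name.startswith("BINARY")]

print(results)
print(generic_ops)
```

A compiler that knew both operands were integers could emit a plain
machine add instead, which is the kind of optimization the Dylan type
declarations make possible.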

Just imagine -- while Zope is running, thousands and thousands of the
same little checks are performed continuously, over and over again,
adding up to a considerable overhead.

Python's biggest overhead comes from its dynamism -- the dynamic typing,
as explained, combined with the object allocation system: most things
are objects, so a lot of objects are constantly spawned and torn down,
often in rapid succession because the compiler/interpreter can make few
hard analyses of when stuff is needed. Python also has a high
function-call overhead.
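
A rough way to see the function-call overhead for yourself is to time
the same arithmetic done inline and done through a small function. This
is only a sketch for a current Python 3 interpreter; the names
`square`, `inline_loop` and `call_loop` are made up for the
illustration, and the timings will differ from machine to machine.

```python
import timeit


def square(x):
    return x * x


def inline_loop(n):
    # The multiplication is done directly in the loop body.
    total = 0
    for i in range(n):
        total += i * i
    return total


def call_loop(n):
    # Same arithmetic, but every iteration pays for one function call.
    total = 0
    for i in range(n):
        total += square(i)
    return total


N = 10000
inline_time = timeit.timeit(lambda: inline_loop(N), number=20)
call_time = timeit.timeit(lambda: call_loop(N), number=20)

print("inline: %.4f s" % inline_time)
print("calls:  %.4f s" % call_time)
```

Both loops compute the same sum, so any difference in the two timings
is the price of the extra call per iteration.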

> | So far the only way to scale properly with Zope is a combination of
> | a load-balancing clustering system (eg., TurboLinux' TurboCluster
> | Server, which I can personally vouch for -- brilliant stuff -- or
> | Linux Virtual Server) with Digital Creations' ZEO, of which I've
> | only read about.
> 
> I actually thought of ZEO as a very nice way to scale Zope.

I agree.

> | There are Zope sites out there that are suffering scalability
> | problems (I won't mention specifics, other than that the project I'm
> | involved in is one of them).
> 
> It would be very much appreciated if you could share some specifics.

No. Ask around, do some research. Ask Digital Creations -- they probably
know of a few, although I'm guessing that they might feel it would hurt
their reputation to tell. Many companies using Zope might not want to
tell you because it would hurt their position, especially if they're
public companies.

Here's one specific case that I'm personally struggling with. I have a
page that returns a large number of dynamic database results in a
relatively simple table. The results are first munged by a small Python
script. This stuff is slow. Internally, as far as I can determine
(through shallow study of the database code), Zope allocates a row
object for every row you access -- even if the row has been returned
earlier on. Zope has a 1-row cache for cases when you repeatedly ask for
the same row in succession. Zope also has a row cache that is useful for
static result sets. Yet, for dynamic result sets, the processing
overhead is huge. The overhead isn't much higher than the overhead in
other areas of Zope, but it's especially painful here, because it
affects the generation time of a single page, whereas other kinds of
overhead are more evenly distributed across multiple pages and multiple
users.
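
The behaviour described above can be modelled with a toy sketch. To be
clear, this is not Zope's database code: the classes `Row`,
`UncachedResults` and `OneRowCacheResults` are hypothetical and exist
only to count how many row objects get allocated under different access
patterns.

```python
class Row:
    """Hypothetical row wrapper that counts how many instances exist."""
    created = 0

    def __init__(self, values):
        Row.created += 1
        self.values = values


class UncachedResults:
    """Allocates a fresh Row on every access, even for repeated rows."""

    def __init__(self, raw_rows):
        self._raw = raw_rows

    def __getitem__(self, index):
        return Row(self._raw[index])


class OneRowCacheResults(UncachedResults):
    """Remembers only the most recently accessed row."""

    def __init__(self, raw_rows):
        super().__init__(raw_rows)
        self._last_index = None
        self._last_row = None

    def __getitem__(self, index):
        if index != self._last_index:
            self._last_row = super().__getitem__(index)
            self._last_index = index
        return self._last_row


def count_allocations(results, access_pattern):
    # Reset the counter, replay the accesses, report the allocations.
    Row.created = 0
    for index in access_pattern:
        results[index]
    return Row.created


raw = [(1, "a"), (2, "b"), (3, "c")]
repeated = [0, 0, 0, 1, 1]      # same row asked for in succession
alternating = [0, 1, 0, 1, 0]   # rows revisited, but never back to back

print(count_allocations(UncachedResults(raw), repeated))        # 5
print(count_allocations(OneRowCacheResults(raw), repeated))     # 2
print(count_allocations(OneRowCacheResults(raw), alternating))  # 5
```

In this model a one-row cache helps only when the same row is requested
back to back; as soon as the access pattern jumps around, every access
allocates again.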

> [...]
> | As far as pollution goes, assuming Zope exports the appropriate data
> | -- in whatever form -- then support for
> | performance/monitoring/management APIs such as SNMP, PCP, WBEM, and
> | to a lesser extent PDH, aren't difficult to isolate into add-on
> | products.
> 
> You are right about that, and I think that's a better way to provide
> such support, rather then building specific support for any of them
> into Zope.

Exactly. However, the necessary counters must be built into Zope at some
point, and exposed through a common (XML? Python/XML-RPC?) API. Then we
could attach publishers for SNMP, HTML, XML, whatever.
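
As a sketch of what that split could look like: a small counter
registry with one common snapshot call, and separate publishers that
turn the snapshot into XML or HTML. Nothing like this exists in Zope as
far as I know; `CounterRegistry`, `publish_xml`, `publish_html` and the
counter names are all hypothetical, and the code assumes a current
Python 3 interpreter.

```python
import html
import xml.etree.ElementTree as ET


class CounterRegistry:
    """Hypothetical in-process store for named counters."""

    def __init__(self):
        self._counters = {}

    def increment(self, name, amount=1):
        self._counters[name] = self._counters.get(name, 0) + amount

    def snapshot(self):
        # Return a copy so publishers cannot modify the live counters.
        return dict(self._counters)


def publish_xml(snapshot):
    # One <counter> element per counter, sorted by name.
    root = ET.Element("counters")
    for name in sorted(snapshot):
        ET.SubElement(root, "counter", name=name).text = str(snapshot[name])
    return ET.tostring(root, encoding="unicode")


def publish_html(snapshot):
    # One table row per counter, sorted by name.
    rows = "".join(
        "<tr><td>%s</td><td>%d</td></tr>" % (html.escape(name), snapshot[name])
        for name in sorted(snapshot)
    )
    return "<table>%s</table>" % rows


registry = CounterRegistry()
registry.increment("requests")
registry.increment("requests")
registry.increment("db_queries", 3)

print(publish_xml(registry.snapshot()))
print(publish_html(registry.snapshot()))
```

The publishers see only the snapshot dictionary, so an SNMP or XML-RPC
publisher could be added as a separate product without touching the
code that does the counting.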

Which reminds me. Does Zope have a mechanism for extending the
Control_Panel namespace other than through the Product sub-folder? I'm
too lazy to check, and I don't remember if this was ever discussed.

-- 
Alexander Staubo         http://alex.mop.no/
"`Ford, you're turning into a penguin. Stop it.'"
--Douglas Adams, _The Hitchhiker's Guide to the Galaxy_