Bjorn Stabell wrote at 2003-10-24 10:52 +0800:
From: Dieter Maurer [mailto:dieter@handshake.de] [...] They can be far apart. Although, when your pickle is several MB your object is not several bytes and vice versa.
Well, in that case it might be useful for such a ZODB admin tool to show both sizes. It could be a combined cache analysis and ZODB browsing tool.
Python does not know how much memory is used by a Python object. Python was designed to hide such implementation details. More importantly, even if it knew the size of a single object, this would be (almost) irrelevant for your purpose. You want to know the size occupied by a complete persistent object (including its non-persistent subobjects but excluding persistent subobjects). The easiest way to get this information would be to extend pickling and accumulate the size during object load. However, Python is far too flexible (it allows C extensions to control pickling of their instances) for that to be easy. Lots of C extensions would have to be modified. Forget about this approach. It might come with Python 3 or Python 4, but it is unlikely. Python is a high level language hiding memory usage; you want precise information about memory usage. I doubt you will find enough arguments and use cases to get this into Python.
[...]
I do not yet understand why you would want such a thing. Can you provide use cases?
I guess I want something very low-level, for use in debugging strange behavior, and for help in understanding how Zope apps are built. The ZMI works with object interfaces, which is useful, but requires that each object supports an interface (ObjectManager etc). Many objects don't, especially not when you're developing them :) For this reason--no access to data except through application-provided interfaces--ZODB feels much like a "black box" to me.
Do you really care about the size of objects in memory? We no longer live in 1980, when memory was a scarce resource. If you care about more Pythonic aspects (attributes, methods, size of dictionaries, lists, ...), then you can read a "HowTo" about "Debugging Zope" (--> "Zope.org"). You can access each object in the ZODB and use Python's inspection facilities ("--> Python Library Reference") to analyse its attributes and methods and call its methods interactively (to find out about application specific things).
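To illustrate what inspecting an object interactively looks like, here is a minimal sketch using only the standard library. The "Document" class and the "describe" helper are hypothetical stand-ins, not part of Zope; in a real debugging session the object would be fetched from the ZODB instead of being constructed by hand.

```python
import inspect


class Document:
    """Hypothetical stand-in for an object fetched from the ZODB."""

    def __init__(self, title, body):
        self.title = title
        self.body = body

    def word_count(self):
        return len(self.body.split())


def describe(obj):
    """Return (instance attribute names, public method names) for obj."""
    # vars() exposes the instance dictionary, i.e. the per-instance state.
    attributes = sorted(vars(obj))
    # getmembers() with ismethod lists bound methods; skip private names.
    methods = sorted(
        name
        for name, _member in inspect.getmembers(obj, inspect.ismethod)
        if not name.startswith("_")
    )
    return attributes, methods


doc = Document("Intro", "a few words here")
attributes, methods = describe(doc)
print(attributes)
print(methods)
# Methods can be called interactively to learn application specific things.
print(doc.word_count())
```

The same two calls, vars() and inspect.getmembers(), work on any Python object, which is why no application-provided interface is needed for this kind of look.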
Example use cases include:
- SPACE AND MEMORY OPTIMIZATION. Reducing ZODB size, and Zope memory usage. I've got some huge objects in my database, which ones are they? Why are they huge? If I know this, I can optimize.
I had a similar problem (the ZODB grew far too fast) and I wanted to understand why. I extended Zope's "Undo" information to include the transaction size. This allowed me to see precisely which transactions were larger than expected. I extended the "fsdump" utility to include the (pickle) sizes of the object records contained in a transaction and to restrict the range of dumped transactions. This has been enough to analyse the problem: ZCatalog's Metadata records caused a transaction size to grow from an expected few hundred bytes to about 500 kB. You can use the same approach to analyse ZODB size problems. Note that I do not care about memory size. Usually, a persistent object uses memory of the same order of magnitude as its pickle size (there are exceptions, but they can usually be ignored). With a GB of RAM costing about 200 USD, we can ignore the differences between memory size and pickle size.
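The idea of ranking objects by pickle size can be sketched with the standard "pickle" module alone. This is not the extended "fsdump" described above: the helper names and the sample records are hypothetical, and a plain pickle.dumps() does not reproduce how the ZODB splits persistent subobjects into separate records. It only shows the principle of using pickle size as a proxy for object size.

```python
import pickle


def pickle_size(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Return the number of bytes in the pickle of obj."""
    return len(pickle.dumps(obj, protocol))


def largest_by_pickle_size(named_objects, top=3):
    """Rank (name, obj) pairs by pickle size, largest first."""
    sizes = [(pickle_size(obj), name) for name, obj in named_objects]
    sizes.sort(reverse=True)
    return sizes[:top]


# Hypothetical stand-ins for object records found in a transaction.
records = [
    ("small_doc", {"title": "hello"}),
    ("catalog_metadata", {"rows": [("x" * 100, i) for i in range(5000)]}),
    ("medium_doc", {"body": "y" * 2000}),
]

ranking = largest_by_pickle_size(records)
for size, name in ranking:
    print(f"{name}: {size} bytes")
```

Sorting by size puts the unexpectedly large record at the top, which is usually all that is needed to know where to look next.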
(Related: My Zope uses a lot of memory; why? Objects of which class, and in which location, use the most memory? Why were they loaded? At the ZMI level, you don't want to know if objects are loaded/ghosted.)
When your Zope uses a lot of memory, then either you have large numbers of large persistent objects in the ZODB caches (see "Control_Panel --> Database Management" for information about your ZODB caches) or you have memory leaks. Large persistent objects are revealed by large transactions (when written). You can use the above-mentioned techniques to analyse transaction sizes and pickle sizes. Memory leaks can be spotted via "Control_Panel --> Debug Information --> Reference Counts" or Shane's "LeakFinder" product.
- UNDERSTANDING CONTENT TYPES AND TOOLS. What is the difference in the data structure of PloneDocument and CMFDocument?
Use a debugger, "DocFinder", or the sources.
- DEBUGGING CLASS MIGRATION PROBLEMS. Some older objects are exhibiting strange behavior; what is the difference in data structure between them and the new objects? Which objects of class X don't have attribute A set? (Except for the obvious Zope/CMF/Plone/Product upgrades, these kinds of problems happen a lot during development each time a class is changed without recreating its objects.)
Use a debugger or the sources.
- VERIFYING AND IMPROVING DATA STORAGE SCHEMAS. Does class X really store the attributes I thought it would?
In Python (at least until 2.2), attributes are usually stored in instances, not in classes. This makes such an analysis quite difficult. You have to look at the sources or maybe lots of instances and perform (unsafe) inductive reasoning.
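Looking at lots of instances can be partly automated. The following is a minimal sketch, assuming the instances have already been collected into a list; the class "X" and both helper functions are hypothetical. It counts which attribute names occur in the instance dictionaries and lists the instances that lack a given attribute, which also covers the class migration question. The result is still only inductive evidence about the instances examined.

```python
from collections import Counter


def attribute_census(instances):
    """Count how many instances store each attribute name."""
    counts = Counter()
    for obj in instances:
        # Only the instance dictionary is examined, not class defaults.
        counts.update(vars(obj).keys())
    return counts


def missing_attribute(instances, name):
    """Return the instances whose own dictionary lacks the attribute."""
    return [obj for obj in instances if name not in vars(obj)]


class X:
    """Hypothetical class whose older instances lack some attributes."""

    def __init__(self, ident, **attrs):
        self.ident = ident
        self.__dict__.update(attrs)


objects = [X(1, A="old"), X(2), X(3, A="new", B=True)]

census = attribute_census(objects)
lacking = [obj.ident for obj in missing_attribute(objects, "A")]
print(dict(census))
print(lacking)
```

An attribute that appears in only some of the instances is a hint that the class changed after the older objects were created.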
- CHANGING A PROPERTY THAT DOESN'T HAVE A MANAGEMENT INTERFACE. For debugging, testing, or migration purposes, or for just fixing a one-off bug.
Use a debugger.
The truth is, I think this kind of tool will "open my eyes" to what's in the ZODB and take much of the guesswork out of developing with Zope, similar to the eye-opening experience an RDBMS admin tool is.
I do *not* think that the admin tool gives you precise information about memory size. This is too low a level, even for a relational database. Apart from that (memory size), you can already use a debugger to get the information you want. You can use Python's inspection facility to find out the relevant information and implement a UI for this. My "DocFinder" product does this for class attributes and methods (but not instance attributes). Look at it when you need an extension for instance attributes. <http://www.dieter.handshake.de/pyprojects/zope>
I often write scripts to analyze object structure and do simple changes; I wish, however, that there were an admin environment that provided:
- A database browsing / object inspector tool, taking away the need to write scripts for browsing/changing objects in most cases, and encouraging people to analyze and understand the database structure
I do not write scripts for this but use the debugger interactively.
- A query language that would make it even easier to write scripts (+ ZODB index support?)
Use ZCatalog and its query language.
- A place to store one-off scripts so they don't get mixed up with the application
A "Folder", when you really need the scripts.
Something like this http://www.pgadmin.org/pgadmin3/screenshots.php, but for the ZODB.
A relational database has only a few classes ("sequence", "table", "index", "constraint", ...) and all these classes are known to the framework. The ZODB in contrast has an unbounded number of classes unknown to the framework. It is far more flexible than the relational framework. Therefore, it is much more difficult to provide (useful) inspection facilities. You can get low level inspection the way I outlined above (extend "DocFinder" for instance attributes). I do not think it is useful (therefore, I have not implemented it for "DocFinder") but you may see it differently. -- Dieter