[Zope-dev] __record_schema__ of Brains (Was: Record.pyd)

Sat, 10 Aug 2002 17:25:57 +0200

At 08:59 2002-08-09 -0400, Casey Duncan said:
>__record_schema__ is simply a dictionary which maps field names to column
>positions (ints) so that the record knows the index of each field in the
>record tuples.
>
>See line 154 of Catalog.py to see how it is initialized to the Metadata=20
>schema
>plus a few extra columns for catalog rid and scores.

Hi Casey (and zope-dev),
Thanks!
After some experimenting I realized that :-)

One of the reasons I was because I am thinking about
how to implement a "SELECT col1 as 'name', ... "type
of feature for ZCatalogs.

I'm not entirely sure it's an good idea to start with,  but
I'm thinking in the line of large ZCatalogs (by large I mean
allot of columns in the self.data structure).
If all columns are copied the brains would grow larger as well
and by selecting explicitly which columns should be copied to
the brain they would be lighter.

Now that I understand how the data tuples are copied to the brain
I'm not at all sure adding a filter when copying the tuple will optimize
thing, because of the overhead in the filter process.

(The way that I "solved" the group/calc part of my "project", I don't think
it will lead to memory bloat. I'm going to implement a LacyGroupMap
which take an extra parameter (a list of IISet). Each brain created
in the LacyMap will have methods for calculations directly on the self.data
in the Catalog. The data it self will not be stored.
There will most probably be a pre calculate method that calculate all
variables that are applicable and caches the result.)

One way to reduce memory consumption in wide Catalogs would be
to have LacyBrains (vertical lacyness, there might be reasons
why that would be a bad idea, which I'm not aware of)

Another way would be to have multiple data attributes in the Catalog, like
tables, and to join the tuples from them with a "from table1, table2"=20
statement.
In this way it would be possible to control the width of the brains.
It would also be possible for the object indexing it self to tell the=
 catalog
in which "tables" it should store meta data.

There have been some proposals (ObjectHub et al) which I read some
time ago. I didn't feel then that we what I was looking for.
Please tell me if there's been any proposals or discussions regarding this.

Regards,
Johan Carlsson

--=20
Torped Strategi och Kommunikation AB
Johan Carlsson
johanc@torped.se

Mail:
Birkagatan 9
SE-113 36  Stockholm
Sweden

Visit:
V=E4stmannagatan 67, Stockholm, Sweden

Phone +46-(0)8-32 31 23
Fax +46-(0)8-32 31 83
Mobil +46-(0)70-558 25 24
http://www.torped.se
http://www.easypublisher.com