oodb philosophics ;) was: Re: [Zope-dev] Experiments with ORMapping

Albert Langer Albert.Langer@Directory-Designs.org
Tue, 15 May 2001 08:26:29 +1000


[Karl Anderson]
Casey Duncan <cduncan@kaivo.com> writes:

> I am not arguing necessarily for SQL as a query language for the ZODB.
> Although it is an accepted standard, but not a perfect one by any means
> especially for OODBs. Its appeal lies mainly in the high level of
> community familiarity and the plethora of SQL software to borrow from.

Does anyone have an opinion on the possible usefulness of XPath,
XQuery, and other XML standards for this?  Someone suggested (on the
zope-xml wiki) that it would be nice to be able to drop in a cataloger
that supported a presumably standard and presumably well-known XML
query syntax, and which would work throughout the database because
Zope objects would support DOM.

This is all speculation, and I personally don't know much right now
about XML database interfaces and how finished or well-regarded they
are.

[Albert]
An excellent introduction to this topic is:

"Putting XML in context with hierarchical, relational, and
object-oriented models" by David Mertz.
ftp://www6.software.ibm.com/software/developer/library/x-matters8.pdf

Author is a python developer with lots of interesting XML stuff.
See also his xml_matters 1 and 2 for xml_object and xml_pickle with
much nicer "pythonic" syntax instead of using DOM directly.

Article is also *essential* background for the distinction between
"Object Mapping" and "Object Relational Mapping" which needs to be
understood by anyone participating in this discussion.

An example of a python ODBMS with some partial support for OQL is 4ODS
from 4 Suite, which uses a very natural "pythonic" syntax for objects
stored in and queried from PostgreSQL:

Following is from 4Suite-docs-0.11/4Suite-0.11/html/4ODS-userguide.html
available via:

http://4suite.org/download.html#4Suite_Documentation

vvvvvvvv
How to use the system (a very basic walk through)

First create an ODL file, test.odl, that represents what you want to store

module simple {
  class person {
    attribute string name;
    attribute double weight;
    relationship Person spouse inverse Person ::spouse_of;
    relationship Person spouse_of inverse Person ::spouse;
    relationship list<Person> children  inverse Person ::child_of;
    relationship Person child_of inverse Person ::children;
  };
  class employee (extends person) {
    attribute string id;
  };
};

Now create a new database and initialize

 #OdlParse -ifp test test.odl

Now write some python code to do stuff with these people

#!/usr/bin/python

#Everything that is persistent must be done inside a transaction and an
#open database
from Ft.Ods import Database
db = Database.Database()
db.open('test')

tx = db.new()
tx.begin()

#Create a new instance of some objects
import person
import employee
dad = employee.new()
mom = person.new()
son1 = person.new()
son2 = person.new()
daughter = person.new()

#Set some attributes
dad.name = "Pops"
mom.name = "Ma"
son1.name = "Joey"
son2.name = "Bobby"
daughter.name = "Betty"
dad.weight = 240.50

#We can set attributes not defined in the ODL but they will not persist
mom.address = "1234 Error Way"


#Set some relationships

#First set a one to one relationship
dad.spouse = mom

#Or we could have done it via the ODMG spec
#dad.form_spouse(mom)

#Add some children to the dad. (Our data model does not let mom have
#children; we'd need a family struct, which is left up to the reader.)

dad.add_children(son1)
#We can create relationships both ways
son2.form_child_of(dad)

#Shortcut for adding
dad.children = daughter

#Now root the family to some top level object.
db.bind(dad,"The Fam")

#Make it so
tx.commit()

#Outside of a transaction we can still access the objects.
#However, any changes we make will not persist.
#NOTE: because 4ODS caches relationships, any relationships that were not
#traversed during the transaction cannot be traversed now, because an
#object cannot be loaded from the db outside of a transaction.
print dad.name

#Start a new tx to fetch

tx = db.new()
tx.begin()

newDad = db.lookup("The Fam")

print newDad.name
print newDad.children[0].name
print newDad.spouse

#Discard this transaction
tx.abort()

Ft/Ods/test_suite and Ft/Ods/demo are good places to look for more examples
^^^^^^

See also:
http://www.xml.com/pub/a/2000/10/11/rdf/

Some other relevant references are:

Extraction of DBMS catalogs to XML using python.
http://hyperschema.sourceforge.net./

PostgreSQL as XML repository
http://zvon.org/index.php?nav_id=61
http://hopla.sourceforge.net/doc/README

Note that none of this has much to do with the original topic of
Object-*Relational* Mapping.

*Essential* background for understanding what an object-relational
persistence layer looks like is:

http://www.ambysoft.com/persistenceLayer.html

It isn't very long and there *absolutely* isn't any point discussing
how to design such an OR persistence layer without first reading
and fully understanding it. (I say that after having carefully
studied all the messages in this discussion - though I also said
so before ;-)

The rest of that web site has lots of useful references.

Two examples of java open source code implementing it are:

http://osage.sourceforge.net/userguide.html

http://jakarta.apache.org/turbine/

A third is a set of open source java bindings for XML, LDAP and
RDBMS, with OQL and possible use of a UML repository, which is
also likely to be helpful for the above:
http://castor.exolab.org/

What Zopistas don't know about RDBMS would, and does, fill many
libraries. Here's one online book, just to emphasize that the
design of an RDBMS is a highly specialized technical area
(not recommended for introductory reading):

http://research.microsoft.com/pubs/ccontrol/

I strongly recommend careful study of *why* people (sometimes)
need to use a complex RDBMS *despite* the fact that ZODB is
so much more transparent and easier to use, before embarking
on attempts to implement RDBMS features within ZODB (with or
without the use of an actual RDBMS).

BTW, as I predicted, both the "great" article on "Why aren't you
using an ODBMS" referenced in Zope Weekly News and its counterpart
on slashdot have now been deluged with replies explaining why not.

http://slashdot.org/features/01/05/03/1434242.shtml
http://www.kuro5hin.org/?op=displaystory;sid=2001/5/3/32853/11281
http://www.zope.org/Documentation/ZWN/ZWN-2001-05-04

The key issue is the difference between an embedded persistence
mechanism driven by application programs and an enforced schema
independent of the applications (from any platform) that may
be using it.

Maintaining data integrity with concurrent changes by many
users from many and rapidly evolving applications is a *hard*
problem, which doesn't get easier by ignoring it.
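
To make the concurrency problem concrete, here is a minimal sketch of a
"lost update". It assumes nothing about ZODB or any particular RDBMS: a
plain dict stands in for the shared store, and the names store, read and
write are made up purely for illustration.

```python
# Hypothetical illustration: a dict plays the role of a shared data store.
store = {"balance": 100}

def read(key):
    return store[key]

def write(key, value):
    store[key] = value

# Two "transactions", A and B, each want to add to the same balance.
# Interleaved schedule: both read before either writes.
seen_by_a = read("balance")        # A sees 100
seen_by_b = read("balance")        # B sees 100
write("balance", seen_by_a + 10)   # A writes 110
write("balance", seen_by_b + 20)   # B writes 120, silently discarding A's +10
lost_update_result = store["balance"]

# Serial schedule: the outcome that isolation is meant to guarantee.
store["balance"] = 100
write("balance", read("balance") + 10)
write("balance", read("balance") + 20)
serial_result = store["balance"]
```

The interleaved schedule leaves a balance of 120 and the serial one 130.
Neither transaction did anything wrong on its own; the damage comes only
from the interleaving.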

ZODB provides Atomic and Durable transactions.

An RDBMS provides ACID transactions - Atomic, Consistent,
Isolated and Durable.

When you need either Consistent enforcement of a schema or
Isolation (so that changes made by one transaction are not
based on data that is concurrently being modified by another
transaction), there is simply *no way* you can get that
from ZODB, regardless of "storage".
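
As a rough sketch of what "Consistent enforcement of a schema" means in
practice, the following uses SQLite through the sqlite3 module in the
python standard library. The table, its constraints and the helper
try_insert are assumptions chosen for illustration, not taken from any of
the systems mentioned here. The point is only that the database itself
refuses bad rows, whichever application sends them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema lives in the database, not in any one application.
conn.execute(
    "CREATE TABLE person ("
    " name TEXT NOT NULL,"
    " weight REAL CHECK (weight > 0)"
    ")"
)

def try_insert(name, weight):
    """Return True if the database accepted the row, False if it refused."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person (name, weight) VALUES (?, ?)",
                (name, weight),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Pops", 240.5)      # satisfies both constraints
rejected_weight = try_insert("Ma", -1.0)  # violates the CHECK constraint
rejected_name = try_insert(None, 100.0)   # violates NOT NULL
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Only the first row is stored. A buggy or hostile client gets an error
rather than a chance to corrupt data that other applications rely on.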

The easiest way to do an OR mapping for python is to start
with one that has already been half built through stored
procedures in the database, since Zopistas already know
how to do the other half far better but know very little
about the DBMS side. Since suitable free (as in speech) SQL
is available from ACS 4, use that:

http://developer.arsdigita.com/doc/kernel-doc.html

It is currently being ported by OpenACS to support both
PostgreSQL and Oracle (probably SAP DB, now also free, next):

http://www.openacs.org

This has a "query extractor" (in python) that collects all
the SQL used by the entire comprehensive ACS 4 Tcl web
application framework into a python and XML accessible form
where it can conveniently be used for porting to other
DBMS and also for porting to other languages such as python.

This makes sense for the same reasons that building ZODB
with full support for only half of ACID made sense to meet
immediate requirements and lay a foundation for others
that are less urgent.

Once a *specific* OR mapping has been implemented, *then*
go on to do a more generic one.
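
For a sense of scale, a *specific* OR mapping can be as small as the
sketch below. It is not the ACS 4 data model or the ambysoft design, just
a hand-written mapper for one class and one table, using SQLite via the
sqlite3 module in the standard library. The names Person, PersonMapper
and person_id are assumptions invented for this example.

```python
import sqlite3

class Person(object):
    """Plain application object; it knows nothing about SQL."""
    def __init__(self, name, weight=None, person_id=None):
        self.person_id = person_id
        self.name = name
        self.weight = weight

class PersonMapper(object):
    """Hand-written mapping between Person objects and one table."""
    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS person ("
            " person_id INTEGER PRIMARY KEY,"
            " name TEXT NOT NULL,"
            " weight REAL)"
        )

    def save(self, person):
        """Insert a new row, or update the existing one."""
        if person.person_id is None:
            cur = self.conn.execute(
                "INSERT INTO person (name, weight) VALUES (?, ?)",
                (person.name, person.weight),
            )
            person.person_id = cur.lastrowid
        else:
            self.conn.execute(
                "UPDATE person SET name = ?, weight = ? WHERE person_id = ?",
                (person.name, person.weight, person.person_id),
            )
        self.conn.commit()

    def load(self, person_id):
        """Return a fresh Person built from the row, or None if absent."""
        row = self.conn.execute(
            "SELECT person_id, name, weight FROM person WHERE person_id = ?",
            (person_id,),
        ).fetchone()
        if row is None:
            return None
        return Person(name=row[1], weight=row[2], person_id=row[0])

# Usage: save, modify, save again, then load an independent copy.
conn = sqlite3.connect(":memory:")
mapper = PersonMapper(conn)
dad = Person("Pops", 240.5)
mapper.save(dad)
dad.weight = 235.0
mapper.save(dad)
copy = mapper.load(dad.person_id)
```

Everything a generic layer would have to derive from metadata (table
name, column list, key handling) is simply written out by hand here,
which is why starting with a specific mapping is the easier first step.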