Re: [Zope-dev] RFC: RelationAware class for relations between objects

1 May 2003

      * Shane Hathaway <shane@zope.com> [2003-05-01 19:36]:
...
Roché Compaan wrote:
...
* Shane Hathaway <shane@zope.com> [2003-04-30 16:55]:
...
Do you agree with these requirements and minimal use cases?  Is there 
anything I need to clarify?
This sounds good. Now what is the next step? I'd like to help build on
the API or aspects of it like making it work in Zope 2 for instance. I
like the idea of having RelationShips as a separate branch in the ZODB
that you posted earlier.
Well, I think we should look at the way relationships get stored.  Right 
now I can think of two possible models:
- Store and index relationship objects.  This is perhaps the most 
obvious way to do it.
- Use relations (database tables) and infer relationships rather than 
store them.  This could allow us to take advantage of relational theory.
The advantage of storing relationship objects directly is that it's 
obvious.  When you want to relate object A to object B, you just create 
a Relationship object that links A and B and store it.  You can store 
extra metadata by creating your own relationship classes.  Relationship 
objects need only implement a minimal interface.
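A minimal sketch of this first model, assuming Python 3 dataclasses and hypothetical names (`Relationship`, `Employment`, `start_date`); none of them come from an existing API. The base class carries only the two ends of the link, and a subclass adds metadata:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Relationship:
    """Hypothetical minimal interface: just the two ends of the link."""
    source: str  # unique id or path of object A (assumed to be a string)
    target: str  # unique id or path of object B


@dataclass(frozen=True)
class Employment(Relationship):
    """A custom relationship class that stores extra metadata."""
    start_date: date


job = Employment("people/pete", "companies/some-company", date(2003, 5, 1))
```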
The disadvantage of storing relationship objects is that every 
relationship is stored as a separate object.  But in a complex 
application, aren't there potentially far too many relationships to 
enumerate?  How would we manage this?  And would it be possible to infer 
relationships?
When you use database tables you have the same problem, but it's not a
problem IMHO. It is common practice to have tables that only store
relationships - lots of them. In my db design for student registrations
I would have the following tables:

    Student             StudentCourses           Course
    ______________      --------------------     ---------------
    ID  | Name          StudentID | CourseID     ID  | Name

    Term                TermCourses
    --------------      --------------------
    ID  | Name          TermID    | CourseID

IMO this is the same as having 3 relation objects in which object
relationships are stored. I don't think this is difficult to manage
either - mxmRelations does this pretty well. For the above I will have 
3 BTrees which should be able to store many objects and scale well.
There is also no need to index them - you can do lookups directly on the
relation.
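
To make the point about direct lookups concrete, here is a rough sketch of one such relation object. It assumes plain dicts as a stand-in for BTrees, and the `Relation` class and its method names are hypothetical, not the mxmRelations API:

```python
from collections import defaultdict


class Relation:
    """Hypothetical many-to-many relation; dicts stand in for BTrees."""

    def __init__(self):
        self._forward = defaultdict(set)   # source key -> target keys
        self._backward = defaultdict(set)  # target key -> source keys

    def relate(self, source, target):
        # Record the pair in both directions so either end can be looked up.
        self._forward[source].add(target)
        self._backward[target].add(source)

    def unrelate(self, source, target):
        self._forward[source].discard(target)
        self._backward[target].discard(source)

    def targets(self, source):
        # .get() avoids creating empty entries for unknown keys.
        return set(self._forward.get(source, ()))

    def sources(self, target):
        return set(self._backward.get(target, ()))


student_courses = Relation()
student_courses.relate("student-1", "course-10")
student_courses.relate("student-2", "course-10")
student_courses.relate("student-1", "course-11")
```

Because both directions are kept as mappings, "which courses does this student take" and "which students take this course" are each a single key lookup, with no separate index.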
...
The advantage of using relations is that it gives us the benefits of 
relational theory.  Relational theory provides a clear way to store and 
query relationships efficiently.  It lets you infer relationships based 
on explicit relationships.
I *think* we are already applying some relational theory by:

    - Creating a relation object and storing relationships in it (vs a
      normalised table representing the relation that stores
      relationships as records)

    - We use unique ids/paths to relate objects and do not duplicate
      information about the objects in the relationship.

    - We normalise classes just like we normalise tables.

    - We can still infer relationships (just like sql views)

So we don't have to give up the benefits ;-) To illustrate I'll compare
a sql join on the db above with the python way to infer the relation of
registrations for the current term.

SQL:
    select
        Student.Name, Course.Name
    from
        Student, Course, Term, 
        StudentCourses, TermCourses
    where
        Student.ID = StudentCourses.StudentID and
        Course.ID  = StudentCourses.CourseID and
        Course.ID  = TermCourses.CourseID and
        Term.ID    = TermCourses.TermID and
        Term.ID    = CurrentTermID;

Python:
    courses  = term_courses.get(current_term)
    students = student_courses.get(courses)
    # by now we already have the objects we want
    # but let's display the result too.
    for student in students:
        print(student.Name)
        for course in student_courses.get(student):
            print(course.Name)
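
For anyone who wants to run the comparison, here is a self-contained sketch. The sample data and the `registrations` helper are made up for illustration, and sets of id pairs stand in for the relation objects. Unlike the loop above, it restricts each student's courses to the current term, which is what the SQL join returns:

```python
# Hypothetical sample data keyed by id; none of it comes from a real database.
students = {1: "Alice", 2: "Bob", 3: "Carol"}
courses = {10: "Algebra", 11: "Biology", 12: "Chemistry"}

student_courses = {(1, 10), (1, 12), (2, 11), (3, 12)}  # (StudentID, CourseID)
term_courses = {(100, 10), (100, 11), (200, 12)}        # (TermID, CourseID)


def registrations(term_id):
    """Return sorted (student name, course name) pairs for one term."""
    # Courses offered in the requested term.
    current = {c for (t, c) in term_courses if t == term_id}
    # Keep only the student/course pairs whose course is in that term.
    return sorted(
        (students[s], courses[c])
        for (s, c) in student_courses
        if c in current
    )


result = registrations(100)
for student_name, course_name in result:
    print(student_name, course_name)
```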
...
The disadvantage of using relations is that relationships have to be 
decomposed for storage.  You can't just add an attribute to a 
Relationship object and expect it to be stored; you also have to arrange 
for the corresponding relation to store another column.  (Although you 
might automate the process by telling a relation to keep its schema in 
sync with a particular class.)
Either solution provides a way to store and retrieve simple 
relationships.  The difference is in the way they can expand.  I like to 
imagine that the relational model was developed for the purpose of 
scaling the entity relationship model, but that's a wild guess.
I suppose mxmRelations stores relationship objects directly.  A ZCatalog 
instance, on the other hand, is much like a relation, although it 
doesn't implement all the same operations and provides some extra 
operations.
To me a ZCatalog is much more like an index on a database table than a
relation in that it does not know both ends of a relationship.

I am not too worried about how we will store relationships. I will go
for the most obvious way: store relationship objects. Here the
mxmRelations API is a good starting point and one can expand the API to
store metadata as well. Am I correct in saying that Ape wouldn't need
modification either since a relationship object will be handled just
like any other object?

Ok, so far we can "list all grandmothers" ;-)

The difficult part in my mind is to give objects insight into their
relationships. I think this issue is separate from *storing*
relationships. The implementation might be tricky, but the requirements
are simple: 

    - One should be able to ask an object what relationships it has. A
      Person object might answer: "With my employer, my wife and
      friends".

        class Person:
            Employer = VoodooRelationshipThingy()
            MyWife   = VoodooRelationshipThingy()
            Friends  = VoodooRelationshipThingy()

        or in the case where Person is an object in another Product that
        I don't want to mess with (hopefully subclassing is not the only
        way to do this):

        class MyPerson(Person):
            Employer = VoodooRelationshipThingy()
            MyWife   = VoodooRelationshipThingy()
            Friends  = VoodooRelationshipThingy()

    - Relationships should be accessible as attributes. This makes it
      possible to ask the Person object what the name of its employer
      is, tell it that it has a new employer, and a new friend.

        pete = Person()
        assert pete.Employer.Name == "SomeCompany"

        # Pete has a new employer
        pete.Employer = TheOtherCompany
        assert pete.Employer.Name == "TheOtherCompany"

        assert pete.Friends.John.Surname == "Smith"
        # Pete has a new friend
        pete.Friends.add(mary)

Something like ComputedAttribute or descriptors should make it possible.
Hmm, I might just have thought of a way to do this with
ComputedAttribute which I'll try tomorrow. But ComputedAttribute is
Zope2 specific isn't it? Darn ...
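
As a rough illustration of the descriptor route, here is a sketch that assumes Python 3.6+ (for `__set_name__`) and a hypothetical module-level `STORE` keyed on `id(obj)`, where a real implementation would use a persistent unique id or path. `Relationship`, `RelationStore` and `relationship_names` are invented names, and only the to-one case is covered:

```python
class RelationStore:
    """Hypothetical store mapping (object key, relation name) to a target."""

    def __init__(self):
        self._data = {}

    def get(self, obj, name):
        # id(obj) is an assumption for the sketch; it is not persistent.
        return self._data.get((id(obj), name))

    def set(self, obj, name, target):
        self._data[(id(obj), name)] = target


STORE = RelationStore()


class Relationship:
    """Descriptor that reads and writes a to-one relationship via STORE."""

    def __set_name__(self, owner, name):
        self.name = name  # attribute name under which it was declared

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class: return the descriptor
        return STORE.get(obj, self.name)

    def __set__(self, obj, target):
        STORE.set(obj, self.name, target)


class Company:
    def __init__(self, name):
        self.Name = name


class Person:
    Employer = Relationship()
    MyWife = Relationship()

    @classmethod
    def relationship_names(cls):
        # Only relationships declared directly on this class are listed.
        return sorted(
            n for n, v in vars(cls).items() if isinstance(v, Relationship)
        )


pete = Person()
pete.Employer = Company("SomeCompany")
```

The relationship itself lives in the store rather than on the Person instance, so storage stays separate from how the object exposes it.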

It's quite late now so I hope I made sense.

-- 
Roché Compaan
Upfront Systems                 http://www.upfrontsystems.co.za