[Zope-dev] RFC: RelationAware class for relations between objects
Roché Compaan
roche@upfrontsystems.co.za
Thu, 1 May 2003 23:30:04 +0200
* Shane Hathaway <shane@zope.com> [2003-05-01 19:36]:
> Roché Compaan wrote:
> >* Shane Hathaway <shane@zope.com> [2003-04-30 16:55]:
> >
> >>Do you agree with these requirements and minimal use cases? Is there
> >>anything I need to clarify?
> >
> >
> >This sounds good. Now what is the next step? I'd like to help build on
> >the API or aspects of it like making it work in Zope 2 for instance. I
> >like the idea of having RelationShips as a separate branch in the ZODB
> >that you posted earlier.
>
> Well, I think we should look at the way relationships get stored. Right
> now I can think of two possible models:
>
> - Store and index relationship objects. This is perhaps the most
> obvious way to do it.
>
> - Use relations (database tables) and infer relationships rather than
> store them. This could allow us to take advantage of relational theory.
>
> The advantage of storing relationship objects directly is that it's
> obvious. When you want to relate object A to object B, you just create
> a Relationship object that links A and B and store it. You can store
> extra metadata by creating your own relationship classes. Relationship
> objects need only implement a minimal interface.
>
> The disadvantage of storing relationship objects is that every
> relationship is stored as a separate object. But in a complex
> application, aren't there potentially far too many relationships to
> enumerate? How would we manage this? And would it be possible to infer
> relationships?
With database tables you face the same situation, but IMHO it's not a
problem. It is common practice to have tables that store nothing but
relationships - lots of them. In my db design for student registrations
I would have the following tables:

    Student              StudentCourses            Course
    --------------       --------------------      ---------------
    ID | Name            StudentID | CourseID      ID | Name

    Term                 TermCourses
    --------------       --------------------
    ID | Name            TermID | CourseID

IMO this is the same as having 3 relation objects in which object
relationships are stored. I don't think this is difficult to manage
either - mxmRelations does this pretty well. For the above I will have
3 BTrees which should be able to store many objects and scale well.
There is also no need to index them - you can do lookups directly on the
relation.
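
To make "lookups directly on the relation" concrete, here is a minimal
sketch. The Relation class and its method names are hypothetical (this is
not the mxmRelations API), and plain dicts stand in for the BTrees; in the
ZODB one would presumably swap in something like OOBTree, which offers a
similar mapping interface.

```python
class Relation:
    """Hypothetical many-to-many relation that knows both of its ends.

    Plain dicts stand in for BTrees; the names are illustrative only.
    """

    def __init__(self):
        self._forward = {}   # left id  -> set of right ids
        self._backward = {}  # right id -> set of left ids

    def relate(self, left, right):
        """Record that `left` is related to `right`."""
        self._forward.setdefault(left, set()).add(right)
        self._backward.setdefault(right, set()).add(left)

    def unrelate(self, left, right):
        """Remove the relationship if it exists; do nothing otherwise."""
        self._forward.get(left, set()).discard(right)
        self._backward.get(right, set()).discard(left)

    def get(self, left):
        """Return a copy of everything `left` is related to."""
        return set(self._forward.get(left, ()))

    def get_reverse(self, right):
        """Return a copy of everything related to `right`."""
        return set(self._backward.get(right, ()))


# Made-up ids, standing in for the StudentCourses table above.
student_courses = Relation()
student_courses.relate("student-1", "course-a")
student_courses.relate("student-2", "course-a")
student_courses.relate("student-1", "course-b")
```

Because both directions are kept as mappings, neither lookup needs a
separate index: the relation is its own index.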
> The advantage of using relations is that it gives us the benefits of
> relational theory. Relational theory provides a clear way to store and
> query relationships efficiently. It lets you infer relationships based
> on explicit relationships.
I *think* we are already applying some relational theory by:
  - Creating a relation object and storing relationships in it (vs a
    normalised table representing the relation that stores
    relationships as records).
  - Using unique ids/paths to relate objects, without duplicating
    information about the objects in the relationship.
  - Normalising classes just like we normalise tables.
  - Still being able to infer relationships (just like sql views).
So we don't have to give up the benefits ;-) To illustrate I'll compare
a sql join on the db above with the python way to infer the relation of
registrations for the current term.
SQL:

    select
        Student.Name, Course.Name
    from
        Student, Course, Term,
        StudentCourses, TermCourses
    where
        Student.ID = StudentCourses.StudentID and
        Course.ID = StudentCourses.CourseID and
        Course.ID = TermCourses.CourseID and
        Term.ID = TermCourses.TermID and
        Term.ID = CurrentTermID;

Python:

    # look up the courses offered in the current term
    courses = term_courses.get(current_term)
    # look up the students registered for those courses
    students = student_courses.get(courses)

    # by now we already have the objects we want
    # but let's display the result too.
    for student in students:
        print(student.Name)
        for course in student_courses.get(student):
            print(course.Name)

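
For anyone who wants to run the comparison, here is a self-contained
sketch of the same inference. The sample data and the registrations()
helper are made up for illustration, and plain dicts of sets stand in for
the relation objects. It returns the same (student, course) pairs the SQL
join would.

```python
# Hypothetical sample data; all ids are invented for this example.
term_courses = {
    "2003-1": {"maths", "physics"},
    "2003-2": {"history"},
}
student_courses = {
    "alice": {"maths", "history"},
    "bob": {"physics"},
    "carol": {"history"},
}


def registrations(term):
    """Return sorted (student, course) pairs for courses offered in `term`."""
    offered = term_courses.get(term, set())
    return sorted(
        (student, course)
        for student, courses in student_courses.items()
        for course in courses & offered  # set intersection does the "join"
    )


current_term = "2003-1"
pairs = registrations(current_term)
```

This version scans every student. A relation that also keeps the reverse
mapping (course -> students) would avoid the scan, at the cost of keeping
two mappings in sync.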
> The disadvantage of using relations is that relationships have to be
> decomposed for storage. You can't just add an attribute to a
> Relationship object and expect it to be stored; you also have to arrange
> for the corresponding relation to store another column. (Although you
> might automate the process by telling a relation to keep its schema in
> sync with a particular class.)
>
> Either solution provides a way to store and retrieve simple
> relationships. The difference is in the way they can expand. I like to
> imagine that the relational model was developed for the purpose of
> scaling the entity relationship model, but that's a wild guess.
>
> I suppose mxmRelations stores relationship objects directly. A ZCatalog
> instance, on the other hand, is much like a relation, although it
> doesn't implement all the same operations and provides some extra
> operations.
To me a ZCatalog is much more like an index on a database table than a
relation, in that it does not know both ends of a relationship.
I am not too worried about how we will store relationships. I will go
for the most obvious way: store relationship objects. Here the
mxmRelations API is a good starting point and one can expand the API to
store metadata as well. Am I correct in saying that Ape wouldn't need
modification either since a relationship object will be handled just
like any other object?
Ok, so far we can "list all grandmothers" ;-)
The difficult part in my mind is to give objects insight into their
relationships. I think this issue is separate from *storing*
relationships. The implementation might be tricky, but the requirements
are simple:
  - One should be able to ask an object what relationships it has. A
    Person object might answer: "With my employer, my wife and
    friends".

        class Person:
            Employer = VoodooRelationshipThingy()
            MyWife = VoodooRelationshipThingy()
            Friends = VoodooRelationshipThingy()

    or in the case where Person is an object in another Product that
    I don't want to mess with (hopefully subclassing is not the only
    way to do this):

        class MyPerson(Person):
            Employer = VoodooRelationshipThingy()
            MyWife = VoodooRelationshipThingy()
            Friends = VoodooRelationshipThingy()

  - Relationships should be accessible as attributes. This makes it
    possible to ask the Person object what the name of its employer
    is, tell it that it has a new employer, and a new friend.

        pete = Person()
        assert pete.Employer.Name == "SomeCompany"

        # Pete has a new employer
        pete.Employer = TheOtherCompany
        assert pete.Employer.Name == "TheOtherCompany"
        assert pete.Friends.John.Surname == "Smith"

        # Pete has a new friend
        pete.Friends.add(mary)

Something like ComputedAttribute or descriptors should make it possible.
Hmm, I might just have thought of a way to do this with
ComputedAttribute which I'll try tomorrow. But ComputedAttribute is
Zope2 specific isn't it? Darn ...
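
For reference, a rough sketch of the descriptor route, which does not
depend on Zope. Everything here is hypothetical: the Relationship class,
the module-level _registry dict (standing in for a real relation store)
and the uid attribute are all invented for illustration, and it only
covers to-one relationships like Employer. It is written for current
Python, since it relies on __set_name__.

```python
# Hypothetical store: (owner uid, relationship name) -> related object.
_registry = {}


class Relationship:
    """Descriptor that keeps a to-one relationship outside the instance."""

    def __set_name__(self, owner, name):
        # Remember the attribute name this descriptor was assigned to.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        return _registry.get((instance.uid, self.name))

    def __set__(self, instance, value):
        _registry[(instance.uid, self.name)] = value


class Company:
    def __init__(self, name):
        self.Name = name


class Person:
    Employer = Relationship()

    def __init__(self, uid):
        self.uid = uid  # assumed unique id used to key the store

    @classmethod
    def relationships(cls):
        """Names of all Relationship attributes, including inherited ones."""
        return sorted({
            name
            for klass in cls.__mro__
            for name, value in vars(klass).items()
            if isinstance(value, Relationship)
        })


pete = Person("pete")
pete.Employer = Company("SomeCompany")
first_employer = pete.Employer.Name

# Pete has a new employer
pete.Employer = Company("TheOtherCompany")
```

The object can now answer "what relationships do I have?" through
relationships(), and the related objects live in the store rather than
on the Person instance. A to-many attribute like Friends would need the
descriptor to return some collection wrapper with add(), which is left
out here.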
It's quite late now so I hope I made sense.
--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za