* Shane Hathaway <shane@zope.com> [2003-05-01 19:36]:
Roché Compaan wrote:
* Shane Hathaway <shane@zope.com> [2003-04-30 16:55]:
Do you agree with these requirements and minimal use cases? Is there anything I need to clarify?
This sounds good. Now what is the next step? I'd like to help build on the API or aspects of it like making it work in Zope 2 for instance. I like the idea of having RelationShips as a separate branch in the ZODB that you posted earlier.
Well, I think we should look at the way relationships get stored. Right now I can think of two possible models:
- Store and index relationship objects. This is perhaps the most obvious way to do it.
- Use relations (database tables) and infer relationships rather than store them. This could allow us to take advantage of relational theory.
The advantage of storing relationship objects directly is that it's obvious. When you want to relate object A to object B, you just create a Relationship object that links A and B and store it. You can store extra metadata by creating your own relationship classes. Relationship objects need only implement a minimal interface.
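As a rough sketch of the "store and index relationship objects" model described above: the names `Relationship`, `Employment` and `RelationshipStore` are made up for illustration and are not an existing Zope or mxmRelations API, and plain dicts stand in for whatever persistent index would really be used.

```python
class Relationship:
    """Minimal interface (assumed): a relationship knows its two ends."""

    def __init__(self, source, target):
        self.source = source
        self.target = target


class Employment(Relationship):
    """A custom relationship class that carries extra metadata."""

    def __init__(self, source, target, start_year):
        super().__init__(source, target)
        self.start_year = start_year


class RelationshipStore:
    """Stores relationship objects and indexes them by each end."""

    def __init__(self):
        self._by_source = {}
        self._by_target = {}

    def add(self, rel):
        # One stored object per relationship, indexed from both sides.
        self._by_source.setdefault(rel.source, []).append(rel)
        self._by_target.setdefault(rel.target, []).append(rel)

    def from_source(self, source):
        return list(self._by_source.get(source, []))

    def to_target(self, target):
        return list(self._by_target.get(target, []))


store = RelationshipStore()
store.add(Employment("pete", "SomeCompany", start_year=2001))
store.add(Relationship("pete", "mary"))
```

Relating A to B is a single `add()` call, and metadata such as `start_year` needs nothing more than a subclass. The cost is one stored object per relationship, which is the concern raised in the next paragraph.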
The disadvantage of storing relationship objects is that every relationship is stored as a separate object. But in a complex application, aren't there potentially far too many relationships to enumerate? How would we manage this? And would it be possible to infer relationships?
When you use database tables you have the same problem, but it's not a problem IMHO. It is common practice to have tables that only store relationships - lots of them. In my db design for student registrations I would have the following tables:

  Student           StudentCourses          Course
  --------------    --------------------    ---------------
  ID | Name         StudentID | CourseID    ID | Name

  Term              TermCourses
  --------------    --------------------
  ID | Name         TermID | CourseID

IMO this is the same as having 3 relation objects in which object relationships are stored. I don't think this is difficult to manage either - mxmRelations does this pretty well. For the above I will have 3 BTrees which should be able to store many objects and scale well. There is also no need to index them - you can do lookups directly on the relation.
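A minimal sketch of what "lookups directly on the relation" could look like. The `Relation` class and its method names are hypothetical (this is not the mxmRelations API), and plain dicts stand in for the BTrees mentioned above.

```python
class Relation:
    """A many-to-many relation between ids, queryable from either side."""

    def __init__(self):
        self._forward = {}   # left id -> set of right ids
        self._backward = {}  # right id -> set of left ids

    def relate(self, left, right):
        self._forward.setdefault(left, set()).add(right)
        self._backward.setdefault(right, set()).add(left)

    def unrelate(self, left, right):
        self._forward.get(left, set()).discard(right)
        self._backward.get(right, set()).discard(left)

    def rights_of(self, left):
        # Return a copy so callers cannot modify the stored sets.
        return set(self._forward.get(left, ()))

    def lefts_of(self, right):
        return set(self._backward.get(right, ()))


# Equivalent of the StudentCourses table: (StudentID, CourseID) pairs.
student_courses = Relation()
student_courses.relate("s1", "c1")
student_courses.relate("s1", "c2")
student_courses.relate("s2", "c1")
```

Because both directions are kept up to date on every `relate()`, no separate index is needed to answer "which courses does this student take?" or "which students take this course?".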
The advantage of using relations is that it gives us the benefits of relational theory. Relational theory provides a clear way to store and query relationships efficiently. It lets you infer relationships based on explicit relationships.
I *think* we are already applying some relational theory by:

- Creating a relation object and storing relationships in it (vs a normalised table representing the relation that stores relationships as records)
- We use unique ids/paths to relate objects and do not duplicate information about the objects in the relationship.
- We normalise classes just like we normalise tables.
- We can still infer relationships (just like sql views)

So we don't have to give up the benefits ;-)

To illustrate I'll compare a sql join on the db above with the python way to infer the relation of registrations for the current term.

SQL:

    select Student.Name, Course.Name
    from Student, Course, Term, StudentCourses, TermCourses
    where Student.ID = StudentCourses.StudentID
      and Course.ID = StudentCourses.CourseID
      and Course.ID = TermCourses.CourseID
      and Term.ID = TermCourses.TermID
      and Term.ID = CurrentTermID;

Python:

    courses = term_courses.get(current_term)
    students = student_courses.get(courses)
    # by now we already have the objects we want
    # but let's display the result too.
    for student in students:
        print student.Name
        for course in student_courses.get(student):
            print course.Name
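The comparison can be tried out with a small runnable sketch in modern Python 3, using the standard `sqlite3` module for the join and plain dicts for the lookup side. The sample rows and the dict names (`course_students` and so on) are invented for illustration. Note that the Python side here only lists courses that belong to the current term, so that its result matches the join exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Course (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Term (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE StudentCourses (StudentID INTEGER, CourseID INTEGER);
    CREATE TABLE TermCourses (TermID INTEGER, CourseID INTEGER);

    INSERT INTO Student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Course VALUES (10, 'Algebra'), (11, 'Biology'), (12, 'Chemistry');
    INSERT INTO Term VALUES (100, 'Spring'), (101, 'Autumn');
    INSERT INTO StudentCourses VALUES (1, 10), (1, 11), (2, 11), (2, 12);
    INSERT INTO TermCourses VALUES (100, 10), (100, 11), (101, 12);
""")

current_term_id = 100

# The join from the message above, with the current term as a parameter.
sql_rows = conn.execute(
    """
    SELECT Student.Name, Course.Name
    FROM Student, Course, Term, StudentCourses, TermCourses
    WHERE Student.ID = StudentCourses.StudentID
      AND Course.ID = StudentCourses.CourseID
      AND Course.ID = TermCourses.CourseID
      AND Term.ID = TermCourses.TermID
      AND Term.ID = ?
    ORDER BY Student.Name, Course.Name
    """,
    (current_term_id,),
).fetchall()

# The same data held as lookup structures keyed by id.
student_names = {1: "Ann", 2: "Bob"}
course_names = {10: "Algebra", 11: "Biology", 12: "Chemistry"}
term_courses = {100: {10, 11}, 101: {12}}
course_students = {10: {1}, 11: {1, 2}, 12: {2}}

# Infer the registrations by following the two relations in turn.
py_rows = sorted(
    (student_names[s], course_names[c])
    for c in term_courses[current_term_id]
    for s in course_students[c]
)

print(sql_rows == py_rows)
```

Both sides yield Ann/Algebra, Ann/Biology and Bob/Biology; Bob's Chemistry registration is left out because that course belongs to the other term.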
The disadvantage of using relations is that relationships have to be decomposed for storage. You can't just add an attribute to a Relationship object and expect it to be stored; you also have to arrange for the corresponding relation to store another column. (Although you might automate the process by telling a relation to keep its schema in sync with a particular class.)
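One way the "keep its schema in sync with a particular class" idea might be automated is sketched below. It assumes modern Python dataclasses, which did not exist when this was written; `Registration` and `Table` are hypothetical names, and a list of tuples stands in for a real relation.

```python
from dataclasses import dataclass, fields, astuple


@dataclass
class Registration:
    student_id: int
    course_id: int
    grade: str = ""  # extra metadata added to the relationship class


class Table:
    """A toy relation whose columns are derived from a relationship class."""

    def __init__(self, rel_class):
        self.rel_class = rel_class
        # The schema follows the class: one column per declared field.
        self.columns = [f.name for f in fields(rel_class)]
        self.rows = []

    def store(self, rel):
        # Decompose the object into one value per column.
        self.rows.append(astuple(rel))

    def load(self):
        # Recompose relationship objects from the stored rows.
        return [self.rel_class(*row) for row in self.rows]


table = Table(Registration)
table.store(Registration(1, 10, grade="A"))
```

Adding a field to `Registration` adds a column the next time a `Table` is built from it. Migrating rows that were already stored is a separate problem that this sketch ignores.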
Either solution provides a way to store and retrieve simple relationships. The difference is in the way they can expand. I like to imagine that the relational model was developed for the purpose of scaling the entity relationship model, but that's a wild guess.
I suppose mxmRelations stores relationship objects directly. A ZCatalog instance, on the other hand, is much like a relation, although it doesn't implement all the same operations and provides some extra operations.
To me a ZCatalog is much more like an index on a database table than a relation in that it does not know both ends of a relationship.

I am not too worried about how we will store relationships. I will go for the most obvious way: store relationship objects. Here the mxmRelations API is a good starting point and one can expand the API to store metadata as well. Am I correct in saying that Ape wouldn't need modification either since a relationship object will be handled just like any other object?

Ok, so far we can "list all grandmothers" ;-)

The difficult part in my mind is to give objects insight into their relationships. I think this issue is separate from *storing* relationships. The implementation might be tricky, but the requirements are simple:

- One should be able to ask an object what relationships it has. A Person object might answer: "With my employer, my wife and friends".

    class Person:
        Employer = VoodooRelationshipThingy()
        MyWife = VoodooRelationshipThingy()
        Friends = VoodooRelationshipThingy()

  or in the case where Person is an object in another Product that I don't want to mess with (hopefully subclassing is not the only way to do this):

    class MyPerson(Person):
        Employer = VoodooRelationshipThingy()
        MyWife = VoodooRelationshipThingy()
        Friends = VoodooRelationshipThingy()

- Relationships should be accessible as attributes. This makes it possible to ask the Person object what the name of its employer is, tell it that it has a new employer, and a new friend.

    pete = Person()
    assert pete.Employer.Name == "SomeCompany"
    # Pete has a new employer
    pete.Employer = TheOtherCompany
    assert pete.Employer.Name == "TheOtherCompany"
    assert pete.Friends.John.Surname == "Smith"
    # Pete has a new friend
    pete.Friends.add(mary)

Something like ComputedAttribute or descriptors should make it possible. Hmm, I might just have thought of a way to do this with ComputedAttribute which I'll try tomorrow. But ComputedAttribute is Zope2 specific isn't it? Darn ...
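For what it's worth, here is a minimal sketch of the descriptor route, written for modern Python 3 (`__set_name__` needs 3.6 or later). Everything in it is an assumption made for illustration: `ToOne`, `ToMany`, `RelatedSet`, `relationships_of`, the module-level `STORE` dict and the `uid` attribute are invented names, and a real version would keep the relationships in persistent storage rather than a dict.

```python
STORE = {}  # (owner uid, relationship name) -> list of related objects


class ToOne:
    """Descriptor exposing a single related object as an attribute."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        targets = STORE.get((obj.uid, self.name), [])
        return targets[0] if targets else None

    def __set__(self, obj, value):
        STORE[(obj.uid, self.name)] = [value]


class RelatedSet:
    """View of a to-many relationship with add() and lookup by Name."""

    def __init__(self, targets):
        self._targets = targets

    def add(self, other):
        if other not in self._targets:
            self._targets.append(other)

    def __len__(self):
        return len(self._targets)

    def __getattr__(self, name):
        # pete.Friends.John -> the related object whose Name is "John"
        if name.startswith("_"):
            raise AttributeError(name)
        for target in self._targets:
            if target.Name == name:
                return target
        raise AttributeError(name)


class ToMany:
    """Descriptor exposing a collection of related objects."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return RelatedSet(STORE.setdefault((obj.uid, self.name), []))


def relationships_of(cls):
    """Names of the relationships declared on a class or its bases."""
    return sorted({
        name
        for klass in cls.__mro__
        for name, value in vars(klass).items()
        if isinstance(value, (ToOne, ToMany))
    })


class Company:
    def __init__(self, uid, Name):
        self.uid = uid
        self.Name = Name


class Person:
    Employer = ToOne()
    Friends = ToMany()

    def __init__(self, uid, Name, Surname=""):
        self.uid = uid
        self.Name = Name
        self.Surname = Surname


pete = Person("pete", "Pete")
john = Person("john", "John", "Smith")
mary = Person("mary", "Mary", "Jones")

pete.Employer = Company("c1", "SomeCompany")
pete.Friends.add(john)
pete.Friends.add(mary)
```

With this in place `pete.Employer.Name` and `pete.Friends.John.Surname` read as in the requirements above, and `relationships_of(Person)` answers the "what relationships do you have?" question. Attaching relationships to a class from another Product without subclassing it is not covered by this sketch.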
It's quite late now so I hope I made sense.

--
Roché Compaan
Upfront Systems
http://www.upfrontsystems.co.za