[Zope-dev] Re: RDF Musings and TinyTables
Paul Everitt
paul@eurozope.org
Thu, 20 Feb 2003 23:14:55 +0100
On Thursday, Feb 20, 2003, at 22:15 Europe/Paris, Shane Hathaway wrote:
[snip]
> With all this in mind, I just studied my Mozilla mimeTypes.rdf file
> again. At first, this file looks nasty. I've only defined handlers
> for two mime types, application/pdf and application/x-zope-edit, yet
> the string "application/pdf" shows up 8 times in the file! I only
> typed it once. ;-)
Good news, though, it's *really* compressible. :^) gzip can get 20:1.
> But if I think of RDF files as database export files (or maybe the
> results of a database query), it all makes sense.
I hate to say it, but I can actually read RDF now.
> - The order in which the RDF elements appear in the file doesn't
> matter, just like the physical order of inodes on a hard disk doesn't
> matter.
With Mozilla, it goes one step further. A graph can have multiple
datasources, which inject data into the graph. These datasources can
get resources from different servers, from different kinds of content
(IMAP, bookmarks, etc.)
Thus, not only does physical order not matter, but location doesn't
matter either.
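A minimal sketch of that idea, in plain JavaScript rather than Mozilla's RDF API: the graph is treated as a set of [subject, predicate, object] statements, and a hypothetical mergeSources helper pours statements from several sources into one set. The helper names, URNs, and sample data are all made up for illustration.

```javascript
// Sketch only: a graph modeled as a set of statements, not Mozilla's RDF API.
// Each statement is a [subject, predicate, object] triple of strings.

// Serialize a triple to a string key so that a Set can deduplicate statements.
function tripleKey(triple) {
  return JSON.stringify(triple);
}

// Hypothetical helper: merge statements from any number of sources.
function mergeSources(...sources) {
  const graph = new Set();
  for (const source of sources) {
    for (const triple of source) {
      graph.add(tripleKey(triple));
    }
  }
  return graph;
}

// Two graphs are equal when they hold the same statements, in any order.
function sameGraph(a, b) {
  if (a.size !== b.size) return false;
  for (const key of a) {
    if (!b.has(key)) return false;
  }
  return true;
}

// Made-up data standing in for two datasources (say, IMAP and bookmarks).
const mailSource = [
  ["urn:example:root", "urn:example:child", "urn:example:inbox"],
  ["urn:example:inbox", "urn:example:title", "Inbox"],
];
const bookmarkSource = [
  ["urn:example:root", "urn:example:child", "urn:example:bookmarks"],
  ["urn:example:bookmarks", "urn:example:title", "Bookmarks"],
];

// Merge in two different orders; the resulting graphs should be identical.
const oneWay = mergeSources(mailSource, bookmarkSource);
const otherWay = mergeSources(
  [...bookmarkSource].reverse(),
  [...mailSource].reverse()
);

console.log(oneWay.size);                 // number of distinct statements
console.log(sameGraph(oneWay, otherWay)); // order and origin do not change the graph
```

Because the graph is a set, neither the order of the statements nor the source they came from affects the result.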
> - The obvious way to read this file is to search for XML elements that
> look like '<RDF:Description about="urn:mimetype:*">'. But that's not
> the right way: that's like scanning filesystem inodes sequentially.
> Instead, there is a root URI, "urn:mimetypes", and the RDF elements
> make connections to other elements from there.
That's right. The URN structure was a trap I fell into until very
recently. I thought I would order my universe using a URN hierarchy.
But I realized that it had no meaning and no use. When I next
refactor, I'll move to a flat model of
"urn:x-moztop:realmid:resourceid", where the ids are immutable SHA
calculations.
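One possible shape for such ids, as a sketch under assumptions: it uses Node's crypto module and SHA-1, and what gets hashed (the seed strings below) is a guess, since only the URN layout is settled. The shaId and makeUrn helpers are hypothetical names.

```javascript
// Sketch only: one possible way to mint opaque ids. The hashing inputs are assumptions.
const nodeCrypto = require("crypto");

// Hash an arbitrary seed string into a hex SHA-1 digest (40 hex characters).
function shaId(seed) {
  return nodeCrypto.createHash("sha1").update(seed, "utf8").digest("hex");
}

// Hypothetical helper: build a flat "urn:x-moztop:realmid:resourceid" URN.
// The id parts carry no meaning; anything meaningful belongs in RDF properties.
function makeUrn(realmSeed, resourceSeed) {
  return `urn:x-moztop:${shaId(realmSeed)}:${shaId(resourceSeed)}`;
}

// Made-up seeds. In practice the id would be computed once and then stored,
// so that it stays the same even if the resource later changes.
const urn = makeUrn("example-realm", "example-resource created 2003-02-20");
console.log(urn);
```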
As an aside: Zope 3 should have an immutable, placeless object
identifier, but I lost that debate on #zope3-dev pretty badly. :^)
The ids should have no meaning. All the meaning should go in the RDF
properties, so you can do something with it.
This is another hidden meaning in RDF: properties are first-class
resources, in addition to property values.
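A small illustration, again in plain JavaScript with made-up URNs rather than any real RDF library: the property used in one statement shows up as the subject of another statement, so the graph can describe its own properties.

```javascript
// Sketch only: plain arrays of [subject, predicate, object]. All URNs are made up.
const statements = [
  // An ordinary statement about a document.
  ["urn:example:doc1", "urn:example:prop:author", "Paul"],
  // A statement about the property itself: it is a resource like any other.
  ["urn:example:prop:author", "urn:example:prop:label", "Author"],
];

// Return the first object stored for (subject, predicate), or undefined.
function valueOf(graph, subject, predicate) {
  const hit = graph.find(([s, p]) => s === subject && p === predicate);
  return hit ? hit[2] : undefined;
}

// Describe a resource, using the label attached to each property when one exists.
function describe(graph, subject) {
  return graph
    .filter(([s]) => s === subject)
    .map(([, p, o]) => {
      const label = valueOf(graph, p, "urn:example:prop:label") ?? p;
      return `${label}: ${o}`;
    });
}

console.log(describe(statements, "urn:example:doc1"));
```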
Lately I've been thinking more about distributed content management and
mobile content management, so these kinds of things are more important
for me. When you gather up a bunch of content from a bunch of
loosely-coupled places, how do you make sense of it? If you have a
document on your laptop and on your website, should they be considered
the same logical document?
> - RDF is hard to read, but legibility by humans isn't its primary
> focus. It's more concerned with providing a way to declare any
> relationship about anything.
Right. That's what the graph tool at the W3C online validator is for.
:^) Just throw it some RDF and let it draw a picture for you.
>> The ad-hoc part is, for me, the key. Relational theory provided the
>> theoretical foundation for modern online transaction processing. But
>> things like content management are a much different problem. (One
>> analyst states that unstructured content is 80% of the information in
>> a business.)
>> RDF, in my view, is the equivalent of a "set theory", a formal
>> foundation, for content management. Without it, everyone has to
>> build their own "framework" for stitching things together, for
>> connecting the dots.
>
> So RDF seems like a replacement for, or maybe enhancement of,
> relational theory. But I wonder how object-oriented databases fit in
> the mix?
Good point. IMO, classic OODBMSs want you to know more in advance than
RDF does. Also, the relationships are programmed, not assembled (perhaps
that isn't clearly stated).
>>> Serialization of RDF into XML and the relationship between RDF and
>>> the Semantic Web are distinct concepts from RDF theory.
>> That's right. I've always been surprised when I threw some RDF/XML
>> into Mozilla, then got a dump of the serialized results. What I put
>> in doesn't look like what I get out. That's because there is an
>> abstract model. The XML can look a couple of different ways, and you
>> still have the same abstract model.
>
> How do you (1) throw RDF into Mozilla and (2) get a dump of the
> results? Is there a utility for doing this?
Yes. There's the hard way and the easy way. For the hard way, you use
XPCOM to grab the datasource, get a component to serialize it, and run
some methods on it. (I say "hard way", really, it's probably 5 lines
of JS. Long lines.)
However, I'm using rdfds from XulPlanet (which is the best
documentation site for any project I've ever seen):
http://www.xulplanet.com/tutorials/xultu/rdfds/
For a very quick and useful RDF introduction, read chapter six of the
XUL tutorial:
http://www.xulplanet.com/tutorials/xultu/
With rdfds, getting a serialized version is simple:
var ds = new RDFDataSource("http://www.zope.org/some.rdf");
alert(ds.serializeToString());
> Are the results in RDF, and are they pretty much equivalent to "cat
> file1 file2"? :-)
The result of serializeToString is indeed RDF, and it is not even close
to cat file1 file2. :^) Logically it is exactly the same. The string
itself, though, will look quite different from the two input files.
The serializer can make up some URNs for anonymous resources, it can
give new namespace prefixes to namespaces you declare, etc.
More important, it likes to rearrange...
Wait, that's more detail than you probably want. :^)
>> It took me a while, but I learned how to take advantage of this.
>> With Moztop, I'm taking a pretty loose, distributed approach to
>> content management. I collect RDF from a bunch of different servers,
>> throw it all into one big graph, and use this to draw widgets on the
>> screen.
>> The ability to make an assertion into a completely different part of
>> the tree is something you can't do in XML.
>>> This ad-hoc data storage made me think of TinyTables. TinyTables is
>>> a good Zope product that fills the need for simple tables of data,
>>> but it needs attention. What if it got replaced by some Zope
>>> product called
>> I will do everything in the universe to help such a project. How is
>> that? :^)
>> I know the practical benefits that RDF can bring to content
>> management. And it isn't esoteric Semantic Gibberish. I'm unable,
>> though, to map it on the server side. However, I'm having luck on
>> the client side:
>> http://www.zope-europe.org/Members/paul/tmp/moztop-pinstripe.png
>
> I can see that the benefits on the client side would be enormous. For
> interfacing clients to Zope, we've always thought in two directions:
> either connect the client via ZEO, or have the client call remote
> procedures that return lists and strings. The ZEO client idea would
> be fast and easy, but the client would get unrestricted access to the
> whole database. The remote procedures would be secure but potentially
> slow, since the client usually needs more than one list or string.
Right.
> But if the client requests RDF using a remote procedure call, the
> server can send back everything at once that it considers relevant.
> Hmmm... but I bet there's more to it.
Yes, but it's a good "more".
Right now I have a composite datasource (one that gets its data from
multiple locations). It is getting fed by RDF from a CMF (where a page
template serializes the entire portal_catalog) and from a Zope 3.
Later I'll add a file system source. All of these generate resources
and relationships that get thrown into a bucket.
I use a couple of RDF properties to hold things together: <nc:subitems>
to indicate tree containment, <dc:title> and <site:resourcetype> (site
is my own namespace) to provide connections into labels for the UI, and
then CSS styling for icons and whatnot.
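As a rough sketch of how those three properties could drive a tree widget, here is a plain JavaScript walk over made-up statements. It is not the Moztop code or a XUL template: the prefixed names are used as plain strings, each nc:subitems statement is assumed to point directly at a child (a real RDF/XML file may route this through a container), and renderTree and objectsOf are hypothetical helpers.

```javascript
// Sketch only: prefixed names are used as plain strings and the data is made up.
const triples = [
  ["urn:example:site", "dc:title", "My Site"],
  ["urn:example:site", "site:resourcetype", "folder"],
  ["urn:example:site", "nc:subitems", "urn:example:news"],
  ["urn:example:site", "nc:subitems", "urn:example:about"],
  ["urn:example:news", "dc:title", "News"],
  ["urn:example:news", "site:resourcetype", "folder"],
  ["urn:example:news", "nc:subitems", "urn:example:item1"],
  ["urn:example:item1", "dc:title", "First item"],
  ["urn:example:item1", "site:resourcetype", "document"],
  ["urn:example:about", "dc:title", "About"],
  ["urn:example:about", "site:resourcetype", "document"],
];

// All objects stored for (subject, predicate), in statement order.
function objectsOf(graph, subject, predicate) {
  return graph
    .filter(([s, p]) => s === subject && p === predicate)
    .map(([, , o]) => o);
}

// Walk the containment property from a root and emit one indented line per resource.
function renderTree(graph, root, depth = 0, seen = new Set()) {
  if (seen.has(root)) return []; // guard against cycles in the graph
  seen.add(root);
  const title = objectsOf(graph, root, "dc:title")[0] ?? root;
  const type = objectsOf(graph, root, "site:resourcetype")[0] ?? "unknown";
  const lines = [`${"  ".repeat(depth)}${title} [${type}]`];
  for (const child of objectsOf(graph, root, "nc:subitems")) {
    lines.push(...renderTree(graph, child, depth + 1, seen));
  }
  return lines;
}

console.log(renderTree(triples, "urn:example:site").join("\n"));
```

The walk starts from a root resource and follows connections outward, so the order of the statements in the bucket does not matter; the bracketed resource type is the kind of hook that CSS could use to pick an icon.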
> Here's a fantasy... the ability to write a template that can be
> processed either by Zope or by the client. When the client is able to
> do the work, send the template and bunch of RDF. When the client
> can't do it, preprocess it. This is what XSLT always wanted to be
> able to do, but I couldn't see it getting there. Maybe RDF can make
> this a reality? :-)
If all you really want is to take data and draw stuff on the screen,
XSLT can do it. However, XSLT, and I'm convinced the XML underneath,
doesn't really help construct a complete interface. And certainly not
one that is based on a rich content model.
> This email is getting big, so I'll cut it off here for now. I'll
> study the XUL templates.
Cool. Drop by #moztop sometime and say hi.
--Paul