[Grok-dev] Notes from working on export/reimport of data
Sebastian Ware
sebastian at urbantalk.se
Mon Aug 25 05:20:38 EDT 2008
I just want to write a short note about my experience working with
export/reimport of data in Grok/ZODB. As you might know, I am not a
stellar developer, but what I did might help someone, or maybe inspire
someone to come up with a better solution.
My problem was that I needed a solution that allowed me to move data
between two versions of my application: a production version and a
refactored (moved modules etc.) development version. In a relational
database, you can export a table to a tab-separated file and import it
into any other table. This is the flexibility I wanted to achieve.
For the simple case (export/reimport between compatible versions of an
application, without any data manipulation) I found that pickling
works if one overrides the __parent__ attribute of the root object in
the state returned by __getstate__(). This solution was also used by
Kevin Smith (I believe); however, it probably has drawbacks for normal
operations.
    def __getstate__(self):
        state = dict(self.__dict__)
        state['__parent__'] = None
        return state
When it comes to moving data between refactored versions of my
application, I need to recreate each object as an instance of a new
class. I have been given hints earlier about how to move modules by
means of module aliases, but I was stumbling a bit, so I chose an
approach that I found easier to get my head around.
Instead of storing a pickle of all the objects, I stored a pickle of a
list of dictionaries describing the data in the objects. Since the
attributes can contain objects, I replaced these references with a
tag <objectref id="###"> carrying a file-specific id, which allows me
to replace the string with the recreated object during import.
Something a bit like this:
    record = {}
    record['type'] = type(obj)
    record['id'] = counter  # file-specific object id
    # the attributes, but without the __parent__ attribute
    record['model'] = [(k, v) for k, v in obj.__dict__.items()
                       if k != '__parent__']
    record['container'] = list(obj.items())
    record['annotations'] = list(obj.__annotations__.items())
    objList.append(record)
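
To make the reference substitution concrete, here is a minimal sketch
in plain Python. It is not the code from my application: the names
export_records, Author and Article are made up for illustration, it
stores the class name as a string rather than the class itself, and
it leaves out containers and annotations.

```python
import pickle


def export_records(objects):
    """Describe each object as a plain dict; object references become tags."""
    # File-specific ids: simply the position of each object in the export list.
    ids = {id(obj): n for n, obj in enumerate(objects)}

    def encode(value):
        # Replace a reference to another exported object with a tag string.
        if id(value) in ids:
            return '<objectref id="%d">' % ids[id(value)]
        return value

    records = []
    for obj in objects:
        records.append({
            'type': type(obj).__name__,
            'id': ids[id(obj)],
            'model': [(k, encode(v)) for k, v in vars(obj).items()
                      if k != '__parent__'],
        })
    return records


class Author:  # hypothetical model class
    def __init__(self, name):
        self.name = name


class Article:  # hypothetical model class
    def __init__(self, title, author):
        self.title = title
        self.author = author


ann = Author('Ann')
post = Article('Hello', ann)
records = export_records([ann, post])
data = pickle.dumps(records)  # only strings, ints, lists and dicts inside
```

Because the records contain no instances of application classes, the
pickle can be loaded without the old modules being importable.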
During import, I create a new object using the type identifier and an
"if this then that" kind of selector... (I guess this could be done
using adapters, but I wanted a low-tech solution that just works first).
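
A minimal sketch of such an import, assuming records shaped like the
dictionaries described above with the type stored as a name string.
The selector is written here as a dict lookup instead of an if/elif
chain; FACTORIES, Person, BlogPost and import_records are hypothetical
names, not part of Grok.

```python
import re


class Person:  # hypothetical class in the refactored application
    pass


class BlogPost:  # hypothetical class in the refactored application
    pass


# Selector: type name found in the export file -> class to create on import.
FACTORIES = {'Author': Person, 'Article': BlogPost}

OBJECTREF = re.compile(r'<objectref id="(\d+)">')


def import_records(records):
    """Recreate objects in two passes so that references can be resolved."""
    created = {}
    # Pass 1: create an empty instance for every record (no __init__ call).
    for record in records:
        cls = FACTORIES[record['type']]
        created[record['id']] = cls.__new__(cls)
    # Pass 2: set the attributes, replacing tags with the recreated objects.
    for record in records:
        obj = created[record['id']]
        for name, value in record['model']:
            match = OBJECTREF.fullmatch(value) if isinstance(value, str) else None
            if match:
                value = created[int(match.group(1))]
            setattr(obj, name, value)
    return created


records = [
    {'type': 'Author', 'id': 0, 'model': [('name', 'Ann')]},
    {'type': 'Article', 'id': 1,
     'model': [('title', 'Hello'), ('author', '<objectref id="0">')]},
]
objects = import_records(records)
```

Creating all the objects before filling in any attributes means a tag
can point to an object that appears later in the file.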
One of the problems I ran into was how to handle containers. I can't
add anything to a container until the container object itself has
been added. So, I store the contents of the container in a temporary
attribute and fire a
    grok.notify(grok.ObjectAddedEvent(obj))
for each object. The temporary attribute is read by the code triggered
by the event and is used to fill the container.
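
The ordering can be sketched in plain Python as follows. This is only
a stand-in for the real event machinery: the notify function, the
subscribers list, the Folder class and the _pending_items attribute
are all made up here. In Grok the handler would be an event subscriber
and the containers would be real container classes.

```python
class Folder(dict):
    """Hypothetical container standing in for a Grok container."""


subscribers = []


def notify(obj):
    # Stand-in for firing an "object added" event for obj.
    for handler in subscribers:
        handler(obj)


def fill_container(obj):
    # Handler: move the stored contents into the container now that the
    # container itself has been added.
    pending = getattr(obj, '_pending_items', None)
    if pending is None:
        return
    for key, child in pending:
        obj[key] = child
        notify(child)  # the child has been added, so it can be filled too
    del obj._pending_items


subscribers.append(fill_container)

# The importer only attaches the contents as a temporary attribute...
root, inbox, note = Folder(), Folder(), Folder()
inbox._pending_items = [('note', note)]
root._pending_items = [('inbox', inbox)]

# ...and the containers are filled top-down once the root is "added".
notify(root)
```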
When it comes to annotations, I use a bit of custom code to recreate
them. I have used "hurry.workflow", so instead of just adding the
annotations I call the setState() of hurry.workflow with the state
value stored in the annotations dictionary.
All in all, I have found this to be a painful (but quite interesting)
experience and I hope the data export/import story is improved. I
think this is important because nobody should have to pay for a CMS or
web app where you can't easily export your data in case the database
gets corrupted and/or you need to upgrade the application.
Regards, Sebastian