[ZODB-Dev] Spatial indices
Nitro
nitro at dr-code.org
Wed Jun 16 11:45:35 EDT 2010
Hello,
I tried to find a spatial index which integrates seamlessly into the ZODB.
Unfortunately I did not find a satisfying solution anywhere. So I came up
with three solutions how this could be implemented:
1) Write a native r-tree package, just like the current BTrees. Would
likely have to be written in C for performance.
2) Make use of the existing B+ Trees by using a space filling curve such
as the Z-curve or Hilbert curve to transform higher-dimensional data into
1D data which can then be stored in a BTree. Since B+ trees also provide
range querying capabilities this should give good query performance.
Unsure how much speed-up a C implementation of the insert/query functions
would give. More info:
http://www.scholarpedia.org/article/B-tree_and_UB-tree and
http://www.drdobbs.com/184410998 .
3) Use the already existing Rtree package from
http://pypi.python.org/pypi/Rtree . It's a thin wrapper of a C library, so
it should be very fast. I can see two methods to make this work:
3a)
- Create an rtree.RTree (stored in a separate file) and an OOTreeSet.
- inserting: insert item into BTree. Then insert item's oid into Rtree.
- querying: user supplies bounding box, rtree is queried, oids are
returned. look up objects by oid in BTree.
- zeo: does not work out-of-the-box with zeo since the Rtrees on different
machines are not synchronized.
3b)
- Create an rtree.RTree, a OOTreeSet and an IOTree. Difference to 3a):
Create RTree with a custom storage manager (example:
http://svn.gispython.org/svn/spatialindex/spatialindex/trunk/src/storagemanager/MemoryStorageManager.h
and
http://trac.gispython.org/spatialindex/browser/spatialindex/trunk/README#storage-manager
). This storage manager stores each page into the IOTree (key: pageId,
value: pageData).
- inserting: insert item into BTree. Then insert item's oid into Rtree.
Causes storage manager to write out changed rtree pages to IOTree.
- querying: user supplies bounding box, rtree is queried, pages for rtree
returned from IOTree, oids finally returned from query. look up objects by
oid in BTree.
- zeo: works out-of-the-box with zeo since the rtree pulls its data from a
btree (which is hooked up with zeo).
Conclusion:
1) Native r-tree package: It is a lot of work which has already been done
before. Bug-prone. Ruled out.
2) Spatial index on top of current BTrees: Looks interesting, could be
done in python. Disadvantages: unclear UB tree patent situation, unclear
how much work this really is.
3a) Does not work with zeo out-of-the-box. Ruled out.
3b) Requires writing a custom storage manager for the rtree package
(likely in C). Provides different trees. Basic technology (rtrees +
btrees) is tested.
Would it make sense to add a default spatial index to ZODB? Does anybody
of you have any experience with one of the mentioned solutions? Is anybody
else interested in having a zodb spatial index?
-Matthias
More information about the ZODB-Dev
mailing list