[ZODB-Dev] RFC: Blobs in S3
Jim Fulton
jim at zope.com
Thu Jul 7 10:06:19 EDT 2011
Gaaaa, I sent this before I was done. I had some sort of Gmail misfire
and thought the email was lost. :/
On Wed, Jul 6, 2011 at 2:44 PM, Jim Fulton <jim at zope.com> wrote:
> We're evaluating AWS for some of our applications and I'm thinking of adding
> some options to support using S3 to store Blobs:
>
> 1. Allow a storage in a ZEO storage server to store Blobs in S3.
> This would probably be through some sort of abstraction to make
> this not actually depend on S3. It would likely leverage the fact that
> a storage server's interaction with blobs is more limited than application
> code.
>
> 2. Extend blob objects to provide an optional URL to fetch data
> from. This would allow applications to provide S3 (or similar service)
> URLs for blobs, rather than serving blob data themselves.
>
>
> 2.1 If I did this I think I'd also add a blob size property, so you could
> get a blob's size without opening the blob file or downloading
> it from a database server.
>
> 3. Handle blob URLs at the application level.
>
> To make this work for the S3 case, I think we'd have to use a
> ZEO server connection to be called by application code. Something like:
>
> self.blob = ZODB.blob.Blob()
> f = self.blob.open('w')
> f.write(some_data)
>
>
> Option 1 is fairly straightforward and low risk.
>
> Option 2 is much trickier:
>
> - It's an API change
> - There are bits of implementation that depend on the
> current blob record format. I'm not sure if these
> bits extend beyond the ZODB code base.
> - The handling of blob object state would be a little
> delicate, since some of the state would be set on the storage
> server.
> - The win depends on being able to load a blob
> file independently of loading blob objects, although
> the ZEO blob cache implementation already depends
> on this.
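For option 1 in the quoted message, here is a minimal sketch of the kind of
abstraction that could keep a storage server from depending on S3 directly.
Everything in it (BlobStore, DirectoryBlobStore, and the store/load
signatures) is hypothetical and not part of ZODB or ZEO. It assumes the
server only ever needs to store a committed blob file and fetch it back:

```python
import os
import shutil


class BlobStore:
    """Hypothetical interface for where a storage server keeps blob files."""

    def store(self, oid, serial, path):
        """Save the blob file at `path` under the (oid, serial) key."""
        raise NotImplementedError

    def load(self, oid, serial, dest_path):
        """Copy the blob data for (oid, serial) to `dest_path`."""
        raise NotImplementedError


class DirectoryBlobStore(BlobStore):
    """Local-directory implementation; an S3-backed class would
    implement the same two methods against a bucket instead."""

    def __init__(self, root):
        self.root = root

    def _key(self, oid, serial):
        # oid and serial are assumed to be bytes, as they are in ZODB.
        return os.path.join(self.root, "%s.%s.blob" % (oid.hex(), serial.hex()))

    def store(self, oid, serial, path):
        shutil.copyfile(path, self._key(oid, serial))

    def load(self, oid, serial, dest_path):
        shutil.copyfile(self._key(oid, serial), dest_path)
```

The interface is this small only because a storage server's interaction
with blobs is limited; application code sees none of it.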
Before I accidentally sent this, I was going to mention a third option
involving ZEO extension methods. The idea is that you'd do
something like:
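import transaction
import ZODB.blob

self.blob = ZODB.blob.Blob()
f = self.blob.open('w')  # Blob.open() defaults to read mode
f.write(some_data)
f.close()
transaction.commit()  # commit so the blob reaches the storage server
# get_blob_url is the proposed ZEO extension method; it does not exist yet
self.url = self._p_jar.db().storage.get_blob_url(self.blob._p_oid)
transaction.commit()  # second commit to save the URL on the object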
This is less risky, from an API point of view, but is messy in a
number of ways.
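To make the shape of this third option concrete, here is a small stand-in
for the server side. It is only a sketch: SketchStorage, commit_blob, and
the example base URL are hypothetical, and get_blob_url is the proposed
extension method, not an existing ZEO API. It assumes a URL can only be
handed out for a blob the storage has already received, which is why the
application needs one commit before asking for the URL and another after
recording it:

```python
class SketchStorage:
    """Hypothetical stand-in for a storage exposing get_blob_url."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._committed = set()

    def commit_blob(self, oid):
        # Stands in for the blob arriving at the server on commit.
        self._committed.add(oid)

    def get_blob_url(self, oid):
        # Assumption: no URL exists until the blob has been committed.
        if oid not in self._committed:
            raise KeyError("blob %s has not been committed" % oid.hex())
        return "%s/%s" % (self.base_url, oid.hex())
```

Calling get_blob_url before commit_blob raises KeyError in this sketch,
which mirrors the ordering in the snippet above.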
Jim
--
Jim Fulton
http://www.linkedin.com/in/jimfulton