ZODB performance: reads to writes


Jimmie Houchin

24 Jun 2000

3:11 p.m.

I know this has been brought up before, but I don't remember if this question has been asked.

I've been reading everything I could find on the ZODB. Currently I am reading 'Introduction to the Zope Object Database' by Jim Fulton at http://www.zope.org/Members/jim/Info/IPC8/ZODB3/index.html .

Towards the end of the paper under Status, he starts referring to future features. Under 'Application-level conflict resolution protocols' he states this, 'Applications with much higher write to read ratios are likely to encounter frequent conflict errors which can seriously affect performance.'

Is this pretty much the primary reason that it is generally said that the ZODB isn't as well suited to high write situations? Or is there more to it than that?

The reason I ask is this.

My app by its nature as a community/portal will have plenty of writes in certain areas. However, due to the structure of the website and the app most all writes are to unique objects, are appends to an object or the person editing the object is the owner and has no one to conflict with. In this case there should be little if any ConflictErrors due to different users trying to commit changes to the same object.

In the case of the appends, it would be like a bboard and there would be no conflict. Appends will be done in the order received.

Will an app as described above still suffer from problems with high writes?

Thanks for any help.

Jimmie Houchin


Evan Simpson

24 Jun

4:43 p.m.

New subject: [Zope] ZODB performance: reads to writes

----- Original Message ----- From: Jimmie Houchin <jhouchin@texoma.net>

...

Will an app as described above still suffer from problems with high writes?

Possibly, but only if there are hidden hotspots. For example, in your message-appending scenario, are these messages being added to the same Folder? If so, the Folder is getting written with each object added to it, and will be a source of conflict. If the objects that your users are editing are cataloged, the Catalog is a hotspot.

There are two independent attacks on this problem underway:

1. Make Folders and Catalogs store meta-data about their contents in a data structure consisting of small persistent objects, like B-Tree nodes. This reduces the scope of potential conflict (and the size of the update required by a write) to the size of one of these nodes.

2. Implement the application-level conflict handling you read about, so that Folders and Catalogs can decide that two writes don't conflict after all, and merge them into a single update.

Cheers,

Evan @ digicool & 4-am
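The hotspot effect described in this message can be illustrated with a toy model of optimistic concurrency. This is only a sketch and not the ZODB API: `ToyStore`, its `load`/`commit` methods, and the local `ConflictError` class are hypothetical names invented for the illustration. Each stored object carries a serial number, and a commit is refused if the serial changed since the writer read the object.

```python
class ConflictError(Exception):
    """Raised when a transaction's view of an object is stale at commit time."""


class ToyStore:
    """Toy object store (hypothetical): each object id maps to (serial, state)."""

    def __init__(self):
        self._data = {}

    def load(self, oid):
        # Return the serial the reader saw along with a copy of the state.
        serial, state = self._data.get(oid, (0, []))
        return serial, list(state)

    def commit(self, oid, seen_serial, new_state):
        # Optimistic check: refuse the write if someone committed since we read.
        current_serial, _ = self._data.get(oid, (0, []))
        if current_serial != seen_serial:
            raise ConflictError(oid)
        self._data[oid] = (current_serial + 1, list(new_state))


store = ToyStore()

# Two users post to the same board: both read the folder, then both write it.
serial_a, folder_a = store.load("folder")
serial_b, folder_b = store.load("folder")
store.commit("folder", serial_a, folder_a + ["msg-from-a"])
try:
    store.commit("folder", serial_b, folder_b + ["msg-from-b"])
    hotspot_conflict = False
except ConflictError:
    hotspot_conflict = True

# Two owners edit their own, distinct objects: no shared record, no conflict.
serial_c, page_c = store.load("page-c")
serial_d, page_d = store.load("page-d")
store.commit("page-c", serial_c, ["edit by c"])
store.commit("page-d", serial_d, ["edit by d"])

print(hotspot_conflict)  # True: the shared folder record is the hotspot
```

In this model the two owner edits never collide, while the two appends collide only because they both rewrite the one shared container record.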

Jimmie Houchin

6:21 p.m.


Thanks for the reply.

This is what I understand based on your reply and from the paper by Jim.

1. That there are solutions currently being worked on by DC (implied). Yes, worked on does not mean 'Coming Soon to a Zope near You!' :)

2. That if an app, either by its nature or through its developers' design, eliminates or handles conflicts before commits are made to the ZODB, that high write situations are not a problem.

Slap me upside the head if I misunderstand. :)

If this is the case then I think this is great news. Okay, it may not be news. And I may be the only clueless soul in ZopeLand. :)

Question: Is number 1. below something that would take place with BTree Folders? Or is this apples and oranges?

Once again, thanks.

Jimmie Houchin

Evan Simpson wrote:

...

----- Original Message ----- From: Jimmie Houchin <jhouchin@texoma.net>

...
Will an app as described above still suffer from problems with high writes?

Possibly, but only if there are hidden hotspots. For example, in your message-appending scenario, are these messages being added to the same Folder? If so, the Folder is getting written with each object added to it, and will be a source of conflict. If the objects that your users are editing are cataloged, the Catalog is a hotspot.

There are two independent attacks on this problem underway:

1. Make Folders and Catalogs store meta-data about their contents in a data structure consisting of small persistent objects, like B-Tree nodes. This reduces the scope of potential conflict (and the size of the update required by a write) to the size of one of these nodes.

2. Implement the application-level conflict handling you read about, so that Folders and Catalogs can decide that two writes don't conflict after all, and merge them into a single update.

Cheers,

Evan @ digicool & 4-am


Evan Simpson

9:14 p.m.


----- Original Message ----- From: Jimmie Houchin <jhouchin@texoma.net>

...

This is what I understand based on your reply and from the paper by Jim.

1. That there are solutions currently being worked on by DC (implied). Yes, worked on does not mean 'Coming Soon to a Zope near You!' :)

2. That if an app, either by its nature or through its developers' design, eliminates or handles conflicts before commits are made to the ZODB, that high write situations are not a problem.

AFAIK, these are both correct.

...

Is number 1. below something that would take place with BTree Folders?

Yes.

Cheers,

Evan @ digicool & 4-am

Toby Dickenson

26 Jun

9:27 a.m.


On Sat, 24 Jun 2000 12:43:05 -0400, "Evan Simpson" <evan@digicool.com> wrote:

...

...
Will an app as described above still suffer from problems with high writes?

...

There are two independent attacks on this problem underway:

...

2. Implement the application-level conflict handling you read about, so that Folders and Catalogs can decide that two writes don't conflict after all, and merge them into a single update.

Yes, that will help

...

1. Make Folders and Catalogs store meta-data about their contents in a data structure consisting of small persistent objects, like B-Tree nodes. This reduces the scope of potential conflict (and the size of the update required by a write) to the size of one of these nodes.

As I understand it, a BTreeFolder alone (i.e. without application-level conflict handling) will not help here. Folders have to ensure that all their contained elements have a different id. The hot-spot is the only way a Folder can achieve this.

Toby Dickenson tdickenson@geminidataloggers.com

Jim Fulton

28 Jun

2:41 p.m.


Toby Dickenson wrote:

...

On Sat, 24 Jun 2000 12:43:05 -0400, "Evan Simpson" <evan@digicool.com> wrote:

...
...
Will an app as described above still suffer from problems with high writes?

...
There are two independent attacks on this problem underway:

...
2. Implement the application-level conflict handling you read about, so that Folders and Catalogs can decide that two writes don't conflict after all, and merge them into a single update.

Yes, that will help

...
1. Make Folders and Catalogs store meta-data about their contents in a data structure consisting of small persistent objects, like B-Tree nodes. This reduces the scope of potential conflict (and the size of the update required by a write) to the size of one of these nodes.

As I understand it, a BTreeFolder alone (i.e. without application-level conflict handling) will not help here.

Sure they will, because a BTree is the moral equivalent of multiple subfolders. (This assumes that a problem in the current BTree design is fixed, which it will be. ;)

...

Folders have to ensure that all their contained elements have a different id. The hot-spot is the only way a Folder can achieve this.

But there is only a conflict if two transactions want to pick the same id. Going to (fixed) BTrees doesn't prevent *all* conflicts, but it does prevent most conflicts, which will be good enough for many applications.

Jim

--
Jim Fulton, Technical Director, Digital Creations
mailto:jim@digicool.com (888) 344-4332
http://www.digicool.com http://www.zope.org http://www.python.org
Python Powered!

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
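The "moral equivalent of multiple subfolders" argument can be sketched with a toy folder split into key-range buckets, each with its own serial. This is an assumption-laden illustration, not the BTree implementation discussed in the thread: `BucketedFolder`, `begin_add`, `commit_add`, and the fixed split keys are all hypothetical, and the model ignores bucket splits and ids that happen to fall in the same bucket.

```python
from bisect import bisect_right


class ConflictError(Exception):
    """Raised when the bucket changed between a transaction's read and commit."""


class BucketedFolder:
    """Toy folder (hypothetical): ids are spread over key-range buckets,
    and each bucket has its own serial number."""

    def __init__(self, boundaries):
        self.boundaries = list(boundaries)  # sorted split keys, e.g. ["h", "p"]
        self.buckets = [dict() for _ in range(len(self.boundaries) + 1)]
        self.serials = [0] * (len(self.boundaries) + 1)

    def bucket_for(self, key):
        # Pick the bucket whose key range contains this id.
        return bisect_right(self.boundaries, key)

    def begin_add(self, key):
        # Remember which bucket the transaction read and at which serial.
        index = self.bucket_for(key)
        return index, self.serials[index]

    def commit_add(self, key, value, ticket):
        index, seen_serial = ticket
        if self.serials[index] != seen_serial:
            raise ConflictError(key)
        self.buckets[index][key] = value
        self.serials[index] += 1


def concurrent_adds_conflict(folder, key_a, key_b):
    """Run two overlapping add transactions; report whether the second conflicts."""
    ticket_a = folder.begin_add(key_a)
    ticket_b = folder.begin_add(key_b)
    folder.commit_add(key_a, "a", ticket_a)
    try:
        folder.commit_add(key_b, "b", ticket_b)
    except ConflictError:
        return True
    return False


single = BucketedFolder([])          # one bucket: behaves like a plain Folder
split = BucketedFolder(["h", "p"])   # three buckets split at "h" and "p"

print(concurrent_adds_conflict(single, "apple", "zebra"))  # True
print(concurrent_adds_conflict(split, "apple", "zebra"))   # False
print(concurrent_adds_conflict(split, "mango", "mango"))   # True
```

With one bucket every pair of concurrent adds collides; with several, only adds that land in the same bucket (including two transactions picking the same id) still collide.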

tsarna＠endicor.com

27 Jun

10:37 p.m.


In article <000d01bfddfb$4546f070$3e48a4d8@digicool.com>, Evan Simpson <evan@digicool.com> wrote:

...

----- Original Message ----- From: Jimmie Houchin <jhouchin@texoma.net>

...
Will an app as described above still suffer from problems with high writes?

Possibly, but only if there are hidden hotspots. For example, in your [...]

2. Implement the application-level conflict handling you read about, so that Folders and Catalogs can decide that two writes don't conflict after all, and merge them into a single update.

Unfortunately, this doesn't deal with cases where the conflicting state is contained in many objects (see note by PJE in the ZODB Wiki).

Also, there is a whole other area of difficulty for high-write-volume ZODBs, which is the amount of IO that needs to be done. First, by nature ZODB can't rewrite a single attribute of an object, it has to rewrite the entire thing.

Indexing is also a bear from an IO perspective. First, BTrees currently keep a count at each level, so every change to a btree changes a node at each level of the BTree. For a ZCatalog, there are a lot of btrees (something like 2n+4 for n indexes, I think -- don't quote me on that, it's been a while), and each one changes (last I looked, every index was updated even if the value indexed in a particular one hadn't changed. This may have been improved since). Not only is this bad from a hotspot point of view (always a conflict on the root node of the tree), but you end up doing a *lot* of IO. During my experiments that led to BerkeleyStorage, I was watching the Data.fs grow by 47K per transaction for adding indexed objects of ~1K in size. Watching this with tranalyzer, this turns out to be 1K of object, and 46K of updated btree pages :).

Note that BerkeleyStorage only prevents the file from growing that much -- it still has to do all that IO (in fact, it has to do ~2-3 times that much IO, due to the nature of BerkeleyDB). A relational storage would have similar issues. For amount of IO done, FileStorage is about as efficient as you can possibly be -- it's just that it trades that off against space reclamation.

Also, with any kind of Berkeley or Relational storage, there is a second hidden IO and storage penalty: you're storing a btree inside a btree. In other words, the lower-level DB uses btrees to store your objects, including interior nodes of the higher-level ZODB btree. Every interior node of the ZODB Btree needs a leaf node (and supporting interior nodes) in the DB's btrees, so you get taxed twice, on both I/O and storage space used.

Not to discourage anyone from using ZODB, necessarily. There are a lot of things it's fantastic for, and without a doubt ZODB is getting better at handling higher write ratios. Over time there will be more and more applications that previously would have required an external SQL or other kind of database that can be done in ZODB instead. However, there will also IMHO always be applications that ZODB just isn't as suitable for. You have to think long and hard before committing to one or the other. And then there's the worry of what happens if you chose wrong.

We were faced with exactly these issues, and the extremes of them, to boot. We have a *large*, *very* high write ratio, lots of indexes type of application based on ZPublisher/DTML that we'd like to port to/replace with something Zope based. Yet we might need to make another instance of this same type of application used by only a few people with a small amount of data -- it would really suck to have to have another instance of the same expensive database system to support a minuscule amount of data, because everything was coded only with SQL in mind.

This is what led ultimately to ZPatterns -- you can write applications and not have to decide up front on ZODB or SQL. And you can change your mind later (Seen that TV commercial? Suddenly your online store is selling a zillion items per month instead of the 1000 you planned for. Oops!). You can even decide on an instance by instance basis. You configure with ZODB for a small department or client, and Oracle or Sybase for a huge one -- and the small guy doesn't have to pay for the DB license and DBA! Since then, we've discovered a number of other benefits to the model.

Hmmm... I didn't intend to write a ZPatterns advertisement when I started, honest! But this seems to have turned into one nonetheless :^)
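The write-amplification figures in this message can be put into a rough back-of-the-envelope form. The sketch below uses the 2n+4 BTree count (itself flagged as a guess in the message) and the reported 1K object and 47K growth; the function names, the tree depth, and the node size are hypothetical inputs chosen only to show the shape of the estimate, not measurements.

```python
def btrees_in_catalog(n_indexes):
    # Rough recollection from the message: about 2n + 4 BTrees for n indexes.
    return 2 * n_indexes + 4


def bytes_written_per_add(object_bytes, n_indexes, tree_depth, node_bytes):
    """Estimate bytes appended for one indexed add, assuming one node at every
    level of every BTree is rewritten (the per-level count behaviour)."""
    rewritten_nodes = btrees_in_catalog(n_indexes) * tree_depth
    return object_bytes + rewritten_nodes * node_bytes


# Figures reported in the message: ~1K object, ~47K file growth per transaction.
observed_amplification = (47 * 1024) / (1 * 1024)

# Hypothetical inputs (not measurements): 4 indexes, depth 2, 2K nodes.
estimate = bytes_written_per_add(
    object_bytes=1024, n_indexes=4, tree_depth=2, node_bytes=2048
)

print(observed_amplification)  # 47.0
print(estimate)                # 50176
```

The point of the model is only that the index term grows with the number of trees times their depth, so it can dwarf the object itself.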

Jim Fulton

28 Jun

2:57 p.m.


Ty Sarna wrote:

...

In article <000d01bfddfb$4546f070$3e48a4d8@digicool.com>, Evan Simpson <evan@digicool.com> wrote:

...
----- Original Message ----- From: Jimmie Houchin <jhouchin@texoma.net>

...
Will an app as described above still suffer from problems with high writes?

Possibly, but only if there are hidden hotspots. For example, in your [...]

2. Implement the application-level conflict handling you read about, so that Folders and Catalogs can decide that two writes don't conflict after all, and merge them into a single update.

Unfortunately, this doesn't deal with cases where the conflicting state is contained in many objects (see note by PJE in the ZODB Wiki).

Yes it does. (See my response to PJE's note.)

...

Also, there is a whole other area of difficulty for high-write-volume ZODBs, which is the amount of IO that needs to be done. First, by nature ZODB can't rewrite a single attribute of an object, it has to rewrite the entire thing.

Each object (that subclasses Persistent) is analogous to a database record. When you modify a part of the object (that isn't its own persistent object) then you write the entire record. This seems pretty reasonable to me. Part of ZODB database design, where it matters, is to balance the size of database objects. If objects are too big, then the amount of data written on a change is larger. If objects are too small, then you may incur too much persistence overhead. Most apps don't need this level of tuning.
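A small sketch of the record-size trade-off described above, using plain `pickle` as a stand-in for how a record is serialized. The dictionaries, the `record_size` helper, and the `"oid-of-body"` reference are hypothetical; real persistent references and record overhead differ, so treat the numbers only as an illustration.

```python
import pickle


def record_size(state):
    """Bytes needed to store one record (one pickled object state)."""
    return len(pickle.dumps(state))


body = "x" * 10_000   # large, rarely changed attribute
title = "Hello"       # small, frequently changed attribute

# Design 1: one record holds everything, so a title change rewrites the body too.
one_record = {"title": title, "body": body}
cost_single = record_size(one_record)

# Design 2: the body lives in its own record; the parent keeps only a reference.
parent = {"title": title, "body_ref": "oid-of-body"}  # hypothetical reference
cost_split = record_size(parent)

print(cost_single > 10_000, cost_split < 200)  # True True
```

Splitting pays off here because the title changes often and the body rarely; splitting every small attribute into its own record would instead add per-record overhead.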

...

Indexing is also a bear from an IO perspective. First, BTrees currently keep a count at each level, so every change to a btree changes a node at each level of the BTree. For a ZCatalog, there are a lot of btrees (something like 2n+4 for n indexes, I think -- don't quote me on that, it's been a while), and each one changes (last I looked, every index was updated even if the value indexed in a particular one hadn't changed. This may have been improved since). Not only is this bad from a hotspot point of view (always a conflict on the root node of the tree), but you end up doing a *lot* of IO. During my experiments that led to BerkeleyStorage, I was watching the Data.fs grow by 47K per transaction for adding indexed objects of ~1K in size. Watching this with tranalyzer, this turns out to be 1K of object, and 46K of updated btree pages :).

This is a significant problem. The current BTree implementation, which predates Principia, was designed for very different applications than it's being used for now. We are working on a new BTree implementation that does away with these counts. This should have a huge impact. We are also looking at getting rid of other hot spots in the current ZCatalog (e.g. internal id assignment).

...

Note that BerkeleyStorage only prevents the file from growing that much -- it still has to do all that IO (in fact, it has to do ~2-3 times that much IO, due to the nature of BerkeleyDB). A relational storage would have similar issues. For amount of IO done, FileStorage is about as efficient as you can possibly be -- it's just that it trades that off against space reclamation.

Also, with any kind of Berkeley or Relational storage, there is a second hidden IO and storage penalty: you're storing a btree inside a btree. In other words, the lower-level DB uses btrees to store your objects, including interior nodes of the higher-level ZODB btree. Every interior node of the ZODB Btree needs a leaf node (and supporting interior nodes) in the DB's btrees, so you get taxed twice, on both I/O and storage space used.

I don't agree with the conclusion of this analysis. The indexes used in the underlying storage are indexing totally different information. They are effectively using indexes to provide persistent memory management. They aren't indexing the application keys.

OTOH, I have some sympathy with a related issue. You and Phillip have argued that the ZODB should provide indexes, rather than leaving indexes to application level code, to avoid maintaining undo information for indexes. After all, indexes can, in theory, be recomputed from data records after an undo. While I think that this idea has some merit, I don't think it offers enough benefit to make it a high priority.

Jim

Jim Fulton

2:36 p.m.


Jimmie Houchin wrote:

...

I know this has been brought up before, but I don't remember if this question has been asked.

I've been reading everything I could find on the ZODB. Currently I am reading 'Introduction to the Zope Object Database' by Jim Fulton at http://www.zope.org/Members/jim/Info/IPC8/ZODB3/index.html .

Towards the end of the paper under Status, he starts referring to future features. Under 'Application-level conflict resolution protocols' he states this, 'Applications with much higher write to read ratios are likely to encounter frequent conflict errors which can seriously affect performance.'

Is this pretty much the primary reason that it is generally said that the ZODB isn't as well suited to high write situations?

Yes. ZODB uses an optimistic concurrency control protocol, which assumes that conflicts are rare.

...

Or is there more to it than that?

The reason I ask is this.

My app by its nature as a community/portal will have plenty of writes in certain areas. However, due to the structure of the website and the app most all writes are to unique objects, are appends to an object or the person editing the object is the owner and has no one to conflict with. In this case there should be little if any ConflictErrors due to different users trying to commit changes to the same object.

Right, assuming the objects aren't indexed. If they are indexed, then modifications to the indexes could conflict. This is aggravated by the current index design, as Ty pointed out in a later message.

...

In the case of the appends, it would be like a bboard and there would be no conflict. Appends will be done in the order received.

Appends are still writes, so they would conflict. It's possible that a conflict resolution protocol, http://www.zope.org/Members/jim/ZODB/ApplicationLevelConflictResolution, could be used to cause appends to be non-conflicting. I plan to implement this protocol on the database side for Zope 2.3. You'd have to implement the application side of it.

Jim
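The application side of such a protocol might look roughly like the three-way merge below. In later ZODB releases the hook is a method named `_p_resolveConflict(oldState, savedState, newState)`; the rest of this sketch is an assumption. `MessageBoard` is a hypothetical plain class (a real one would subclass `persistent.Persistent`), the states are plain dicts, and `ValueError` stands in for the conflict error a real implementation would raise.

```python
class MessageBoard:
    """Append-only list of messages with a three-way merge hook.

    Hypothetical plain class so the sketch runs without ZODB installed.
    """

    def __init__(self):
        self.messages = []

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # old_state:   state both transactions started from
        # saved_state: state already committed by the other transaction
        # new_state:   state this transaction is trying to commit
        old = old_state["messages"]
        saved = saved_state["messages"]
        new = new_state["messages"]
        # Only pure appends are merged; anything else stays a conflict.
        if saved[:len(old)] != old or new[:len(old)] != old:
            raise ValueError("not a pure append; cannot resolve")
        merged = dict(saved_state)
        merged["messages"] = saved + new[len(old):]
        return merged


board = MessageBoard()
old = {"messages": ["first post"]}
saved = {"messages": ["first post", "reply from A"]}
new = {"messages": ["first post", "reply from B"]}

resolved = board._p_resolveConflict(old, saved, new)
print(resolved["messages"])  # ['first post', 'reply from A', 'reply from B']
```

The merge keeps the already committed append first and adds the late one after it, which matches the "appends in the order received" behaviour asked about in the original question.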

