[ZODB-Dev] Connection pool makes no sense
Tim Peters
tim at zope.com
Thu Dec 29 10:46:17 EST 2005
[??????? ????? ?????????????]
> Hi. A little bit of history... We have zope as an application server for
> heavy loaded tech process. We have high peaks of load several times a day
> and my question is about how can we can avoid unused connections to
> remain in memory after peak is passed? Before ZODB-3.4.1 connection pool
> was fixed size of pool_size and that caused zope to block down while load
> peaks. ZODB-3.4.2 that is shipped with Zope-2.8.5 has connection pool
> that does not limit the opened connections, but tries to reduce the pool
> to the pool_size and this behavior is broken IMO.
>
> Follow my idea... After peak load I have many (thousands of connections)
> that have cached up different objects including RDB connections. Those
> connections NEVER used after that. Why... well, because connection pool
> doesn't work correctly with _reduce_size method.
>
> # Throw away the oldest available connections until we're under our
> # target size (strictly_less=False) or no more than that
> # (strictly_less=True, the default).
> def _reduce_size(self, strictly_less=False):
> target = self.pool_size - bool(strictly_less)
> while len(self.available) > target:
> c = self.available.pop(0)
> self.all.remove(c) <--- Does this mean that we want to
> delete connection object from the memory?
No, it means that _ConnectionPool (the class to which method reduce_size()
belongs) no longer wishes to remember anything about connection `c`.
Nothing in _ConnectionPool _prevents_ `c` from going away then, but
references in application code can keep `c` alive for an arbitrarily long
time after this.
> If yes then why we use remove method of weakSet object?
_ConnectionPool no longer has any reason to remember anything about `c`, so
it would be wasteful for it to continue burning RAM keeping `c` in its weak
set. If ill-behaved application code is keeping `c` alive,
_ConnectionPool.all could grow without bound otherwise.
> It's nonsense.
It's defensive coding, protecting _ConnectionPool from some bad effects of
ill-behaved application code.
> # Same as a Set, remove obj from the collection, and raise
> # KeyError if obj not in the collection.
> def remove(self, obj):
> del self.data[id(obj)] <--- This just removes weekref from the
> weakValueDictionary not the object...
Any time you do
del dict[key]
on a dictionary `dict`, `key` is removed from `dict`, and so is the
associated value `dict[key]` (which is a weakref to `obj` in the snippet
above). The only _strong_ reference _ConnectionPool had to `c` was in its
.available queue, and that's gone too.
> So if we are willing to destroy obj - we are are wrong way here...
Sorry, it looks fine to me.
> Ok. Lets look at pop, push and repush...
>
> ...
>
> We do pop connection from self.available and do push and repush used
> connection to self.available, so there is no other way to obtain
> connection. Only from self.available. Good. But... look what _reduce_size
> method does. It pops first connection and tries to remove it in case of
> larger size of self.available list, so the connection is not in the
> self.available list and nobody can obtain it any more. Good... but it's
> good just in case of deletion of connection object from the memory. But
> it's still there!!! and it's cache too with opened RDB connections that
> will never serve anyone.
Then it's most likely that application code is retaining one or more strong
references to it. ZODB can't stop application code from doing that.
> I don't know if there is some other way to return connection to the pool.
A Connection `c` is returned to the pool as a result of explicitly calling
c.close().
> I do not know ZODB as a whole thing, but accordingly to the logic that I
> can see in these pieces of code and what I see every day after load peaks
> makes me believe that connection object SHOULD be deleted with the cached
> objects from the memory.
ZODB has never done that -- close() has always returned a Connection to a
pool for potential reuse, with its cache intact. Most people call that an
important optimization, because most people (meaning Zope ;-)) have only a
handful of Connections open at a time, and get real value out of reusing
cache content.
> And this can be done definitely not by deleting weakref.
The primary purpose to deleting the weakref is to prevent unbounded memory
growth of the .all set in the face of ill-behaved application code. It also
speeds Python's gc to destroy weakrefs that are no longer needed (otherwise
Python's gc has to spend time analyzing them).
> Or it seams to me as a memory leak.
If application code keeps strong references to closed Connection objects,
then yes, they'll certainly leak.
> I think better logic would be to have idle period along with pool_size.
> We should remove oldest connection from the pool that was not used for
> idle period. So we can have small pool_size and small connection pool
> that can grow with site load and shrink with low site activity.
Probably more productive to figure out the true cause of the leak first. It
_could_ be that you're opening Connections at such a furious pace that
Python's cyclic gc isn't running often enough to keep up with you; inserting
some calls to gc.collect() may (or may not ...) make a difference then.
> But. We can not delete connection object by del(obj) where obj is
> instance of connection class.
Nothing tries to do _that_. The code above works uses id(obj) as the key,
not `obj`.
> connection._cache does incref of connection object and connection._reader
> does the same. So we need to del them first.
A Connection and its cache reference each other. Refcounting isn't enough
to get rid of either; they only go away when Python's cyclic gc runs and
finds them in a trash _cycle_.
More information about the ZODB-Dev
mailing list