[ZODB-Dev] Very large amounts of output from fsrefs.py

Ingvar Hagelund ingvar at linpro.no
Wed Aug 27 07:32:53 EDT 2008


Hello, list

We have a solution based on a third-party Zope-based framework, with 12
Zope instances accessing ZODB via ZEO. Backend storage is Data.fs. For
reading we use ZEO client caching and a Squid reverse proxy in front of
Zope, which gives very good performance for most of the end users.
ZEO/ZODB is running on a machine separate from the ones running Zope.
The Zope instances used for write access are on a separate machine from
the ones used for serving output to end users.

Now, writing to the ZEO database is very slow, and the same goes for
searching, which is a problem for the users that put information into
the solution.

At this time, we can't give much info on the data structure used in the
database, but trust that the application is, uhm, well designed.

We want to find out whether we have missed any obvious ZEO tuning
options, or whether there are errors or corruption in the database.

Here are some key points:

Software environment: ZEO from Zope 2.10.3, python-2.4.4, Debian
GNU/Linux, Data.fs on ext3 filesystem

Hardware environment: 2xIntel Xeon dualcore 3.20GHz, 16GB RAM, Data.fs
on LSI-based MegaRAID-controller

Average cpu usage on the box running ZEO is some 10%, and never above
65% (of one cpu core).

The size of Data.fs is quite large, about 8.2GB after packing.

The database includes quite a large number of blobs, like pictures,
pdfs, documents and such.

Running fsrefs.py on Data.fs gives 1507079 instances of "refers to
invalid objects". This might indicate a problem, I would guess.
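
To make a number that large easier to interpret, the fsrefs.py output
could be grouped by the class of the referring object. The following is
only a rough sketch: it assumes that each report starts with an
unindented "oid <hex> <class>" line followed by a "refers to invalid
objects" line, and the function name summarize_fsrefs is made up for
illustration. It is meant to run on a saved copy of the output, not
against Data.fs itself.

```python
import re

# Assumed shape of a report header line: "oid 0x1a2b some.module.Class"
OID_HEADER = re.compile(r"^oid (0x[0-9a-fA-F]+) (\S+)")


def summarize_fsrefs(lines):
    """Count "refers to invalid objects" reports, grouped by class.

    Returns (total_reports, {class_name: count}).
    """
    total = 0
    by_class = {}
    current_class = None
    for line in lines:
        match = OID_HEADER.match(line)
        if match:
            # Remember which object the following report lines belong to.
            current_class = match.group(2)
        elif "refers to invalid object" in line:
            total += 1
            key = current_class or "<unknown>"
            by_class[key] = by_class.get(key, 0) + 1
    return total, by_class


# Usage, assuming the output was saved to a file named fsrefs.out:
#     with open("fsrefs.out") as f:
#         total, by_class = summarize_fsrefs(f)
```

If nearly all reports come from one or two classes, that would point at
a specific part of the application rather than at general corruption.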

The zeo.conf is attached below.

Now for some questions:

- Is a configuration like the one described a sensible setup?
- Are there any obvious key tuning knobs that we have missed?
- Is it possible to run an "fsck", or something like an
  "export/reimport" of the Data.fs, and thus repair or remove the
  invalid references?
- Would it be an idea to move binaries out of Data.fs and serve them
  from a shared filesystem instead?

Any other remarks are very welcome.

Regards,
Ingvar Hagelund



# zeo.conf
%define INSTANCE /var/lib/zope/zeo

<zeo>
  address 9998
  read-only false
  invalidation-queue-size 3000
  monitor-address 0.0.0.0:8101
</zeo>

<filestorage 1>
  path $INSTANCE/var/Data.fs
</filestorage>

<eventlog>
  level info
  <logfile>
    path $INSTANCE/log/zeo.log
  </logfile>
</eventlog>

<runner>
  program $INSTANCE/bin/runzeo
  socket-name $INSTANCE/etc/zeo.zdsock
  daemon true
  forever false
  backoff-limit 10
  exit-codes 0, 2
  directory $INSTANCE
  default-to-interactive true
  python /usr/local/zope/python2.4.4/bin/python2.4
  zdrun /usr/local/zope/2.10.3/lib/python/zdaemon/zdrun.py

  logfile $INSTANCE/log/zeo.log
</runner>


