[ZODB-Dev] Automatic ZODB packing
Greg Ward
gward@mems-exchange.org
Wed, 16 May 2001 09:06:54 -0400
On 16 May 2001, Herbert Kwong said:
> Is there any external method or Zope products that
> enable the automatic packing of ZODB when it reaches
> certain size or at a certain date? The size of my
> ZODB seems to grow quite fast so I think it is safer
> to pack it before it becomes really too large.
cron is your friend. We have a cron job that runs at 02:00, Monday
through Friday, packs the database, and makes a gzip'ped backup of the
pre-pack "old" file. From the crontab:
0 2 * * 1,2,3,4,5 /www/mxpython/mems/scripts/save_db.py
I'll attach the script, but keep the following things in mind:
  * this is not a Zope installation, but ZODB used on its own
  * we use ZEO since we need concurrent access to the database --
    i.e. the web site that sits on top of this ZODB keeps
    running while the pack/backup proceeds
  * the script makes certain assumptions about the behaviour
    of packing a FileStorage through ZEO -- see comments in the script
  * you'll have to change some hard-coded paths
  * you might not want the gzip'ped nightly backup -- easy enough
    to remove that code
  * you could of course modify the script to put in a size check
    of the database file, and skip the pack/backup if it's smaller
    than your threshold
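
A minimal sketch of the size check mentioned in the last point, written
for current Python 3 rather than the Python of the attached script. The
function name `needs_pack` and the 500 MB threshold are hypothetical;
pick a threshold that suits your own database.

```python
import os

# Hypothetical threshold; the original message does not name one.
PACK_THRESHOLD_BYTES = 500 * 1024 * 1024


def needs_pack(path, threshold=PACK_THRESHOLD_BYTES):
    """Return True if the file at `path` is at least `threshold` bytes."""
    try:
        size = os.path.getsize(path)
    except OSError:
        # A missing or unreadable file cannot be packed, so skip it.
        return False
    return size >= threshold
```

A script like save_db.py could call this on the FileStorage path before
packing and exit early when it returns False.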
Greg
--
Greg Ward - software developer gward@mems-exchange.org
MEMS Exchange http://www.mems-exchange.org
[attachment: save_db.py]
#!/www/python/bin/python

"""save_db

This script will:
  1) check to see if there is enough disk space to copy the relevant files
  2) pack zeodb
  3) copy mxdb.fs.old to a backup directory

This script needs to run as root since it kills zeo and restarts
zeo and kills quixote to restart it

Intended to be run as a cron job
"""

# created 2001/03/28, EAO

__revision__ = "$Id: save_db.py,v 1.9 2001/05/15 14:26:24 gward Exp $"

import sys, os, string, time, shutil
import traceback, StringIO
import getopt

from mems.lib import base

USAGE = """\
usage: %s [options]

save_db: pack the database and copy the backup .old file.

options:
  -h, --help     Display this help message
  -v, --verbose  Verbose mode - print status to log
"""

# File to be backed up
SOURCE_FILE_PATH = '/www/var/mxdb.fs'

# This script assumes that FileStorage is being used, and will need
# significant surgery if we switch to another Storage class.  Here's
# one dependency: after packing a FileStorage, the old (unpacked) file
# is copied to mxdb.fs.old, and this is the file that we'll actually
# copy.  It's also assumed that the existence of mxdb.fs.old indicates
# that the pack is complete (currently a correct assumption) and that
# it's OK to delete the .old file.
OLD_SOURCE_FILE = SOURCE_FILE_PATH + '.old'

# Directory where the backups will be written
TARGET_DIR = '/www/var/backup'

# don't copy the database if fewer than FREE_DISK_SPACE bytes would be
# free after the copy
FREE_DISK_SPACE = 5*1024*1024           # 5 MB

# Number of minutes to wait for the pack to be completed.
WAIT_TIME = 5

LOG_FILE = '/www/log/backup.log'

class Options:
    def __init__ (self):
        self.help = 0
        self.verbose = 0


def log (msg, threshold=1, verbose=1):
    """Output a message to stdout with a timestamp prefix (but only
    if verbose >= threshold).
    """
    if verbose >= threshold:
        timestamp = time.strftime("[%Y-%m-%d %H:%M:%S] ",
                                  time.localtime(time.time()))
        sys.stdout.write(timestamp + msg + '\n')


def die (msg):
    sys.exit("save_db: error: " + msg + " (database not backed up)\n")

def main (prog, args):
    """Parse the options, check for free disk space, pack the database
    through ZEO, wait for the pack to finish, then gzip the pre-pack
    .old file into the backup directory.
    """
    usage = USAGE % prog

    # get a new instance of the Options class
    options = Options()
    opt_map = { '-h': "help",
                '-v': "verbose",
                '--help': "help",
                '--verbose': "verbose",
              }

    # get options
    try:
        (opts, args) = getopt.getopt(args, "hv", ["help", "verbose"])
    except getopt.error, msg:
        sys.exit(str(msg) + '\n\n' + usage)

    for (opt, val) in opts:
        # both options are simple flags, so just switch them on
        setattr(options, opt_map[opt], 1)

    # if help option, print out usage
    if options.help:
        print usage
        sys.exit(0)

    # Create backup directory if it doesn't exist
    if not os.path.exists(TARGET_DIR):
        os.mkdir(TARGET_DIR)

    # check to see if we have enough disk space to do the copy first:
    # (block size * free blocks) - (2 * mxdb.fs size), in bytes
    file_size = os.stat(SOURCE_FILE_PATH)[6]
    dstat = os.statvfs(TARGET_DIR)
    free_space = dstat[0] * long(dstat[3])
    spare = free_space - 2*file_size
    if spare < FREE_DISK_SPACE:
        die('not enough spare disk space for copy: would leave only %s kB' %
            (spare/1024))

    # pack the database - get rid of object versions older than the
    # current time
    log('packing database...', 1, options.verbose)
    if os.path.exists(OLD_SOURCE_FILE):
        os.unlink(OLD_SOURCE_FILE)

    base.init_database()
    zodb = base.get_database()
    zodb.pack(time.time())
    base.close_database()

    # The pack() function is asynchronous; it returns immediately
    # while the pack continues running in a new thread within the ZEO server.
    # Therefore, we need to wait until the .old file is created, up to
    # a maximum of WAIT_TIME minutes.
    secs = 0
    while not os.path.exists(OLD_SOURCE_FILE) and secs < WAIT_TIME*60:
        time.sleep(1)
        secs += 1

    if not os.path.exists(OLD_SOURCE_FILE):
        die('%s was not created after waiting %s seconds' %
            (OLD_SOURCE_FILE, secs))

    log('pack completed', 1, options.verbose)

    # generate timestamp to append to the backup filename
    timestamp = time.strftime("%Y%m%d%H%M%S", time.localtime(time.time()))
    basename = os.path.basename(SOURCE_FILE_PATH)
    target_file = os.path.join(TARGET_DIR,
                               "%s.%s.gz" % (basename, timestamp))

    # copy/compress the database
    log('gzipping %s to %s' % (OLD_SOURCE_FILE, target_file),
        1, options.verbose)
    cmd = "gzip -c %s > %s" % (OLD_SOURCE_FILE, target_file)
    status = os.system(cmd)
    if status != 0:
        die("gzip failed")

    log('backup completed', 1, options.verbose)


if __name__ == '__main__':
    main(sys.argv[0], sys.argv[1:])
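
The wait loop in the script polls for the .old file once a second, up to
WAIT_TIME minutes. A minimal sketch of the same idea as a reusable
helper, in current Python 3; the name `wait_for_file` and its parameters
are hypothetical and not part of the attached script. It measures
elapsed time with a monotonic clock instead of counting sleeps.

```python
import os
import time


def wait_for_file(path, timeout_secs, poll_interval=1.0):
    """Poll until `path` exists; return True if it appeared in time."""
    deadline = time.monotonic() + timeout_secs
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            # Gave up: the file never showed up within the timeout.
            return False
        time.sleep(poll_interval)
    return True
```

With the script's constants, the call would look something like
wait_for_file(OLD_SOURCE_FILE, WAIT_TIME * 60), followed by die() when
it returns False.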