[ZODB-Dev] LargeFS
Tamas Hegedus
hegedus at med.unc.edu
Sat Feb 11 09:56:57 EST 2006
Hi,
I beg for your patience :-)
I ran into a problem that I cannot solve. It could be something very
trivial, but ZODB seems so simple that I cannot figure out what
the problem could be.
I have two text files:
1. 800M, approx 200,000 records
2. 6G, approx 2,000,000 records
I parse each record into an object (SProt.Record(Persistent)) and store
it in ZODB. To do this I have a script that I run twice (see the code
at the bottom).
RUN#1: everything is OK, populated in approx 20 minutes, Data.fs size is
approx 800M; interestingly the *fs.tmp remains 800M after closing the
connection.
RUN#2:
- I comment out the OOBTree() line and change the name of the file to parse;
- Data.fs.tmp becomes 0 (zero) bytes;
- parsing is running; the processor is used extensively; approx 500M RAM is
used by the script; no significant increase in swap usage;
- the sizes of Data.fs and Data.fs.index do not change; Data.fs.tmp stays
at zero;
- data 'population' stops after approx the 1,300,000th object with the
error message below ('No space left on device'), although I have not run
out of space on any of my hard disk partitions...
???
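To double-check the free-space claim, something like the following could be run. It is only a sketch: free_megabytes is a helper name made up for illustration, and looking at the temporary directory as well is an assumption (temporary files might end up on a different partition than Data.fs).

```python
import os
import tempfile


def free_megabytes(path):
    """Return the space available to unprivileged users, in MB,
    on the filesystem that holds `path` (Unix only)."""
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize // (1024 * 1024)


# The directories below are assumptions; substitute the directory
# that actually holds Data.fs.
for label, path in [("current directory", os.getcwd()),
                    ("temporary directory", tempfile.gettempdir())]:
    print("%-20s %-30s %d MB free" % (label, path, free_megabytes(path)))
```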
I think these are the 2 major observations:
1. I have to figure out why I cannot write into the ZODB in the
second run.
2. I tried to use commit() instead of savepoint() after every 50,000
objects and then tried some queries: the db populated with a lot of
commits was significantly slower, although I did not make any serious
measurements... Let's say not 0.5s but 2s.
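The two variants compared in point 2 can be sketched as below. This is an illustration only, not the actual script: populate and its checkpoint parameter are names made up here, and a plain dict stands in for the OOBTree so that the snippet runs without ZODB. Passing transaction.commit as checkpoint would give the commit() variant, and lambda: transaction.savepoint(True) the savepoint() variant.

```python
def populate(records, store, checkpoint, batch_size=50000):
    """Store every (key, value) pair in `store` and call `checkpoint()`
    after each full batch of `batch_size` records.

    Returns the number of records stored.
    """
    count = 0
    for key, value in records:
        store[key] = value
        count += 1
        if count % batch_size == 0:
            checkpoint()
    return count


# Stand-ins for the real database and checkpoint call (assumptions):
store = {}
calls = []  # records the store size at each checkpoint
total = populate(((str(n), n) for n in range(10)),
                 store,
                 lambda: calls.append(len(store)),
                 batch_size=3)
print(total)
print(calls)
```

The final commit after the loop is left to the caller, as in the script at the bottom.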
Thanks for your time,
Tamas
-----------------------------------------------------------------------
ERROR MSG:
Traceback (most recent call last):
  File "/home/hegedus/mypy/obiodb/populateUP.py", line 40, in ?
    transaction.savepoint(True)
  File "build/lib.linux-i686-2.4/transaction/_manager.py", line 110, in savepoint
  File "build/lib.linux-i686-2.4/transaction/_transaction.py", line 297, in savepoint
  File "build/lib.linux-i686-2.4/transaction/_transaction.py", line 294, in savepoint
  File "build/lib.linux-i686-2.4/transaction/_transaction.py", line 674, in __init__
  File "build/lib.linux-i686-2.4/ZODB/Connection.py", line 1060, in savepoint
  File "build/lib.linux-i686-2.4/ZODB/Connection.py", line 526, in _commit
  File "build/lib.linux-i686-2.4/ZODB/Connection.py", line 554, in _store_objects
  File "build/lib.linux-i686-2.4/ZODB/Connection.py", line 1188, in store
IOError: [Errno 28] No space left on device
------------------------------------------------------------------------
MY LOGs
for file #1
Fri Feb 10 22:09:12 2006
50000 Fri Feb 10 22:12:36 2006
100000 Fri Feb 10 22:16:08 2006
150000 Fri Feb 10 22:19:36 2006
200000 Fri Feb 10 22:22:56 2006
Fri Feb 10 22:28:16 2006
for file#2:
Fri Feb 10 22:40:24 2006
50000 Fri Feb 10 22:42:18 2006
100000 Fri Feb 10 22:44:21 2006
150000 Fri Feb 10 22:46:23 2006
200000 Fri Feb 10 22:48:30 2006
250000 Fri Feb 10 22:50:50 2006
300000 Fri Feb 10 22:52:59 2006
350000 Fri Feb 10 22:55:18 2006
400000 Fri Feb 10 22:57:15 2006
450000 Fri Feb 10 22:59:20 2006
500000 Fri Feb 10 23:01:32 2006
550000 Fri Feb 10 23:03:54 2006
600000 Fri Feb 10 23:06:40 2006
650000 Fri Feb 10 23:08:46 2006
700000 Fri Feb 10 23:10:51 2006
750000 Fri Feb 10 23:12:55 2006
800000 Fri Feb 10 23:15:00 2006
850000 Fri Feb 10 23:17:16 2006
900000 Fri Feb 10 23:19:12 2006
950000 Fri Feb 10 23:21:04 2006
1000000 Fri Feb 10 23:23:06 2006
1050000 Fri Feb 10 23:24:58 2006
1100000 Fri Feb 10 23:27:09 2006
1150000 Fri Feb 10 23:29:11 2006
1200000 Fri Feb 10 23:31:36 2006
1250000 Fri Feb 10 23:34:00 2006
1300000 Fri Feb 10 23:36:12 2006
1350000 Fri Feb 10 23:38:28 2006
1400000 Fri Feb 10 23:40:39 2006
1450000 Fri Feb 10 23:42:49 2006
1500000 Fri Feb 10 23:44:57 2006
------------------------------------------------------------------------
SCRIPT:
#--- imports - skipped ----
db = ZODB.config.databaseFromURL("etc/zodb.conf")
connection = db.open()
droot = connection.root()
#droot['uniprot'] = OOBTree()
upDb = droot['uniprot']
#--------------------------------------------------------------
it = SProt.Iterator( open( '/home/src/uniprot/7.0/uniprot_trembl.dat'),
#it = SProt.Iterator( open( '/home/src/uniprot/7.0/uniprot_sprot.dat'),
                     SProt.RecordParser())
ofile = open( '/home/hegedus/mypy/obiodb/docs/myLogFileN.txt', 'w')
i = 0
ofile.write( time.asctime() + '\n')
for rec in it:
    acc = copy.deepcopy( rec.accessions[0])
    upDb[acc] = rec
    i += 1
    if i % 50000 == 0:
        transaction.savepoint(True)
        ofile.write( "%s\t%s\n" % (i , time.asctime()))
        ofile.flush()
transaction.commit()
connection.close()
print time.asctime()
--
Tamas Hegedus, PhD | phone: (1) 919-966 0329
UNC - Biochem & Biophys | fax: (1) 919-966 5178
5007A Thurston-Bowles Bldg | mailto:hegedus at med.unc.edu
Chapel Hill, NC, 27599-7248 | http://biohegedus.org