[ZODB-Dev] Experiences with Spread?
Tim Peters
tim at zope.com
Mon Jan 3 15:27:29 EST 2005
[A.M. Kuchling]
> I'm thinking about applying the Spread toolkit to a data replication
> problem. Because the Zope Replication Service uses Spread, I'd like to
> ask about your experiences with Spread. Did it work OK for you? Does it
> seem reasonably free of bugs? Any design, configuration, or usage issues
> to be aware of?
Spread works fine, although the Spread users list is probably a better place
to ask for dirt.
Use of Spread in ZRS may be overkill given the relatively simple one-way
replication ZRS supplies. We picked Spread for ZRS when more ambitious
plans seemed more imminently realistic, partly based on (the more ambitious)
Postgres-R's use of Spread.
Some downsides in practice:
- As for all systems that need to be told about network topology, Spread
configuration is delicate and utterly unforgiving. I don't know why,
but sysadmins seem to have a hard time keeping config files in
sync across machines. Earlier versions of Spread exacerbated this
problem by failing to do even simple sanity checks (name too long,
name duplicated, ...) on spread.conf files. This is better in the
current Spread, in part based on our feedback about typos in ZRS users'
spread.conf files that caused no end of grief. Getting Spread
running has usually been a real effort, but the cause has usually been
no more than Spread's config files on participating machines
containing incorrect info about the actual network topology, and/or
inconsistent info across participating machines (e.g., calling a machine
"andrew" in one box's spread.conf but "amk" in another's). IOW, pilot
error. Unfortunately, also as for other networked systems, so long as
the config is incorrect the only real symptom is "huh -- nothing seems
to be happening".
- Spread has extensive logging facilities, but you can't change what's
being logged short of restarting Spread. When an error is logged,
chances are high it won't make any sense to you; OTOH, the logging is
good in the sense that if you post the relevant piece of the log file
to the Spread users list, one of the Spread developers can usually
deduce a lot from it.
- There doesn't appear to be a practical way to rotate Spread log
files short of restarting Spread. Some attempts at piping Spread's
log output to a process that did its own rotation didn't work out,
although I don't recall any details; there was something peculiar
about Spread's output behavior that interfered with "the obvious"
workarounds.
- Spread doesn't work in the presence of NAT. "Real" network addresses
get embedded in Spread packets, and NAT breaks that.
- The Python Spread wrapper module is suffering from neglect, and is
still set up to work with Spread 3.17.1. I don't even know if it
*can* work with 3.17.3; I do know it at least needs changes to work
with 3.17.3 on Windows because the Spread project changed the names
of some files on Windows. OTOH, there are no bug reports open against
it, apart from one crazy bug due to someone changing symbols in
Spread's .h files and getting into alignment problems as a result.
So it's somewhat out of date, but solid.
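
The spread.conf problems in the first bullet above (duplicated names, overlong
names, files that disagree across machines) are the kind of thing a small
script can catch before Spread is started. What follows is only a rough sketch,
not anything ZRS or Spread ships. It assumes machine entries look like
"<name> <ip>" inside a "Spread_Segment ... { ... }" block, and the function
names and the 20-character name limit are hypothetical; check the limit against
your Spread version.

```python
import re
from collections import Counter

# Assumption: a machine entry inside a Spread_Segment block is "<name> <ip>".
MACHINE_LINE = re.compile(r"^([A-Za-z0-9_.-]+)\s+(\d{1,3}(?:\.\d{1,3}){3})$")


def parse_machines(conf_text):
    """Return the (name, ip) pairs listed inside Spread_Segment blocks."""
    machines = []
    in_segment = False
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("Spread_Segment"):
            in_segment = True
        elif line.startswith("}"):
            in_segment = False
        elif in_segment:
            match = MACHINE_LINE.match(line)
            if match:
                machines.append((match.group(1), match.group(2)))
    return machines


def check_configs(configs, max_name_len=20):
    """Check spread.conf texts for simple mistakes.

    configs maps a label for each box to the text of that box's spread.conf.
    max_name_len is an assumed limit, not taken from the Spread docs.
    Returns a list of problem descriptions (empty if nothing was found).
    """
    problems = []
    parsed = {host: parse_machines(text) for host, text in configs.items()}
    if not parsed:
        return problems

    # Per-file checks: duplicated names and names that are too long.
    for host, machines in parsed.items():
        counts = Counter(name for name, _ in machines)
        for name, count in counts.items():
            if count > 1:
                problems.append(f"{host}: name {name!r} appears {count} times")
            if len(name) > max_name_len:
                problems.append(
                    f"{host}: name {name!r} is longer than {max_name_len} chars"
                )

    # Cross-file check: every box should list exactly the same machines.
    reference_host = sorted(parsed)[0]
    reference = set(parsed[reference_host])
    for host in sorted(parsed):
        if host == reference_host:
            continue
        diff = reference ^ set(parsed[host])
        if diff:
            problems.append(
                f"{host}: machine list differs from {reference_host}: {sorted(diff)}"
            )
    return problems
```

Feeding it the text of each participating machine's spread.conf and printing
whatever comes back would flag the "andrew" vs. "amk" kind of mismatch. It
can't tell you whether the addresses match the real network topology.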