[Zope] Re: Running more than one instance on windows often block each other
Sune B. Woeller
sune at syntetisk.dk
Wed Jul 27 08:03:47 EDT 2005
I will try to recreate the problem on other
flavours of windows asap. I will get back to you
later.
I guess my reporting was a bit too quick, sorry:
I'm running Python 2.3.5 (installed from the Windows binary),
Zope 2.7.7 (not necessary for the test scripts) and
Windows XP Home SP2 (blush - my laptop came with that... ;) )
Sune
Tim Peters wrote:
> [Sune Brøndum Wøller]
>
>>Thanks for the pointer. I have been debugging
>>select_trigger.py, and have some more info:
>>
>>The problem is that the call a.accept() sometimes hangs.
>>Apparently a.bind(self.address) allows us to bind to
>>a port that another zope instance already is bound to.
>>
>>The code creates the server socket a, and the client socket w,
>>and gets the client socket r by connecting w to a. Then it closes a.
>>a goes out of scope when __init__ terminates, and is probably garbage
>>collected at some point.
>
>
> Unless you're using a very old Python, `a` is collected before the
> call returns (where "the call" means the call of the function in which
> `a` is a local variable). Very old Pythons had an idiotic __del__
> method attached to their Windows socket wrapper, which inhibited
> timely gc.
>
>
>>I tried moving the code to the following standalone script, and I can reproduce
>>the error with that. In the original code w is kept as an instance variable, and
>>r is passed to asyncore.dispatcher.__init__ and probably kept there.
>
>
> Yes, the socket bound to `r` also gets bound to `self.socket` by this call:
>
> asyncore.dispatcher.__init__ (self, r)
>
>
>>I simulate that by returning them, then the caller of socktest can keep them
>>around.
>>
>>I try to call socktest from different processes A and B (two pythons):
>>(w,r = socktest())
>>The call in A gets port 19999. The second call, in B, either blocks, or takes
>>over port 19999 (I see the second process taking over the port in a port scanner.)
>
>
> Sorry, I can't reproduce this -- but you didn't give a test program,
> just an isolated function, and I'm not sure what you did with it. I
> called that function in an infinite loop, appending the return value
> to a global list, with a short (< 0.1 second) sleep between
> iterations, and closed the returned sockets fifty iterations after
> they were created. Ran that loop in two processes. No hangs, or any
> other oddities, for some minutes. It did _eventually_ hang (both
> processes at the same time), with netstat showing more than 4000
> sockets hanging around in TIME_WAIT state then. I assume I bashed
> into some internal Windows socket resource limit there, which Windows
> didn't handle gracefully. Attaching to the processes under the MSVC 6
> debugger, they were hung inside the MS socket libraries. Repeated
> this several times (everything appeared to work fine until > 4000
> sockets were sitting in TIME_WAIT, and then both processes hung at
> approximately the same time).
>
> Concretely:
>
> sofar = []
> try:
>     while 1:
>         print '.',
>         stuff = socktest() # calling your function
>         sofar.append(stuff)
>         time.sleep(random.random()/10)
>         if len(sofar) == 50:
>             tup = sofar.pop(0)
>             w, r = tup
>             msg = str(random.randrange(1000000))
>             w.send(msg)
>             msg2 = r.recv(100)
>             assert msg == msg2, (msg, msg2)
>             for s in tup:
>                 s.close()
> except KeyboardInterrupt:
>     for tup in sofar:
>         for s in tup:
>             s.close()
>
> Note that there's also a bit of code there to verify that the
> connected sockets can communicate correctly; the `assert` never
> triggered.
>
> You haven't said which versions of Windows or Python you're using. I
> was using XP Pro SP2 and Python 2.3.5. Don't know whether either
> matters.
>
> It was certainly the case when I ran it that your
>
>
>> print port
>
>
> statement needed to display ports less than 19999 at times, meaning that the
>
>
>> a.bind((host, port))
>
>
> did raise an exception at times. It never printed a port number less
> than 19997 for me. Did you ever see it print a port number less than
> 19999?
>
>
>>a.bind in B does not raise socket.error: (10048, 'Address already in use') as
>>expected, when the server socket in A is closed, even though the port is used by
>>the client socket r in A.
>
>
> I'm not sure what that's saying, but could be it's an illusion. For example,
>
>
> >>> import socket
> >>> s = socket.socket()
> >>> s.bind(('localhost', 19999))
> >>> s.listen(2)
> >>> a1 = socket.socket()
> >>> a2 = socket.socket()
> >>> a1.connect(('localhost', 19999))
> >>> a2.connect(('localhost', 19999))
> >>> b1 = s.accept()
> >>> b2 = s.accept()
> >>> b1[0].getsockname()
> ('127.0.0.1', 19999)
> >>> b2[0].getsockname()
> ('127.0.0.1', 19999)
>
>
> That is, it's normal for the `r` in
>
>
>> r, addr = a.accept()
>
>
> to repeat port numbers across multiple `accept()` calls, and indeed to
> duplicate the port number from the `bind` call. This always confused
> me (from way back in my Unix days -- it's not "a Windows thing"), and
> maybe it's not what you're talking about anyway.
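
The port repetition described above can be checked with a small self-contained script. This is only a sketch under assumptions: it is written for Python 3 rather than the Python 2.3.5 used in this thread, the function name `accepted_ports` is made up for illustration, and it binds to port 0 so the OS picks a free port instead of the fixed 19999.

```python
import socket

def accepted_ports(n_clients=2):
    """Return (listening_port, [local port of each accepted socket])."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clients, accepted = [], []
    try:
        # Port 0 is an assumption of this sketch: let the OS pick a free port.
        server.bind(("127.0.0.1", 0))
        server.listen(n_clients)
        listening_port = server.getsockname()[1]
        for _ in range(n_clients):
            c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            clients.append(c)
            c.connect(("127.0.0.1", listening_port))
            conn, _addr = server.accept()
            accepted.append(conn)
        # Each accepted socket reports the listening port as its local port.
        return listening_port, [conn.getsockname()[1] for conn in accepted]
    finally:
        for s in clients + accepted + [server]:
            s.close()
```

Calling `accepted_ports()` should return the listening port together with a list in which that same port appears once per accepted connection.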
>
>
>>If I remove a.close(), and keep a around (by passing it to the caller), a.bind
>>works as expected - it raises socket.error: (10048, 'Address already in use').
>
>
> As above, I'm seeing `bind` raise exceptions regardless.
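
For comparison, the failure that is expected while the listening socket is still open can be sketched as follows. Assumptions: Python 3, a made-up helper name `second_bind_errno`, an OS-assigned port instead of 19999, and no SO_REUSEADDR or similar option set on either socket. It does not reproduce the hang discussed in this thread.

```python
import socket

def second_bind_errno():
    """Bind a second socket to a port that a listening socket still holds.

    Returns the errno of the resulting OSError, or None if the second
    bind unexpectedly succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))      # OS-assigned port (assumption)
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            # On Windows this is the 10048 'Address already in use' error.
            return exc.errno
        return None
    finally:
        first.close()
        second.close()
```

A return value of `None` would correspond to the surprising case reported above, where the second bind goes through.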
>
>
>>But in the literature on sockets, I read it should be okay to close the server
>>socket and keep using the client sockets.
>>
>>So, is this a possible bug in bind() ?
>
>
> Sure feels that way to me, and I'm not seeing it (or don't know how to
> provoke it). But I'm not a socket expert, and am not sure I've ever
> met anyone who truly was ;-)
>
>
>>I have tested the new code from Tim Peters, it apparently works, ports are given
>>out by windows.
>>But could the same problem with bind occur here, since a is closed (and garbage
>>collected) ? (far less chance for that since we do not specify port numbers, I
>>know).
>>
>>I tried getting a pair of sockets with Tim's code, and then trying to bind a
>>third socket to the same port as a/r. And I got the same problem as above.
>
>
> Here I'm not sure what "the same problem" means, as you've described
> more than one problem. Do you mean that you get a hang? Or that you
> see suspiciously repeated port numbers? Or ...? Seeing concrete code
> might help.
>
> Last question for now: have you seen a hang on more than one flavor
> of Windows? Thanks for digging into this!
>
> [and Sune's code]
>
>>import socket, errno
>>
>>class BindError(Exception):
>>    pass
>>
>>
>>def socktest():
>>    """blabla
>>    """
>>
>>    address = ('127.9.9.9', 19999)
>>
>>    a = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
>>    w = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
>>
>>    # set TCP_NODELAY to true to avoid buffering
>>    w.setsockopt(socket.IPPROTO_TCP, 1, 1)
>>
>>    # tricky: get a pair of connected sockets
>>    host='127.0.0.1'
>>    port=19999
>>
>>    while 1:
>>        print port
>>        try:
>>            a.bind((host, port))
>>            break
>>        except:
>>            if port <= 19950:
>>                raise BindError, 'Cannot bind trigger!'
>>            port=port - 1
>>
>>    a.listen (1)
>>    w.setblocking (0)
>>    try:
>>        w.connect ((host, port))
>>    except:
>>        pass
>>    r, addr = a.accept()
>>    a.close()
>>    w.setblocking (1)
>>
>>    #return (a, w, r)
>>    return (w, r)
>>    #return w
>
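
The variant mentioned above, where Windows hands out the port instead of the code counting down from 19999, might look roughly like this. It is a sketch only and not the code Tim posted: it assumes Python 3, the name `connected_pair` is made up, and it uses a blocking connect for simplicity.

```python
import socket

def connected_pair():
    """Return (w, r): two connected TCP sockets on the loopback interface."""
    a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # temporary listener
    w = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # "write" end
    try:
        # set TCP_NODELAY to true to avoid buffering
        w.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        a.bind(("127.0.0.1", 0))        # port 0: the OS chooses a free port
        a.listen(1)
        w.connect(a.getsockname())      # connect to whatever port was chosen
        r, _addr = a.accept()           # "read" end
    except Exception:
        w.close()
        raise
    finally:
        a.close()                       # the listener is no longer needed
    return w, r
```

The caller keeps `w` and `r` around and closes both when done, as with the return value of `socktest()`.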