[Zope] Re: Running more than one instance on windows often block each other

Sune B. Woeller sune at syntetisk.dk
Wed Jul 27 08:03:47 EDT 2005


I will try to recreate the problem on other
flavours of Windows asap and get back to you
later.

I guess my reporting was a bit too quick, sorry. I'm running:
Python 2.3.5 (installed from the Windows binary)
Zope 2.7.7 (not necessary for the test scripts)
Windows XP Home SP2 (blush - my laptop came with that... ;) )

Sune


Tim Peters wrote:
> [Sune Brøndum Wøller]
> 
>>Thanks for the pointer. I have been debugging
>>select_trigger.py, and have some more info:
>>
>>The problem is that the call a.accept() sometimes hangs.
>>Apparently a.bind(self.address) allows us to bind to
>>a port that another zope instance already is bound to.
>>
>>The code creates the server socket a, and the client socket w,
>>and gets the client socket r by connecting w to a. Then it closes a.
>>a goes out of scope when __init__ terminates, and is probably garbage
>>collected at some point.
> 
> 
> Unless you're using a very old Python, `a` is collected before the
> call returns (where "the call" means the call of the function in which
> `a` is a local variable).  Very old Pythons had an idiotic __del__
> method attached to their Windows socket wrapper, which inhibited
> timely gc.
> 
> 
>>I tried moving the code to the following standalone script, and I can reproduce
>>the error with that. In the original code w is kept as an instance variable, and
>>r is passed to asyncore.dispatcher.__init__  and probably kept there.
> 
> 
> Yes, the socket bound to `r` also gets bound to `self.socket` by this call:
> 
>     asyncore.dispatcher.__init__ (self, r)
> 
> 
>>I simulate that by returning them, then the caller of socktest can keep them
>>around.
>>
>>I try to call socktest from different processes A and B (two pythons):
>>(w,r = socktest())
>>The call in A gets port 19999. The second call, in B, either blocks, or takes
>>over port 19999 (I see the second process taking over the port in a port scanner.)
> 
> 
> Sorry, I can't reproduce this -- but you didn't give a test program,
> just an isolated function, and I'm not sure what you did with it.  I
> called that function in an infinite loop, appending the return value
> to a global list, with a short (< 0.1 second) sleep between
> iterations, and closed the returned sockets fifty iterations after
> they were created.  Ran that loop in two processes.  No hangs, or any
> other oddities, for some minutes.  It did _eventually_ hang-- and both
> processes at the same time --with netstat showing more than 4000
> sockets hanging around in TIME_WAIT state then.  I assume I bashed
> into some internal Windows socket resource limit there, which Windows
> didn't handle gracefully.  Attaching to the processes under the MSVC 6
> debugger, they were hung inside the MS socket libraries.  Repeated
> this several times (everything appeared to work fine until > 4000
> sockets were sitting in TIME_WAIT, and then both processes hung at
> approximately the same time).
> 
> Concretely:
> 
> import random, time
> 
> sofar = []
> try:
>     while 1:
>         print '.',
>         stuff = socktest()  # calling your function
>         sofar.append(stuff)
>         time.sleep(random.random()/10)
>         if len(sofar) == 50:
>             tup = sofar.pop(0)
>             w, r = tup
>             msg = str(random.randrange(1000000))
>             w.send(msg)
>             msg2 = r.recv(100)
>             assert msg == msg2, (msg, msg2)
>             for s in tup:
>                 s.close()
> except KeyboardInterrupt:
>     for tup in sofar:
>         for s in tup:
>             s.close()
> 
> Note that there's also a bit of code there to verify that the
> connected sockets can communicate correctly; the `assert` never
> triggered.
> 
> You haven't said which versions of Windows or Python you're using.  I
> was using XP Pro SP2 and Python 2.3.5.  Don't know whether either
> matters.
> 
> It was certainly the case when I ran it that your
> 
> 
>>        print port
> 
> 
> statement needed to display ports less than 19999 at times, meaning that the
> 
> 
>>            a.bind((host, port))
> 
> 
> did raise an exception at times.  It never printed a port number less
> than 19997 for me.  Did you ever see it print a port number less than
> 19999?
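
The count-down-on-failure logic being discussed here can be isolated as a small sketch. This is not the select_trigger.py code; it is written for current Python 3, the name bind_with_fallback is made up for illustration, and it assumes that a second bind to a port already held by a listening socket fails with an "address in use" error, which is the behaviour the thread expects.

```python
import socket


def bind_with_fallback(host, first_port, lowest_port):
    """Bind a TCP socket to the first free port, counting down from first_port.

    Returns (sock, port). Re-raises the last OSError if nothing in
    the range [lowest_port, first_port] can be bound.
    """
    port = first_port
    while True:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            return sock, port
        except OSError:
            # Bind failed (normally "address already in use"): try one lower.
            sock.close()
            if port <= lowest_port:
                raise
            port -= 1


# Hold a port with a listening socket, then ask for the same port again.
blocker = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
blocker.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
blocker.listen(1)
taken = blocker.getsockname()[1]

sock, port = bind_with_fallback("127.0.0.1", taken, taken - 50)
print(taken, port)  # the second number should be lower than the first

sock.close()
blocker.close()
```

If the printed port is ever equal to the one the blocker holds, bind did not raise, which would be the symptom described in this thread.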
> 
> 
>>a.bind in B does not raise socket.error: (10048, 'Address already in use') as
>>expected, when the server socket in A is closed, even though the port is used by
>>the client socket r in A.
> 
> 
> I'm not sure what that's saying, but could be it's an illusion.  For example,
> 
> 
> >>> import socket
> >>> s = socket.socket()
> >>> s.bind(('localhost', 19999))
> >>> s.listen(2)
> >>> a1 = socket.socket()
> >>> a2 = socket.socket()
> >>> a1.connect(('localhost', 19999))
> >>> a2.connect(('localhost', 19999))
> >>> b1 = s.accept()
> >>> b2 = s.accept()
> >>> b1[0].getsockname()
> ('127.0.0.1', 19999)
> >>> b2[0].getsockname()
> ('127.0.0.1', 19999)
> 
> 
> That is, it's normal for the `r` in
> 
> 
>>    r, addr = a.accept()
> 
> 
> to repeat port numbers across multiple `accept()` calls, and indeed to
> duplicate the port number from the `bind` call.  This always confused
> me (from way back in my Unix days -- it's not "a Windows thing"), and
> maybe it's not what you're talking about anyway.
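
The same observation can be checked as a script rather than an interactive session. This is only a sketch in current Python 3; it binds to port 0 so the OS picks a free port instead of the hard-coded 19999, and all variable names are illustrative. The point is that the accepted sockets report the listener's port as their local port, while the connecting sockets each get their own port.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose
listener.listen(2)
listen_port = listener.getsockname()[1]

clients = []
accepted = []
for _ in range(2):
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    c.connect(("127.0.0.1", listen_port))
    conn, peer = listener.accept()
    clients.append(c)
    accepted.append((conn, peer))

# Local port of each accepted socket: same as the listening port.
local_ports = [conn.getsockname()[1] for conn, _ in accepted]
# Port of each connecting socket, as seen by accept(): distinct per client.
peer_ports = [peer[1] for _, peer in accepted]
client_ports = [c.getsockname()[1] for c in clients]

print(listen_port, local_ports, peer_ports)

for c in clients:
    c.close()
for conn, _ in accepted:
    conn.close()
listener.close()
```

So a port scanner showing an accepted socket "on" the listening port does not by itself mean that anyone took the port over; a connection is identified by both endpoints, not by the local port alone.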
> 
> 
>>If I remove a.close(), and keep a around (by passing it to the caller), a.bind
>>works as expected - it raises socket.error: (10048, 'Address already in use').
> 
> 
> As above, I'm seeing `bind` raise exceptions regardless.
> 
> 
>>But in the literature on sockets, I read it should be okay to close the server
>>socket and keep using the client sockets.
>>
>>So, is this a possible bug in bind() ?
> 
> 
> Sure feels that way to me, and I'm not seeing it (or don't know how to
> provoke it).  But I'm not a socket expert, and am not sure I've ever
> met anyone who truly was ;-)
> 
> 
>>I have tested the new code from Tim Peters, it apparently works, ports are given
>>out by windows.
>>But could the same problem with bind occur here, since a is closed (and garbage
>>collected) ? (far less chance for that since we do not specify port numbers, I
>>know).
>>
>>I tried getting a pair of sockets with Tim's code, and then trying to bind a
>>third socket to the same port as a/r. And I got the same problem as above.
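
For readers without the new code at hand, the general idea of letting the OS hand out the port can be sketched like this. To be clear, this is not Tim's actual code, only an illustration of the approach in current Python 3; the function name connected_pair is made up, and it uses a plain blocking connect rather than the non-blocking connect in the original trigger code.

```python
import socket


def connected_pair(host="127.0.0.1"):
    """Return (w, r): two connected TCP sockets on the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port, so no port number is hard-coded.
        listener.bind((host, 0))
        listener.listen(1)
        w = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Disable Nagle's algorithm so small writes go out immediately.
            w.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            w.connect(listener.getsockname())
            r, _peer = listener.accept()
        except OSError:
            w.close()
            raise
    finally:
        # The listening socket is only needed to set up the connection.
        listener.close()
    return w, r


w, r = connected_pair()
w.sendall(b"ping")
reply = r.recv(100)
print(reply)
w.close()
r.close()
```

Whether a third socket can afterwards bind to the port that the closed listener used is exactly the open question above; the sketch does not answer it, it only shows the setup being discussed.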
> 
> 
> Here I'm not sure what "the same problem" means, as you've described
> more than one problem.  Do you mean that you get a hang?  Or that you
> see suspiciously repeated port numbers?  Or ...?  Seeing concrete code
> might help.
> 
> Last question for now:  have you seen a hang on more than one flavor
> of Windows?  Thanks for digging into this!
> 
> [and Sune's code] 
> 
>>import socket, errno
>>
>>class BindError(Exception):
>>    pass
>>
>>
>>def socktest():
>>    """blabla
>>    """
>>
>>    address = ('127.9.9.9', 19999)
>>
>>    a = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
>>    w = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
>>
>>    # set TCP_NODELAY to true to avoid buffering
>>    w.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
>>
>>    # tricky: get a pair of connected sockets
>>    host='127.0.0.1'
>>    port=19999
>>
>>    while 1:
>>        print port
>>        try:
>>            a.bind((host, port))
>>            break
>>        except:
>>            if port <= 19950:
>>                raise BindError, 'Cannot bind trigger!'
>>            port=port - 1
>>
>>    a.listen (1)
>>    w.setblocking (0)
>>    try:
>>        w.connect ((host, port))
>>    except:
>>        pass
>>    r, addr = a.accept()
>>    a.close()
>>    w.setblocking (1)
>>
>>    #return (a, w, r)
>>    return (w, r)
>>    #return w
> 