Created on 2017-05-09 16:00 by vstinner, last changed 2017-07-06 08:50 by vstinner. This issue is now closed.
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.6/builds/128/steps/test/logs/stdio test_invalid_authentication (test.test_imaplib.NewIMAPSSLTests) ... SENT: b'* OK IMAP4rev1' GOT: b'OMOH0 CAPABILITY' SENT: b'* CAPABILITY IMAP4rev1' SENT: b'OMOH0 OK CAPABILITY completed' GOT: b'OMOH1 AUTHENTICATE MYAUTH' SENT: b'+' GOT: b'ZmFrZQ==' SENT: b'OMOH1 NO [AUTHENTICATIONFAILED] invalid' ERROR ====================================================================== ERROR: test_invalid_authentication (test.test_imaplib.NewIMAPSSLTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_imaplib.py", line 223, in _cleanup self.client.shutdown() File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/imaplib.py", line 326, in shutdown self.sock.close() File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 417, in close self._real_close() File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/ssl.py", line 1052, in _real_close socket._real_close(self) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 411, in _real_close _ss.close(self) ConnectionResetError: [Errno 54] Connection reset by peer ---------------------------------------------------------------------- Ran 95 tests in 32.694s FAILED (errors=1, skipped=2) Warning -- threading._dangling was modified by test_imaplib Before: <_weakrefset.WeakSet object at 0x805e2b608> After: <_weakrefset.WeakSet object at 0x8014a25a0> test test_imaplib failed
See also bpo-30328 (test_ssl) and bpo-30315 (test_ftplib).
@Our SSL experts: any idea why sock.close() fails with "ConnectionResetError: [Errno 54] Connection reset by peer" in imaplib? Is that a new error? Should imaplib catch ConnectionResetError on sock.close()? By the way, see also bpo-30329: I proposed a patch to catch the Windows socket error 10022 on shutdown() for poplib and imaplib SSL connections. No idea why this error only occurs on SSL connections.
Making this an index of related reports: Issue 30319: test_imap Issue 30315: test_ftplib Issue 30543: test_timeout Issue 30328: test_ssl Issue 27784: test_asyncore.TestAPI_UseIPv6Select.test_handle_accept, test_socketserver Issue 30106: test_asyncore.TestAPI_UseIPv6Poll.test_handle_write These all look like a side effect of my change to raise an error from the OS as an exception when closing a socket, Issue 26685. Only 3.6+ is affected. According to <https://www.freebsd.org/cgi/man.cgi?close%282%29>, ECONNRESET means “The underlying object was a stream socket that was shut down by the peer before all pending data was delivered”. It seems this is specific to Free BSD. See bug report about Posix compliance: <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=159179>. According to <https://forums.zeroc.com/discussion/5569/patch-to-network-cpp-for-freebsd-for-econnreset-on-close-2-problem> this started in Free BSD 6.3 in 2006. I suppose the options are: 1. Completely revert Issue 26685 and ignore all “socket.close” errors (my least preferred option) 2. Ignore ECONNRESET in “socket.close” (backwards compatible, could use “os.close” if you really want to check for ECONNRESET) 3. Ignore ECONNRESET in the various higher-level libraries 4. Adjust the tests to ignore the error or otherwise avoid the problem
Before discussing revert, I would like experimenting fixing calls to sock.close(). How much changes do we need to ignore exeptions on sock.close()?
I think fixing all affected calls to socket.close in the world (option 3) would be too much. I just added two new reports (Issue 30652 and Issue 30391) as dependencies. These are about testing socketserver.TCPServer. As an example, to fix the socket.close call there, I think the change would look like class TCPServer: def close_request(self, request): try: request.close() except ConnectionError: # Suppress asynchronous errors such as ECONNRESET on Free BSD pass Instead of that change all over the place, I am thinking option 2 would be safest. In Modules/socketmodule.c, something like sock_close(PySocketSockObject *s) { Py_BEGIN_ALLOW_THREADS res = SOCKETCLOSE(fd); Py_END_ALLOW_THREADS /* ECONNRESET can occur on Free BSD */ if (res < 0 && errno != ECONNRESET) { return s->errorhandler(); } }
> 2. Ignore ECONNRESET in “socket.close” (backwards compatible, could use “os.close” if you really want to check for ECONNRESET) +1 from me.
Please, let's not use the "dependencies" field for issues which are not actual dependencies.
None of "27784,30106,30315,30328,30391,30652,30543" are dependencies of this issue.
tl; dr I agree to ignore ECONNRESET-like errors on socket.shutdown() and sock.close(). I'm not sure that I understand *exactly* the problem here. In which case closing a socket fails with ECONNRESET? Can it mean that the last write() was buffered and failed? If socket.send() is blocking, I expect that send() gets the ECONNRESET error. If socket.send() is non-blocking, the application has to be very carefully written to handle many kinds of errors. From a high lever point of view, I don't think that ECONNRESET is interested. I expect that a protocol at the application level doesn't rely on ECONNRESET, but a command like "QUIT". No? For a file on disk, it's different, since write() is always buffered and close() has to flush pending write on disk. For a network protocol, loosing the connection, loosing data, etc. is something "normal", not something execptional. That's why there are application-level commands to close cleanly a connection. Ok, now for SSL sockets... Is it also ok to ignore ECONNRESET on sock.close() for an SSL socket? -- ECONNRESET can occur on sock.close(), but not only: see bpo-30329, shutdown() fails can ENOTCONN on UNIX or WSAEINVAL on Windows. I modified poplib and imaplib recently to handle WSAEINVAL: commit 83a2c2879839da2e10037f5e4af1bd1dafbf1a52 Author: Victor Stinner <victor.stinner@gmail.com> Date: Mon May 15 17:33:45 2017 +0200 bpo-30329: Catch Windows error 10022 on shutdown() (#1538) Catch the Windows socket WSAEINVAL error (code 10022) in imaplib and poplib on shutdown(SHUT_RDWR): An invalid operation was attempted This error occurs sometimes on SSL connections.
About versions, socket.close() was modified in Python 3.6 (bpo-26685) to raise an OSError on error. So only Python 3.6 and 3.7 are impacted here.
Another buildbot failure which may be related: http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.6/builds/154/steps/test/logs/stdio test_write (test.test_socketserver.SocketWriterTest) ... Exception in thread <class 'socketserver.TCPServer'> serving: Traceback (most recent call last): File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/threading.py", line 916, in _bootstrap_inner self.run() File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/threading.py", line 864, in run self._target(*self._args, **self._kwargs) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 238, in serve_forever self._handle_request_noblock() File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 319, in _handle_request_noblock self.handle_error(request, client_address) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 317, in _handle_request_noblock self.process_request(request, client_address) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 349, in process_request self.shutdown_request(request) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 510, in shutdown_request self.close_request(request) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 514, in close_request request.close() File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 417, in close self._real_close() File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 411, in _real_close _ss.close(self) ConnectionResetError: [Errno 54] Connection reset by peer /usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/unittest/case.py:633: ResourceWarning: unclosed <socket.socket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 29862), raddr=('127.0.0.1', 29860)> outcome.errors.clear() /usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/unittest/case.py:633: ResourceWarning: unclosed <socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 29860)> outcome.errors.clear() test test_socketserver failed ok ====================================================================== ERROR: test_TCPServer (test.test_socketserver.SocketServerTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 175, in test_TCPServer self.stream_examine) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/support/__init__.py", line 2045, in decorator return func(*args) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 141, in run_server testfunc(svrcls.address_family, addr) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 153, in stream_examine buf = data = receive(s, 100) File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 46, in receive raise RuntimeError("timed out on %r" % (sock,)) RuntimeError: timed out on <socket.socket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 29862), raddr=('127.0.0.1', 29860)>
I propose to ignore ECONNRESET in socket.close() and ENOTCONN and WSAEINVAL on socket.shutdown(). If we see more failures later, we can extend these lists. asyncore also ignores ENOTCONN on socket.close(), but I don't trust this module at this point: it seems like asyncore also handles pipes, not only socket.socket. Errors seen on buildbots: * ECONNRESET (ConnectionResetError) on socket.close() * ENOTCONN and WSAEINVAL on socket.shutdown() -- that's why I ignored these errors in poplib and imaplib The Python standard library already ignores some errors on socket.shutdown() and socket.close() depending on the module: * poplib, imaplib: ignore ENOTCONN and WSAEINVAL on socket.shutdown() * asyncore: ignore ENOTCONN and EBADF on socket.close(). asyncore ignores much more errors on functions doing read, write or close (depending on the received event): ECONNRESET, ENOTCONN, ESHUTDOWN, ECONNABORTED, EPIPE, EBADF. In a function doing read+close, it says "winsock sometimes raises ENOTCONN". * asyncio: don't log a fatal error for EPIPE, ESHUTDOWN, ECONNRESET, ECONNABORTED errors. Lib/asyncio/base_events.py: # Exceptions which must not call the exception handler in fatal error # methods (_fatal_error()) _FATAL_ERROR_IGNORE = (BrokenPipeError, ConnectionResetError, ConnectionAbortedError) Mapping of exceptions to error codes: * BrokenPipeError: EPIPE, ESHUTDOWN * ConnectionResetError: ECONNRESET * ConnectionAbortedError: ECONNABORTED
shutdown() is not the same thing as close(). If you want to ignore errors on shutdown() you should open a separate issue (and argument for it, because "I saw the error on the buildbots" is not a sufficient reason IMHO).
Antoine Pitrou: "shutdown() is not the same thing as close()." I consider that the bug is similar to socket.close() error, since we also get an exception because the peer closed the connection. It's just the error code which is different. Antoine Pitrou: "If you want to ignore errors on shutdown() you should open a separate issue (and argument for it, because "I saw the error on the buildbots" is not a sufficient reason IMHO)." bpo-4473 modified poplib to ignore ENOTCONN on shutdown(), patch by Lorenzo Catucci committed by you :-) Commit d89824b0e2fa9a44b56394a5185de737a6527ea7. Commit 81c87c5e9a81728f4c7b022e3786ce11d53ed3c9 modified impalib to ignore ENOTCONN, commit written and pushed by you ;-) In the bpo-30329, I modified poplib and imaplib to also ignore WSAEINVAL on shutdown(). I don't see the need to separate socket.close() and socket.shutdown() error handling in two issues.
I wrote https://github.com/python/cpython/pull/2565 * socket.close() now ignores ECONNRESET error * socket.shutdown() now ignores ENOTCONN and WSAEINVAL errors
Le 04/07/2017 à 14:00, STINNER Victor a écrit : > > Antoine Pitrou: "shutdown() is not the same thing as close()." > > I consider that the bug is similar to socket.close() error, > [...] First, let's not call it a bug when an error is reported to the user. A bug would be to silence all errors just because they annoy one core developer. Second, close() and shutdown() are different functions operating at different levels. close() operates at the OS level and releases resources associated with the given file descriptor. It might, but might not, do any I/O (such as flush buffers or send a TCP RST) -- for example, if you duplicated a fd after fork(), calling close() on only one of them will do absolutely nothing at the I/O level. shutdown() operates at the transport level. Someone who calls shutdown() *must* be notified that the shutdown went wrong, because that's the only thing shutdown() does. shutdown() does nothing at the OS level. It makes sense to silence some errors in close() because, most of the time, people call close() to release resources and they don't care whether the other end of the connection was notified in time. It doesn't make sense to do the same for shutdown().
Ok, I rewrote my PR to only ignore ECONNRESET in socket.close().
I wrote https://github.com/python/cpython/pull/2565 because the ConnectionResetError error on socket.close() is more and more a cause of failures on FreeBSD buildbots. I mean that I fixed enough other bugs to start to only see one specific bug more often ;-) (Before we had dozens of random bugs.)
New changeset 67e1478dba6efe60b8e1890192014b8b06dd6bd9 by Victor Stinner in branch 'master': bpo-30319: socket.close() now ignores ECONNRESET (#2565) https://github.com/python/cpython/commit/67e1478dba6efe60b8e1890192014b8b06dd6bd9
New changeset 580cd5cd3603317d294a881628880a92ea221334 by Victor Stinner in branch '3.6': bpo-30319: socket.close() now ignores ECONNRESET (#2565) (#2566) https://github.com/python/cpython/commit/580cd5cd3603317d294a881628880a92ea221334
Ok. I modified socket.close() to ignore ECONNRESET in Python 3.6 and 3.7. I will leave the issue open a few days to see if it helps to reduce buildbot failure rate.
I don't see the link between this issue and bpo-30652 or bpo-30391, I removed these dependencies. I marked bpo-27784, bpo-30328, bpo-30543 and bpo-30315 as duplicates of this issue. bpo-30106 was already fixed, differently, but I mentioned this issue in it.
Antoine: FYI I abandonned my idea of ignoring errors on socket.shutdown(), since I agree with your rationale. An application may rely on shutdown() exception to trigger some events, and a socket can still be used after a shutdown().
Thank you Victor :-)
New changeset 8207c17486baece8ed0ac42d9f8d69ecec4ba7e4 by Victor Stinner in branch 'master': Revert "bpo-30822: Fix testing of datetime module." (#2588) https://github.com/python/cpython/commit/8207c17486baece8ed0ac42d9f8d69ecec4ba7e4
Thanks for handling this Victor. To answer some of your earlier questions, this is my understanding of the Free BSD behaviour (although I don't have Free BSD to experiment with): When writing to TCP sockets, I believe the data is buffered by the local OS (as well as the network, remote peer, etc). The send call will normally return straight away. In the background, the OS might combine the data with existing buffers, send it to the network, wait for acknowledgements, retransmit it, etc. On Free BSD, steps to trigger ECONNRESET might be: 1. Establish a TCP connection. 2. Send some data to the remote peer. OS returns immediately without indicating if data will successfully be sent. 3. Remote receives data packet, but decides the connection is not valid, so responds with reset message. Maybe its socket was shut down, or the OS rebooted. 4. Close the local socket. If TCP reset message was received in time, Free BSD raises ECONNRESET. I understand ECONNRESET is an _indication_ that not all pending data was delivered. But this is asynchronous, and a lack of ECONNRESET does not guarantee that all pending data was delivered. Imagine if steps 3 and 4 were swapped above. I doubt Free BSD will block the close call until the data is acknowledged, so it won't know if the peer will reset the connection in the future. To guarantee the data was delivered to the application (not just the remote OS), you do need an application-level acknowledgement. For SSL, when you call the top-level SSLSocket.close, I don't think that does much at the SSL level. Again, if you need delivery indication, I would use an app-level acknowledgement. Also beware that by default, Python doesn't report a secure EOF signal sent from the remote peer, so I think you either need a specific app-level message, or should disable the suppress_ragged_eofs mode (see Issue 27815). Antoine: sorry for abusing the dependencies list; I will try to avoid that in the future. It seemed the easiest way to get a two-way link to a bunch of other bugs that could be duplicates, but I wasn't sure at the time. My theory was if this bug was fixed, someone could review those other bugs and see if they could also be closed.
Thanks Martin for the long explanation. To simplify a lot, there is and was never any warranty that a successful sock.send() call delivered data to the peer. Each layer does its best, but the data can be lost at any layer, and the peer is free to close the connection *before* getting the data. Yeah, I agree that application level signaling is required.