Socket timeout can cause file-like readline() method to lose data #51571

beazley · 2009-11-14T17:16:32Z

BPO	7322
Nosy	@loewis, @gpshead, @amauryfa, @pitrou, @ned-deily, @bitdancer, @florentx, @vadmium
Files	test-issue7322.py: test case; 7322_v1.patch: Patch + test for issue; issue7322_new.patch; i7322.patch: Disallow further reads after timeout

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2009-11-14.17:16:31.739>
labels = ['type-bug', 'library']
title = 'Socket timeout can cause file-like readline() method to lose data'
updated_at = <Date 2015-10-27.11:52:45.924>
user = 'https://bugs.python.org/beazley'

bugs.python.org fields:

activity = <Date 2015-10-27.11:52:45.924>
actor = 'martin.panter'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2009-11-14.17:16:31.739>
creator = 'beazley'
dependencies = []
files = ['19711', '20202', '20386', '20398']
hgrepos = []
issue_num = 7322
keywords = ['patch']
message_count = 27.0
messages = ['95245', '121777', '121779', '121833', '121839', '121885', '124957', '124962', '126163', '126170', '126171', '126172', '126190', '126192', '126197', '126198', '126228', '126296', '126297', '129468', '146025', '146038', '146040', '146048', '146049', '253437', '253530']
nosy_count = 12.0
nosy_names = ['loewis', 'beazley', 'gregory.p.smith', 'amaury.forgeotdarc', 'roysmith', 'pitrou', 'ned.deily', 'r.david.murray', 'flox', 'dabeaz', 'rosslagerwall', 'martin.panter']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue7322'
versions = ['Python 3.2', 'Python 3.3']

beazley · 2009-11-14T17:16:30Z

Consider a socket that has had a file-like wrapper placed around it
using makefile()

# s is a socket created previously
f = s.makefile()

Now, suppose that this socket has had a timeout placed on it.

s.settimeout(15)

If you try to read data from f but nothing is available, you'll
eventually get a timeout. For example:

f.readline()   # Now, just wait
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/socket.py", line 406, in readline
    data = self._sock.recv(self._rbufsize)
socket.timeout: timed out

However, now consider the case where you're reading a line of data, but
the receiver has only received a partial line and it's waiting for the
rest of the data to arrive. For example, type this:

f.readline()

Now, go to the other end of the socket connection and send a buffer with
no newline character. For example, send the message "Hello".

Since no newline character has been received, the readline() method will
eventually fail with a timeout as before. However, if you now retry
the read operation f.readline() and send more data such as the message
"World\n", you'll find that the "Hello" message gets lost. In other
words, the repeated readline() operation discards any buffers
corresponding to previously received line data and just returns the new
data.

Admittedly this is a corner case, but you probably don't want data to be
discarded on a TCP connection even if a timeout occurs.

Hope that makes some sense :-). (It helps to try it out).
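
For anyone who wants to try the scenario without a second machine, here is a minimal sketch using `socket.socketpair()`. The function name `reproduce` and the 0.2 second timeout are illustrative assumptions, not part of the original report. Depending on the Python version, the second `readline()` may return only the late data or raise an error, so the sketch records both outcomes rather than assuming one.

```python
import socket


def reproduce(timeout=0.2):
    """Return the outcomes of two readline() calls around a timeout."""
    reader_sock, peer = socket.socketpair()
    reader_sock.settimeout(timeout)
    f = reader_sock.makefile("rb")
    try:
        peer.sendall(b"Hello")          # partial line, no newline yet
        try:
            first = f.readline()
        except socket.timeout:          # same class as TimeoutError on 3.10+
            first = "timeout"

        peer.sendall(b"World\n")        # the rest of the line arrives late
        try:
            second = f.readline()
        except OSError as exc:          # some versions refuse to read again
            second = "error: %s" % exc
        return first, second
    finally:
        f.close()
        reader_sock.close()
        peer.close()
```

Calling `reproduce()` and printing the result shows what the second call produced on your interpreter; the thing to look for is whether the complete line `b"HelloWorld\n"` ever comes back.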

roysmith · 2010-11-20T21:08:57Z

I'm looking into this now. In the meantime, I've opened a marginally-related bug, bpo-10473

roysmith · 2010-11-20T21:14:11Z

Attaching a test case which demonstrates the bug.

ned-deily · 2010-11-21T02:05:01Z

This would seem to be an invalid test case. It is specifically documented that socket.makefile does not support this: "The socket must be in blocking mode (it can not have a timeout)".

http://docs.python.org/py3k/library/socket.html#socket.socket.makefile

I suppose socket.makefile could initially check the socket and throw an exception if the socket is non_blocking but the program could later change the socket to non_blocking and the same issue would presumably arise.

Recommend closing as invalid.

roysmith · 2010-11-21T02:36:03Z

This is kind of ugly. On the one hand, I'm all for adding a check in makefile() to catch it being called on a non-blocking socket.

On the other hand, you are correct that a user could change the mode later. Even if we added checks for this in socket.setblocking(), there are plenty of ways to get around that; it's easy to grab a raw file descriptor and do whatever you want with it behind the library's back.

On the third hand, maybe a check could be added to SocketIO.readinto() to verify that the socket was in blocking mode each time it was called?

ned-deily · 2010-11-21T08:45:24Z

I see bpo-7995 also addresses the issue of accept sockets inheriting nonblocking status and provides a suggested patch.

rosslagerwall · 2010-12-31T08:14:59Z

Attached is a patch which fixes the issue.

Instead of allowing the readline method to lose data, it adds a check to SocketIO.readinto() to ensure that the socket does not have a timeout and throws an IOError if it does. Also does the same for SocketIO.write().

I think this is a better approach - just failing immediately when a readline on a nonblocking socket occurs instead of failing sometimes and losing data.

pitrou · 2010-12-31T10:47:38Z

While this patch looks conformant to the documentation, it is very likely to break code in the wild. Even in the stdlib, there are uses of makefile() + socket timeouts (e.g. in http.client and urllib). It would be better to find a way to make readline() functional even with socket timeouts.

rosslagerwall · 2011-01-13T09:27:59Z

How about this?

Instead of just losing the data that's been read so far in readline(), this patch adds the data as a new field to the exception that is thrown - this way the semantics remain exactly the same but the data is not discarded when a timeout occurs, it is still accessible via the exception.
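
As a rough illustration of the idea (not the attached patch itself), the sketch below raises a timeout exception that carries whatever was read before the timeout. The names `LineTimeout`, `partial`, and `readline_or_partial` are hypothetical, and the reader pulls one byte at a time purely to keep the example short.

```python
import socket


class LineTimeout(socket.timeout):
    """Hypothetical timeout error that carries the bytes read so far."""

    def __init__(self, partial):
        super().__init__("timed out")
        self.partial = partial


def readline_or_partial(sock):
    """Read one line; on timeout, raise LineTimeout with the partial data."""
    chunks = []
    while True:
        try:
            byte = sock.recv(1)         # one byte at a time: simple, not fast
        except socket.timeout:
            raise LineTimeout(b"".join(chunks)) from None
        if not byte:                    # peer closed the connection
            return b"".join(chunks)
        chunks.append(byte)
        if byte == b"\n":
            return b"".join(chunks)
```

With this shape, a caller that wants the whole line has to catch `LineTimeout`, keep `exc.partial`, and prepend it to the result of the next call.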

dabeaz · 2011-01-13T14:10:29Z

Have any other programming environments ever had a feature where a socket timeout returns an exception containing partial data? I'm not aware of one offhand and speaking as a systems programmer, something like this might be somewhat unexpected.

My concern is that in the presence of timeouts, the programmer will be forced to reassemble the message themselves from fragments returned in the exception. However, one reason for using readline() in the first place is precisely so that you don't have to do that sort of thing.

Is there any reason why the input buffer can't be preserved across calls? You've already got a file-like wrapper around the socket. Just keep the unconsumed buffer in that instance.
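
A minimal sketch of that suggestion, written as a standalone helper rather than a change to the io stack: the unconsumed bytes live on the wrapper instance, so a timeout leaves them in place for the next call. The class name `LineReader` and its attributes are hypothetical.

```python
import socket


class LineReader:
    """Hypothetical line reader that keeps unconsumed bytes between calls."""

    def __init__(self, sock, bufsize=4096):
        self._sock = sock
        self._bufsize = bufsize
        self._pending = bytearray()

    def readline(self):
        while True:
            end = self._pending.find(b"\n")
            if end >= 0:
                line = bytes(self._pending[:end + 1])
                del self._pending[:end + 1]
                return line
            # May raise socket.timeout; self._pending is left untouched.
            data = self._sock.recv(self._bufsize)
            if not data:                # EOF: hand back whatever is left
                line = bytes(self._pending)
                self._pending.clear()
                return line
            self._pending += data
```

One trade-off of this design is that the pending buffer has no upper bound: a peer that never sends a newline makes it grow with every retry.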

pitrou · 2011-01-13T14:24:09Z

This is an interesting approach. The problem is that AFAICT the issue is not limited to readline. If you call e.g. read(10000) and the socket times out after having returned the first 5000 bytes, then those 5000 bytes might get lost as well (depending on specifics e.g. buffer size in the IO stack).

Generally there is no guarantee that a buffered object works "properly" when the raw IO object raises some exception intermittently; perhaps this should be fixed in a systemic way, although this would complicate things quite a bit.

Also, I don't think people try to reuse a socket after a timeout (or even try to salvage whatever data could be read before the timeout); usually they would instead abort the connection and treat the remote resource as unavailable. IMO, that's the one obvious use case for socket timeouts.

pitrou · 2011-01-13T14:27:50Z

By the way, I recently fixed the makefile() documentation:

“The socket must be in blocking mode; it can have a timeout, but the file object’s internal buffer may end up in a inconsistent state if a timeout occurs.”
(in http://docs.python.org/dev/library/socket.html#socket.socket.makefile)

I also added a small section dedicated to socket timeouts:
http://docs.python.org/dev/library/socket.html#notes-on-socket-timeouts

gpshead · 2011-01-13T17:47:14Z

"""Generally there is no guarantee that a buffered object works "properly" when the raw IO object raises some exception intermittently"""

I disagree. EINTR is a classic case of this and is something that buffering IO layers deal with all the time. (readline is just one example of a buffering io layer)

If there is a timeout and we can't determine if there is enough data to return for readline, we should buffer it and not return.

Maybe this means we need to disallow readline() with timeouts on unbuffered sockets since we can't determine if data will need to be buffered or not due to such a condition in advance.

The normal behavior for code calling readline() on a socket with a timeout is likely going to be to close it. Anything else does not make much sense. (someone may try, but really they're writing their I/O code wrong if they are using a socket timeout as a poor form of task switching ;)

pitrou · 2011-01-13T18:06:31Z

> """Generally there is no guarantee that a buffered object works
> "properly" when the raw IO object raises some exception
> intermittently"""
>
> I disagree. EINTR is a classic case of this and is something that
> buffering IO layers deal with all the time. (readline is just one
> example of a buffering io layer)

EINTR is a different matter. To handle EINTR in Python, it is enough to
call the signal handlers and then retry the system call (that's what is
done in SocketIO.readinto, although FileIO doesn't have such logic).
Only if the signal handler raises an exception (which it probably
shouldn't do, since asynchronous exceptions are very bad) do you abort
the operation.
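
The retry pattern for EINTR can be sketched in a few lines. The helper name `recv_retrying` is hypothetical, and on Python 3.5 and later the interpreter already retries interrupted system calls itself (PEP 475), so this is only meant to show the shape of the logic.

```python
def recv_retrying(sock, bufsize):
    """Call sock.recv(), retrying if a signal interrupts the call."""
    while True:
        try:
            return sock.recv(bufsize)
        except InterruptedError:
            # Interrupted by a signal (EINTR): nothing was consumed,
            # so simply issue the same call again.
            continue
```

Nothing has been read when EINTR is reported, which is why a blind retry is safe here; a timeout after a partial read does not have that property.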

You can't apply the same logic to a socket timeout; the timeout is
really an error condition and you certainly shouldn't retry the system
call (that would defeat the point of using a timeout). So, to handle it
in an entirely correct way, you need to add some out-of-band buffering
logic where you store the pending raw reads which have been done but
could not be returned to the user. That complicates things quite a bit,
especially given that it has to be grafted on at least two layers of the
IO stack (the raw IO layer, and the buffered IO layer). Ross' patch does
it, but incompletely (it lets the user handle the out-of-band data) and
only for readline() (while buffered read() would probably need it too).

> The normal behavior for code calling readline() on a socket with a
> timeout is likely going to be to close it. Anything else does not
> make much sense. (someone may try, but really they're writing their
> I/O code wrong if they are using a socket timeout as a poor form of task
> switching ;)

That's my opinion too. So, instead, of doing the above surgery inside
the IO stack, the SocketIO layer could detect the timeout and disallow
further access. What do you think?

rosslagerwall · 2011-01-13T20:25:02Z

> That complicates things quite a bit,
> especially given that it has to be grafted on at least two layers of the
> IO stack (the raw IO layer, and the buffered IO layer).

Also the TextIO layer I think.

> That's my opinion too. So, instead, of doing the above surgery inside
> the IO stack, the SocketIO layer could detect the timeout and disallow
> further access. What do you think?

So after a timeout occurs the file-object basically becomes worthless? Would it make sense to automatically call the close method of the file-object after this occurs?

pitrou · 2011-01-13T20:34:59Z

>> That's my opinion too. So, instead, of doing the above surgery inside
>> the IO stack, the SocketIO layer could detect the timeout and disallow
>> further access. What do you think?
>
> So after a timeout occurs the file-object basically becomes worthless?
> Would it make sense to automatically call the close method of the
> file-object after this occurs?

Actually, we only need to forbid further reads (writes would always
work). I think we should still let the user call the close method
themselves.

rosslagerwall · 2011-01-14T05:21:00Z

Attached patch disallows further reads after a timeout.
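
The approach can be sketched as a raw reader that remembers a timeout and refuses later reads. This is a sketch under assumptions, not the attached patch: the class name `TimeoutGuardedRaw` and the attribute `_timed_out` are hypothetical, and the error text follows the "cannot read from timed out object" message quoted elsewhere in this thread.

```python
import io
import socket


class TimeoutGuardedRaw(io.RawIOBase):
    """Hypothetical raw reader that refuses reads after a timeout."""

    def __init__(self, sock):
        super().__init__()
        self._sock = sock
        self._timed_out = False

    def readable(self):
        return True

    def readinto(self, b):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        if self._timed_out:
            raise OSError("cannot read from timed out object")
        try:
            return self._sock.recv_into(b)
        except socket.timeout:
            self._timed_out = True      # remember it; later reads now fail
            raise
```

Wrapping it as `io.BufferedReader(TimeoutGuardedRaw(sock))` gives a buffered reader whose `readline()` fails loudly after the first timeout instead of silently returning incomplete data. Writes are not affected because the flag is only consulted in `readinto()`.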

pitrou · 2011-01-14T19:51:19Z

Looks good and simple enough. I would probably shift the timeout test after the closed test, but that's almost a detail.

dabeaz · 2011-01-14T20:00:51Z

Just wanted to say that I agree it's nonsense to continue reading on a socket that timed out (I'm not even sure what I might have been thinking when I first submitted this bug other than just experimenting with edge cases of the socket interface). It's still probably good to precisely specify what the behavior is in any case.

pitrou · 2011-02-25T23:15:20Z

Committed in r88622 (3.3) and r88623 (3.2). The 2.7 implementation is too different for the patch to apply, so if you want to fix it too, feel free to upload a patch. Thank you!

bitdancer · 2011-10-20T16:27:47Z

This patch has caused a non-trivial regression between 3.2 and 3.2.1. The scenario in which I observed it is poplib. I create a POP3 connection with a timeout. At one point in its processing, poplib is reading lines until it gets a line '.\r\n', at which point the transaction is complete and it returns data to the caller. If the pop server fails to terminate the transaction, we get a timeout on the read. However, the POP server may still be alive, it may just have failed to close the transaction (servers have been observed in the wild that do this[*]). Before this patch, one could catch the socket.timeout and recover from the failed transaction (losing the transaction data, but that's OK because the transaction was incomplete...it would be better to get the partial transaction, but that's a poplib issue, not a socket issue). One could then continue processing, sending new transactions to the POP server and getting responses. After the patch, once the socket error is raised there is no way to continue poplib processing short of tearing down the connection and rebuilding it, and restarting the POP processing from the beginning.

Now, this is clearly an abnormal situation (a POP server randomly not completing its transactions), but it was observed in the wild, and does represent a regression. I think that Antoine's idea of making readline functional despite timeouts was the better approach.

Also note that Antoine's change to the makefile documentation is wrong with this patch in place, since a timeout invalidates the makefile rather than just "leaving the internal buffers in an inconsistent state".

Backing out this patch would probably be better than leaving it in place, if a better fix can't be found.

[*] The regression was detected testing against a test POP server designed to exhibit defective behaviors that have been observed over the years by the maintainers of the test server. I can't point to specific existing servers that exhibit the broken behavior, but it did happen in the past and no doubt someone will write a buggy POP server that has the same broken behavior some time in the future as well.

pitrou · 2011-10-20T17:38:34Z

> One could then continue processing, sending new transactions to the POP
> server and getting responses.

That's optimistic. You don't know how much data has been lost in readline(). Sure, against your test server, it happens to work :) But against other kinds of failing servers, your protocol session would end up confused.
So the only robust option is the following:

> tearing down the
> connection and rebuilding it, and restarting the POP processing from
> the beginning

bitdancer · 2011-10-20T18:01:14Z

I don't think it is optimistic. The poplib transaction pattern is: send a command, get a response. If the response is not properly terminated, throw it away. Send a new command, get a response. There's no ambiguity there. In addition, this is a common tcp client-server model, so I think it applies more widely than just poplib.

Please note that the timeout is *not* because the socket data transmission has timed out and data was lost in transit. There are no partially filled readline buffers in this scenario. The timeout is because the client is waiting for a *line* of data that the server never sends. Again, this is likely to be a common failure mode in tcp client/server applications, and to my mind is exactly what the timeout parameter to the constructor is most useful for.

gpshead · 2011-10-20T20:33:31Z

If the server failed to close a transaction the protocol stream is over
unless you are relying on hope and luck. Poplib has a nasty set of server
implementation bugs to work around here.

Readline as defined today no longer suits its needs but I still strongly
believe the behavior of shutting reading down after a timeout is a good one.

One thing that would solve your common case: don't shut down reading on a
readline timeout if zero data was received into the internal line buffer.
Readline could indicate this by modifying the timeout exception being raised
to indicate if it can recover or not. The flag should only be set if it was
unrecoverable.
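
One way to picture this suggestion is a reader that only locks itself when a timeout interrupts a partially received line, and marks the exception accordingly. This is a sketch under assumptions: `GuardedLineReader` and the `unrecoverable` attribute are hypothetical names, and the byte-at-a-time loop is only there to keep the example short.

```python
import socket


class GuardedLineReader:
    """Hypothetical reader that stays usable after an 'empty' timeout."""

    def __init__(self, sock):
        self._sock = sock
        self._broken = False

    def readline(self):
        if self._broken:
            raise OSError("cannot read from timed out object")
        chunks = []
        while True:
            try:
                byte = self._sock.recv(1)
            except socket.timeout as exc:
                # Flag the timeout only if part of a line was already read.
                exc.unrecoverable = bool(chunks)
                self._broken = exc.unrecoverable
                raise
            if not byte:                # peer closed the connection
                return b"".join(chunks)
            chunks.append(byte)
            if byte == b"\n":
                return b"".join(chunks)
```

A client can then inspect `exc.unrecoverable` on the caught timeout: if it is false, no line data was lost and sending a new command is reasonable; if it is true, the stream position is unknown and the connection should be torn down.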

bitdancer · 2011-10-20T20:40:43Z

Your suggestion sounds good to me.

I still think that it is a common failure mode in a client server transaction for the server to fail to send a (complete) line that the client is expecting, and vice versa, requiring a timeout, but not necessarily a "restart from scratch". Often the client/server protocol has a useful checkpoint to restart from short of start from scratch. In the case of many protocols, that would be "client issues a new command".

dabeaz · 2015-10-25T21:28:20Z

This bug is still present in Python 3.5, but it occurs if you attempt to do a readline() on a socket that's in non-blocking mode. In that case, you probably DO want to retry at a later time (unlike the timeout case).

vadmium · 2015-10-27T11:52:45Z

IMO it might make sense in some cases to disallow subsequent reading from a buffered socket reader (or probably any BufferedReader) that raises an exception (not just a timeout). But the restriction seems unnecessary for unbuffered raw readers, and it also seems to go against the “consenting adults” philosophy for David Murray’s test server case.

David Beazley: For non-blocking sockets, the documentation currently says “the socket must be in blocking mode”. I’m not sure why that restriction is necessary; maybe it could be lifted at least for raw unbuffered streams.

Maybe you could make an argument for caching the partial data in the BufferedReader if a timeout (or no more non-blocking data, or other exception) occurs. The biggest problem is that it could mean storing more than the normal buffer size. I would think this would be a new feature (for 3.6+) rather than a behavioural bug fix though.

And see also bpo-13322 about inconsistencies with buffered reading non-blocking streams in general.

foresto · 2024-12-13T01:06:32Z

> The normal behavior for code calling readline() on a socket with a timeout is likely going to be to close it. Anything else does not make much sense.

It makes perfect sense in certain scenarios.

Consider a protocol where inversion of control is used, and the server sends single-line messages at unpredictable times. The client waits for and processes them, but might need to do something else every so often before waiting for more, and will eventually want to send a DONE command to return to client-initiated operations.

Setting a socket timeout before each readline() would handle this nicely. When a timeout occurs, the client could do whatever it needs to and then start a new readline-with-timeout (or tell the server to switch modes, or otherwise resume the session). Of course, this would require any buffered data to be preserved on timeout.

IMAP IDLE is such a protocol, and the "cannot read from timed out object" exception bit me today while implementing it in imaplib. It looks like avoiding that exception will require me to implement custom readline(), read(), and buffering just for this protocol, adding complexity to the code, almost all of which will be duplicating what the standard library can already do. It sure would be nice to just keep using makefile() and relying on the stdlib's tried-and-true buffered read logic.
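
To make the use case concrete, here is a minimal sketch of such a client loop written against a plain socket, with the partial-line buffer kept in the generator so a timeout never discards data. The generator name `iter_lines_or_ticks` is hypothetical, and this is not how imaplib is implemented; it only illustrates the pattern being asked for.

```python
import socket


def iter_lines_or_ticks(sock, interval):
    """Yield each complete line; yield None whenever a recv() times out."""
    sock.settimeout(interval)
    pending = bytearray()               # survives timeouts between yields
    while True:
        end = pending.find(b"\n")
        if end >= 0:
            line = bytes(pending[:end + 1])
            del pending[:end + 1]
            yield line
            continue
        try:
            data = sock.recv(4096)
        except socket.timeout:
            yield None                  # caller can do other work, then resume
            continue
        if not data:                    # server closed the connection
            return
        pending += data
```

A caller would loop over the generator, handle each line as a server message, and treat `None` as a cue to do housekeeping (or to send DONE and break out of the loop).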

> Is there any reason why the input buffer can't be preserved across calls? You've already got a file-like wrapper around the socket. Just keep the unconsumed buffer in that instance.

This would certainly make the stdlib more useful.

vadmium · 2024-12-13T08:17:07Z

Antoine’s Subversion r88622 commit is Git commit 68e5c04. I disagree with this change, because there are valid reasons to read a socket after a timeout or other exception, even if data was discarded in BufferedReader.

However I don’t think rolling it back would be enough for this IMAP case, which probably requires different timeout settings and handling for waiting for the start of a server response versus being in the middle of reading a response.

foresto · 2024-12-13T19:43:30Z

My point is simply that there are protocols in which a read timeout is a normal, common event. Not an error, let alone a fatal error. Preserving any buffered data for delivery on the next read attempt, and allowing the program to carry on, would be ideal.

Relatedly, I have encountered situations where it would have been helpful to be able to see whether there is buffered data.

beazley mannequin added the stdlib (Python modules in the Lib dir) and type-bug (An unexpected behavior, bug, or error) labels Nov 14, 2009

gpshead self-assigned this Nov 14, 2009

pitrou unassigned gpshead Dec 31, 2010

pitrou closed this as completed Feb 25, 2011

bitdancer reopened this Oct 20, 2011

ezio-melotti transferred this issue from another repository Apr 10, 2022

axoroll7 mentioned this issue Aug 30, 2023

idle_check can see incomplete lines mjs/imapclient#519

Open
