stricter error checking for select() interface #40
TL;DR: The `select()` polling interface mistakenly assumes that a file descriptor being present in the exceptions FD set indicates an error, and throws an uncaught exception which terminates the program. This patch checks whether there is indeed an error on the socket associated with that file descriptor, and only throws the exception if there is one, along with more descriptive information about what the error is.

In rakshasa/rtorrent#51, users of Solaris derivatives report that after a while rtorrent simply terminates. The discussion there suggested the problem was signal related, so I provided pull request #127. User @lotheac confirmed a few months later that it did indeed fix one of his problems. However, after extended use he ran into a similar problem with the error message "Listener port received an error event," which I tracked down to libtorrent.
On Solaris derivatives, libtorrent doesn't use an OS-specific I/O multiplexing API such as `/dev/poll` or event ports, instead falling back to the simple `select()` API.

The source of the problem is what I believe to be a common misinterpretation of the `select()` function, whose prototype, as shown on the Linux man pages, is `int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);`.

However, some manual pages, such as Solaris', show the fourth argument as `errorfds` instead of `exceptfds`. This naming discrepancy is common across API documentation, and it gives the mistaken impression that a file descriptor present in that set has had an I/O error. This is not necessarily the case. As outlined in `select_tut(2)`, presence in that set usually indicates out-of-band data or a certain condition in pseudoterminals in packet mode. Skimming through the source, I didn't find any instance in which libtorrent sends out-of-band data, and it doesn't use pseudoterminals as far as I'm aware. Considering that this problem only shows itself on Solaris derivatives, I figure it's a Solaris-specific situation in which the platform is perhaps more relaxed about what it considers to be an "exceptional condition."
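To make the out-of-band case concrete, here is a minimal sketch (not part of the patch) that sends one byte of TCP out-of-band data over a loopback connection, then asks `select()` and `SO_ERROR` about the receiving socket. The `ExceptProbe` struct and the `probe_oob_exception()` function are hypothetical names introduced only for this illustration, and it assumes a working loopback interface.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Result of the experiment: was the fd reported in the exception set, and
// what did SO_ERROR say about it? so_error stays -1 if the setup failed.
struct ExceptProbe {
  bool in_except_set;
  int so_error;
};

// Hypothetical helper: sends one OOB byte over a loopback TCP connection and
// reports how select() and SO_ERROR describe the receiving socket.
ExceptProbe probe_oob_exception() {
  ExceptProbe result = {false, -1};

  int listener = socket(AF_INET, SOCK_STREAM, 0);
  int client = socket(AF_INET, SOCK_STREAM, 0);
  int server = -1;

  sockaddr_in addr = {};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;  // let the kernel pick a free port
  socklen_t addr_len = sizeof(addr);

  if (listener >= 0 && client >= 0 &&
      bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0 &&
      listen(listener, 1) == 0 &&
      getsockname(listener, reinterpret_cast<sockaddr*>(&addr), &addr_len) == 0 &&
      connect(client, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0 &&
      (server = accept(listener, nullptr, nullptr)) >= 0 &&
      send(client, "!", 1, MSG_OOB) == 1) {
    // Watch only the exception set for the receiving end.
    fd_set except_set;
    FD_ZERO(&except_set);
    FD_SET(server, &except_set);
    timeval timeout = {2, 0};  // do not block forever if nothing arrives

    if (select(server + 1, nullptr, nullptr, &except_set, &timeout) > 0)
      result.in_except_set = FD_ISSET(server, &except_set) != 0;

    // Ask the socket itself whether an error is actually pending.
    int err = 0;
    socklen_t err_len = sizeof(err);
    if (getsockopt(server, SOL_SOCKET, SO_ERROR, &err, &err_len) == 0)
      result.so_error = err;
  }

  if (server >= 0) close(server);
  if (client >= 0) close(client);
  if (listener >= 0) close(listener);
  return result;
}
```

On Linux I would expect `in_except_set` to be true while `so_error` is 0, which is exactly the "exceptional but not an error" situation described above. Other platforms may report different conditions in that set.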
The Solaris man page for `select()` describes `errorfds` as the set of file descriptors to check for pending error conditions. So I added a check to retrieve the error code associated with the socket behind that file descriptor. If there is indeed an error, the code follows through with throwing the exception, along with descriptive text about what the error is. If not, it continues normally.
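The idea behind the check can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `pending_socket_error()` and `handle_exceptional_fd()` are hypothetical names, and `std::runtime_error` stands in for whatever exception type libtorrent actually throws.

```cpp
#include <sys/socket.h>

#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the pending error on a socket (0 if none).
// If the query itself fails, the errno set by getsockopt() is returned.
int pending_socket_error(int fd) {
  int err = 0;
  socklen_t len = sizeof(err);
  if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0)
    return errno;
  return err;
}

// Hypothetical handler for a descriptor that select() reported in its
// exception set. std::runtime_error is an assumption made for this sketch.
void handle_exceptional_fd(int fd) {
  const int err = pending_socket_error(fd);
  if (err == 0)
    return;  // exceptional condition without an error: carry on normally

  // A real error is pending, so report what it is.
  throw std::runtime_error(
      std::string("Listener port received an error event: ") +
      std::strerror(err));
}
```

Note that reading `SO_ERROR` also clears the pending error on the socket, so the value should be captured once and reused rather than queried repeatedly.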
User @lotheac applied the patch along with rakshasa/rtorrent#51 and tested it for a few days before reporting back that everything appeared to be working fine.