Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers: modem: ublox-sara-r4: fix socket descriptor leak #28825

Closed
wants to merge 1 commit into from
Closed

drivers: modem: ublox-sara-r4: fix socket descriptor leak #28825

wants to merge 1 commit into from

Conversation

aport
Copy link
Contributor

@aport aport commented Sep 30, 2020

A TCP socket is only closed with +USOCL if the socket is in the
connected state. It is valid to call close() on a socket which is
not connected, but in this case the underlying descriptor in the modem
will not be released. This leads to a leak in sockets and eventual
exhaustion.

Issuing the +USOCL command unconditionally lets us avoid corner cases
and allows for the module to check validity.

Addresses #26819

Signed-off-by: Adam Porter [email protected]

Copy link
Contributor

@mike-scott mike-scott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aport Thank you for submitting this.

NULL, 0U, buf,
&mdata.sem_response, MDM_CMD_TIMEOUT);
if (ret < 0) {
LOG_ERR("%s ret:%d", log_strdup(buf), ret);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should make this a LOG_WARN now? I feel like users may see this error more often if we do an unconditional +USOCL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with that change. One error condition being prevented here was issuing a +USOCL for a socket which was already closed by the remote peer. This wouldn't be unusual to encounter in a ported POSIX application so a lower log level sounds good to me.

We could also change the code to have something like an is_allocated or is_valid flag so we can catch that case but still allow sending +USOCL on an unconnected socket.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the log should not be displayed when we already now that the socket has been closed, and that we made the call only to avoid a leak in sockets in the modem internals.

A TCP socket is only closed with +USOCL if the socket is in the
connected state. It is valid to call `close()` on a socket which is
not connected, but in this case the underlying descriptor in the modem
will not be released. This leads to a leak in sockets and eventual
exhaustion.

Issuing the +USOCL command unconditionally lets us avoid corner cases
and allows for the module to check validity.

Signed-off-by: Adam Porter <[email protected]>
@XavierChapron
Copy link
Contributor

@aport I'm not sure to understand, you mean that socket are leaking in the internal of the ublox sara modem (not its zephyr driver)?

@aport
Copy link
Contributor Author

aport commented Oct 6, 2020

@XavierChapron The issue is the Zephyr driver preventing the transmission of +USOCL command when a TCP socket is not connected. In POSIX you can call close() on a socket descriptor at any time, even before calling connect().

In my latest round of testing, I can not reproduce the original bug where the SARA does not release a socket when +USOCO fails. I'll need to do some more testing. However as mentioned above, there's still a bug in the Zephyr driver that should be fixed.

I will change the logic in this PR so that +USOCL is only sent to the modem if the socket has not been explicitly closed already. This will require a new flag separate from is_connected.

To summarize the issue:

  1. Create a new TCP socket (driver sends USOCR=6)
  2. Note that is_connected is false.
  3. Close the socket.
  4. Note that +USOCL is not sent because is_connected is false.

If you repeat this seven times the modem will exhaust its internal socket pool and you will be unable to create new sockets.

@XavierChapron
Copy link
Contributor

@aport Got it! Thanks for the explanation, I previously missed the part about calling close() without any connect() before it.

@aport
Copy link
Contributor Author

aport commented Oct 6, 2020

I'm confused at the moment... I may close this PR until I have more time to sit down and look deeper into the issue.

Reading the code, +USOCR is only called from bind() or connect(), and not from socket()? If this is true, then calling socket() and then close() will not cause any leak. But if I'm following this correctly, this means that +USOCR is never called for a UDP client socket? Is this functionality broken or did I overlook something?

@github-actions
Copy link

github-actions bot commented Dec 6, 2020

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Dec 6, 2020
@github-actions github-actions bot closed this Dec 20, 2020
@aport aport deleted the sara-fix-socket-close branch December 21, 2021 23:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants