
Tempesta hooks and proper use of close(), drop(), failover, and retry() #107

Closed
keshonok opened this issue Jun 11, 2015 · 5 comments

@keshonok (Contributor)

There are several cases when the failover() function is required:

  1. There's no server at the other end of the connection;
  2. A server closed the connection (for instance, on a timeout).

(1) is the only case that leads to the kernel calling the sk->sk_error_report() socket hook. The kernel then calls the sk->sk_state_change() hook in state TCP_CLOSE, thus skipping the state TCP_ESTABLISHED. In the current implementation of Synchronous Sockets the state TCP_CLOSE is never reached in the regular course of action. This guarantees that when sk->sk_state_change() is called in the TCP_CLOSE state, it is because a connect attempt failed due to the absence of a server at the other end. That is currently when the Tempesta hook connection_close() is called. The name is misleading: in fact this hook leads to calling tfw_sock_srv_connect_retry(), which initiates reconnect attempts. This hook is not implemented for client sockets.

(2) is the case where we need to reconnect. Currently connections to servers are considered permanent, so when a connection closes, Tempesta needs to reconnect. That's when the connection_drop() hook is called, which leads to calling tfw_sock_srv_connect_failover() for server sockets. This function releases all Tempesta resources linked to the socket and initiates a new connect attempt to the server. The old socket is closed.

Socket close in Synchronous Sockets is implemented in two variants. One is the internal ss_do_close(), which is only called in SoftIRQ context from within Sync Sockets code. The other is the external ss_close(), which can be called from both SoftIRQ and process contexts; it calls ss_do_close() to do the actual close. In that process ss_do_close() calls the connection_drop() hook, which in the case of server sockets initiates another connect attempt.

The issue is in how these hooks and functions are used in Tempesta. As used now, they may lead to reconnect attempts where none are needed, or even to an endless loop of reconnect attempts under certain conditions.

All in all, it appears that the internal close function ss_do_close() is not the right place to call the connection_drop() hook from. The hook needs to be invoked differently, in the right place.

Aside from the above two cases, there's another one that makes the issue a bit more complicated: the Tempesta start/stop process, which is executed in process context. The stop process is especially troubling.

When Tempesta is stopped, it needs to close all connections. With the current code architecture that leads to reconnect attempts. There's a need to differentiate a socket close that should lead to another connect attempt from a socket close that is intentional and final.

Also, when Tempesta is stopped, the release of Tempesta resources for a socket and the socket close itself are performed on a live socket that may still have data exchanged on it. So the socket close must be done correctly, with proper locks in place.

@keshonok keshonok added the bug label Jun 11, 2015
@keshonok keshonok changed the title Tempesta hooks and proper user of close(), drop(), failover, and retry() Tempesta hooks and proper use of close(), drop(), failover, and retry() Jun 11, 2015
keshonok added a commit that referenced this issue Jun 11, 2015
This commit addresses issues described in #107.
- Run Tempesta hooks under sk->sk_callback_lock. The same lock now
  protects the use and manipulation of sk->sk_user_data. That way
  Tempesta can be safely separated from a kernel socket when the
  socket is closed.
- Separate the release of Tempesta resources linked to a socket from
  the close of the socket itself. That makes proper use of the failover,
  retry, and drop actions possible. It also enables a proper reaction
  to a close, by separating an intentional close by Tempesta from
  a close that requires a reconnect.
@keshonok (Contributor, Author)

Implemented in df20f0a and 65bbdd9

@krizhanovsky krizhanovsky added this to the 0.4.0 Web Server milestone Jun 16, 2015
@krizhanovsky (Contributor)

Linked with commit & comment 33e7d3b

@keshonok (Contributor, Author)

Below are some implementation notes.

sk->sk_user_data is used to access Tempesta protocol information, which includes the Tempesta hooks (functions that perform certain actions on socket events) and various data related to the current connection on the socket that is internal to Tempesta and opaque to Sync Sockets.

sk->sk_user_data is set up early, at socket creation, for listening sockets and backend server sockets, and it is never changed after that for the life of those sockets. Listening sockets are set up with the sk->sk_state_change hook only, to limit their functionality to the very specific role of accepting new connections. Those are the sockets that are explicitly created by Tempesta. For newly accepted connections, sk->sk_user_data is first inherited from the listening socket, and then set up for regular functionality, with a full set of Tempesta hooks and internal Tempesta data.

BTW, that change of sk->sk_user_data on accepted sockets could present a race condition, as the Sync Sockets interface, and therefore the Tempesta hooks, can be triggered by the kernel on different CPUs in parallel. The use and any change of sk->sk_user_data must be protected by a lock.

One of the major tasks in the implementation of this issue is separating Tempesta from the Linux kernel (Sync Sockets) on a specific socket when necessary. The kernel must continue to function correctly, without any hiccup, after the kernel socket is separated from Tempesta. After recent architecture changes Tempesta has specific functions to link a kernel socket with Tempesta's internal data for that socket, and to unlink it: tfw_connection_link_sk() and tfw_connection_unlink_sk(). These functions point sk->sk_user_data at a specific structure, or set it to NULL.

So, sk->sk_user_data is used in Sync Sockets to access and call the Tempesta hooks, but it can change during the life of a socket. That means access to it must be serialized and protected by a lock. A socket already has a lock, sk->sk_callback_lock, that is meant to protect the socket hooks used in the implementation of Sync Sockets. As those hooks are never changed during the life of a socket in Tempesta, there's no need to protect the hooks themselves with the lock. However, sk->sk_user_data is used in the functions attached to those socket hooks, and access to it does need protection, so it's a natural choice to use sk->sk_callback_lock for that. It would have been used for socket callbacks anyway.

Once use of and access to sk->sk_user_data is serialized, it's only a matter of checking sk->sk_user_data for NULL in Sync Sockets to see whether Tempesta is now isolated from Sync Sockets on a specific socket. Sync Sockets continues to operate on the socket as usual, and any data on the socket previously linked to Tempesta is consumed by Sync Sockets until the socket is closed soon after that. At the same time, Tempesta can safely release all resources linked with the socket.

@keshonok (Contributor, Author)

Implemented in a series of commits included in pull request #110.

@keshonok (Contributor, Author)

#110 merged to master.
