Design of the SMB Direct Kernel Module for Samba
Richard Sharpe, 21-Aug-2013 and updated on 25-Aug-2013
Ralph Boehme, August 2016
INTRODUCTION
When Windows uses SMB Direct, or SMB over RDMA, it does so in a way that is
not easy to integrate into Samba as it exists today.
Samba uses a forking model of handling connections from Windows clients. The
master smbd listens for new connections and forks a new smbd before handling
any SMB PDUs. It is the new smbd process that handles all PDUs on the new
connection.
Please see the documents [MS-SMB2].pdf and [MS-SMBD].pdf for more details about
SMB over additional channels and the SMB Direct protocol. However, in brief,
what happens is the following:
1. The client establishes an SMB connection over TCP to a server. For Samba,
this involves the forking of a new process.
2. The client NEGOTIATES the protocol and then does a SESSION SETUP. If this
is successful, the client now has a Session ID it will use in establishing
additional channels, including any via SMB Direct (RDMA).
3. The client uses a TREE CONNECT request to connect to a share.
4. The client issues an FSCTL_QUERY_NETWORK_INTERFACE_INFO IOCTL to determine
what interfaces are available.
5. If there are any RDMA interfaces in common between the client and the
server, and the server supports MULTI_CHANNEL, the client initiates an
RDMA connection to the server.
6. The client then sends a NEGOTIATE requesting SMB 3.0 or later as well as
support for MULTI_CHANNEL.
7. If that succeeds, the client then sends a SESSION_SETUP and specifies
SMB2_SESSION_FLAG_BINDING along with the Session ID obtained on the first
connection.
At this point, we now have an RDMA channel between the client and server.
SMB Direct actually involves a small protocol but the details are not relevant
here and can be read about in [MS-SMBD].pdf.
The RDMA connections have to be terminated in a single process (as only one
process can listen on port 5445) but then they would have to be transferred to
the process that should control that connection.
We were told by Mellanox folks that there is no real support for transferring
all the RDMA state between processes at this stage.
Another approach would be to have a single process responsible for all RDMA
handling and have the smbd processes communicate with that process about new
incoming connections and reads and writes. While this would work, and could
eliminate multiple copies of the data by using shared memory, it would involve a
context switch for most if not all RDMA transfers.
A LINUX KERNEL SMB DIRECT MODULE
An alternative model is to develop a Linux kernel driver to handle RDMA
connections.
While this approach locks us into Linux for the moment, it seems to be a
useful alternative.
It would function somewhat like this:
The smbdirect device driver would be a character device driver and would be
loaded after any drivers for the RDMA cards and ipoib.
When Samba starts, it would attempt to open the device driver, and if
successful, would call an ioctl() to set the basic SMB Direct parameters, as
explained below. This would allow the smbdirect driver to start listening for
RDMA packets on the SMB Direct TCP port.
The main Samba smbd process can now poll the device driver fd. The driver will
signal readability when a new SMB-D connection has been established in the
kernel. An fd for the new connection is created by the driver and can be fetched
from userland via another ioctl().
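The poll-then-fetch sequence described above might look roughly like the
following C sketch. The request code SMBD_ACCEPT, its magic number, and the
helper names smbd_wait_readable() and smbd_accept() are assumptions made for
illustration only; the real driver header would define the actual ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical request code; not taken from any existing driver header. */
#define SMBD_ACCEPT _IOR('s', 2, int)

/*
 * Wait until the listener fd becomes readable.
 * Returns 1 if readable, 0 on timeout, -1 on error or hangup.
 */
int smbd_wait_readable(int listen_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
    int ret;

    assert(listen_fd >= 0);

    /* Restart the wait if a signal interrupts it. */
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret == -1 && errno == EINTR);

    if (ret <= 0)
        return ret;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/*
 * Ask the driver for the fd of a pending SMB-D connection.
 * Returns the new fd, or -1 with errno set by ioctl().
 */
int smbd_accept(int listen_fd)
{
    int conn_fd = -1;

    if (ioctl(listen_fd, SMBD_ACCEPT, &conn_fd) == -1)
        return -1;
    return conn_fd;
}
```

In the main smbd, a call to smbd_wait_readable() that returns 1 would be
followed by smbd_accept(), and the returned fd would then be handed to the
usual fork path.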
The smbd process will create a new child to handle the session. The SMB protocol
engine in the child smbd process can use the fd just like a normal socket fd
acquired via accept(), using the readv()/writev() I/O syscalls. However, it
can't use socket functions such as sendmsg()/recvmsg(), because the fd refers to
a file, not a socket.
In case the RDMA connection is an additional channel for an existing SMB
session, the smbd process will pass the connection fd to another smbd process at
the NEGPROT stage based on the client guid. This is no different from any other
multi-channel connection passing in Samba. In fact, the whole protocol engine
MUST work exactly the same for any connection fd, be it a traditional socket fd
for a TCP/IP connection or a file fd open on the smbdirect character device.
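Passing an open fd between processes is conventionally done with SCM_RIGHTS
ancillary data over a Unix domain socket, and that works for a character device
fd as well as for a socket fd. The sketch below shows the generic mechanism; the
helper names send_fd() and recv_fd() are illustrative and are not Samba's own
functions. Note that sendmsg()/recvmsg() are applied to the Unix domain socket
that carries the fd, not to the smbdirect fd itself.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one fd. */
union fd_ctrl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send `fd` over the Unix domain socket `sock`. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';   /* at least one byte of real data must be sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from `sock`. Returns the new fd or -1. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiving smbd gets its own descriptor referring to the same open file, so
the sender can close its copy once the transfer has succeeded.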
STEPS TO BE TAKEN BY SAMBA
1. The main smbd opens the smbdirect character device file and uses a set of
ioctl() calls to tell the kernel driver on which interfaces to listen for SMB-D.
2. With another ioctl() it registers a set of parameters relevant to the SMB-D
protocol itself:
- ReceiveCreditMax
- SendCreditMax
- MaxSendSize
- MaxFragmentSize
- MaxReceiveSize
- KeepaliveInterval
3. The character device fd is monitored by tevent using tevent_add_fd for READ
events. When the fd reports readability, a new SMB-D connection fd is fetched
with another ioctl(). Further connection processing is exactly the same as
for a TCP/IP connection with a socket fd returned from accept().
4. When an FSCTL_QUERY_NETWORK_INTERFACE_INFO ioctl request is received, smbd
will respond with the IP address(es) of all the RDMA interfaces as specified in
[MS-SMB2].pdf.
5. Processing of SMB PDUs from the connection fd uses the unmodified protocol
engine.
6. Processing of RDMA READs or RDMA WRITEs requires additional protocol
machinery. After forking, the session smbd process will perform the following
actions:
a) Call an IOCTL to retrieve the shared memory parameters (typically the
size of the shared memory region required).
b) Call mmap on the device to mmap the shared memory region that allows us to
avoid copying large amounts of data between userspace and the kernel.
7. When a LOGOFF is received on the smbdirect connection, a response will be
sent. Once that has completed, the device will be closed, which will cause
the RDMA connection to be dropped.
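Step 6, together with the close in step 7, could be sketched in C as follows.
The request code SMBD_GET_MEM_PARAMS, the struct smbd_mem_params layout, and all
helper names are assumptions for illustration; the design above only says that
an IOCTL returns the shared memory parameters and that the region is then
mapped with mmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical reply layout and request code for step 6a. */
struct smbd_mem_params {
    uint64_t region_size;   /* bytes of shared memory the driver expects */
};
#define SMBD_GET_MEM_PARAMS _IOR('s', 4, struct smbd_mem_params)

/*
 * Step 6b: map `size` bytes shared with the driver.
 * Returns the mapping, or NULL on failure.
 */
void *smbd_map_region(int fd, size_t size)
{
    void *p;

    assert(size > 0);
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    return p;
}

/* Steps 6a and 6b together: query the size, then map the region. */
void *smbd_setup_shared_memory(int fd, size_t *size_out)
{
    struct smbd_mem_params mp = { 0 };
    void *p;

    if (ioctl(fd, SMBD_GET_MEM_PARAMS, &mp) == -1) {
        perror("ioctl(SMBD_GET_MEM_PARAMS)");
        return NULL;
    }
    if (mp.region_size == 0)
        return NULL;

    p = smbd_map_region(fd, (size_t)mp.region_size);
    if (p != NULL)
        *size_out = (size_t)mp.region_size;
    return p;
}

/* Step 7: drop the mapping and close the device after the LOGOFF reply. */
int smbd_teardown(int fd, void *region, size_t size)
{
    int ret = 0;

    if (region != NULL && munmap(region, size) == -1)
        ret = -1;
    if (close(fd) == -1)
        ret = -1;
    return ret;
}
```

MAP_SHARED is what lets the kernel and the smbd see the same pages, which is
the point of avoiding a copy between userspace and the kernel.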
REQUIREMENTS OF THE DRIVER
When the driver loads, it will begin listening for incoming RDMA connections on
the interfaces specified by the main smbd process, or possibly all interfaces.
When PDUs are available for the smbd to process, or when RDMA READ or WRITE
operations have completed (and possibly when PDU SENDs have completed), the
device will be put into a POLLIN state so that the smbd can process the new
events.
IOCTLS
The following ioctls are needed:
1. SMBD_LISTEN
This ioctl sets the parameters that SMB Direct operates under:
- ReceiveCreditMax
- SendCreditMax
- MaxSendSize
- MaxFragmentSize
- MaxReceiveSize
- KeepaliveInterval
2. SMBD_ACCEPT
This fetches a new connection fd.
3. RDMA_READ_WRITE
This ioctl takes a set of shared memory areas as well as remote memory
descriptors and schedules RDMA READs or RDMA WRITEs as needed.
Each memory region is registered prior to the RDMA operation and unregistered
after the RDMA operation.
4. SET_SMBD_DISCONNECT
Not sure if this is needed.
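One possible shape for a userspace-visible header covering the first three
ioctls is sketched below. Everything here is an assumption made for
illustration: the magic number, the request numbers, the field names and
widths, and the single-transfer form of the RDMA request (the design calls for
a set of areas and descriptors). The remote descriptor follows the
offset/token/length layout of the buffer descriptor in [MS-SMBD].pdf.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

#define SMBD_IOC_MAGIC 's'   /* hypothetical magic number */

/* Parameters passed with SMBD_LISTEN, mirroring the list above. */
struct smbd_params {
    uint32_t receive_credit_max;
    uint32_t send_credit_max;
    uint32_t max_send_size;
    uint32_t max_fragment_size;
    uint32_t max_receive_size;
    uint32_t keepalive_interval;
};

/* Remote memory descriptor supplied by the client. */
struct smbd_buffer_desc {
    uint64_t offset;
    uint32_t token;
    uint32_t length;
};

/* One RDMA transfer between the shared region and client memory. */
struct smbd_rdma_rw {
    uint64_t local_offset;   /* offset into the mmap'ed shared region */
    uint32_t local_length;
    uint32_t is_write;       /* 0 = RDMA READ, 1 = RDMA WRITE */
    struct smbd_buffer_desc remote;
};

/* Guard against accidental padding changing the ABI. */
static_assert(sizeof(struct smbd_buffer_desc) == 16, "descriptor is 16 bytes");

#define SMBD_LISTEN     _IOW(SMBD_IOC_MAGIC, 1, struct smbd_params)
#define SMBD_ACCEPT     _IOR(SMBD_IOC_MAGIC, 2, int)
#define RDMA_READ_WRITE _IOW(SMBD_IOC_MAGIC, 3, struct smbd_rdma_rw)

/* Thin wrapper: hand the parameters to the driver. Returns 0 or -1. */
int smbd_listen(int fd, const struct smbd_params *params)
{
    return ioctl(fd, SMBD_LISTEN, params);
}
```

Using fixed-width fields keeps the layout identical for 32-bit and 64-bit
callers, which avoids needing a compat ioctl path in the driver.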