Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: add support for socket options #12702

Merged
merged 2 commits into from
Mar 9, 2021
Merged

*: add support for socket options #12702

merged 2 commits into from
Mar 9, 2021

Conversation

hexfusion
Copy link
Contributor

@hexfusion hexfusion commented Feb 19, 2021

This PR adds support for setting socket options SO_REUSEADDR and SO_REUSEPORT to etcd listeners via ListenConfig. These options give the flexibility to cluster admins who wish to more explicit control of these features. What we have found is during etcd process restart there can be a considerable time waiting for the port to release as it is held open by TIME_WAIT which on many systems is 60s.

--socket-reuse-port: enables SO_REUSEPORT [1]  allows rebind of a port already in use. 
--socket-reuse-address: enables SO_REUSEADDR [1]  allows binding to an address in `TIME_WAIT` state. 
$ ps aux | grep etcd | wc -l
0

$  ss --numeric -o state time-wait 
tcp    0       0                 10.0.138.12:34642            10.0.138.12:2380   timer:(timewait,50sec,0)                                                       
tcp    0       0                 10.0.138.12:34470            10.0.138.12:2380   timer:(timewait,35sec,0)                                                       
tcp    0       0                 10.0.138.12:34086            10.0.138.12:2380   timer:(timewait,10sec,0)                                                       
tcp    0       0                 10.0.138.12:34232            10.0.138.12:2380   timer:(timewait,20sec,0)                                                       
tcp    0       0                 10.0.138.12:34694            10.0.138.12:2380   timer:(timewait,55sec,0)                                                       
tcp    0       0                 10.0.138.12:34332            10.0.138.12:2380   timer:(timewait,25sec,0)                                                       
tcp    0       0                 10.0.138.12:33966            10.0.138.12:2380   timer:(timewait,845ms,0)                                                       
tcp    0       0                 10.0.138.12:34040            10.0.138.12:2380   timer:(timewait,5.845ms,0)                                                     
tcp    0       0                 10.0.138.12:34184            10.0.138.12:2380   timer:(timewait,15sec,0)                                                       
tcp    0       0                 10.0.138.12:34570            10.0.138.12:2380   timer:(timewait,45sec,0)                                                       
tcp    0       0                 10.0.138.12:34512            10.0.138.12:2380   timer:(timewait,40sec,0)                                                       
tcp    0       0                 10.0.138.12:34398            10.0.138.12:2380   timer:(timewait,30sec,0)  

So we can wait for many seconds even after etcd process is long dead for client and peer ports to become available.

2021-02-19 13:57:25.117257 C | etcdmain: listen tcp 10.0.138.12:2380: bind: address already in use

A similar approach has been taken with the addition of these options in the kube-apiserver[2]

[1] https://man7.org/linux/man-pages/man7/socket.7.html
[2] kubernetes/kubernetes#93861

@gyuho
Copy link
Contributor

gyuho commented Feb 19, 2021

nice find!

@codecov-io
Copy link

Codecov Report

Merging #12702 (49078c6) into master (080453d) will increase coverage by 8.88%.
The diff coverage is 56.73%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #12702      +/-   ##
==========================================
+ Coverage   50.61%   59.50%   +8.88%     
==========================================
  Files         420      424       +4     
  Lines       33122    32541     -581     
==========================================
+ Hits        16765    19362    +2597     
+ Misses      14389    11348    -3041     
+ Partials     1968     1831     -137     
Flag Coverage Δ
all 59.50% <56.73%> (+8.88%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/transport/sockopt_unix.go 0.00% <0.00%> (ø)
server/etcdserver/config.go 74.24% <ø> (+34.84%) ⬆️
server/embed/etcd.go 75.23% <20.00%> (+3.05%) ⬆️
pkg/transport/sockopt.go 30.00% <30.00%> (ø)
pkg/transport/listener.go 9.91% <47.05%> (+3.30%) ⬆️
pkg/transport/timeout_listener.go 47.61% <50.00%> (-2.39%) ⬇️
.../rafttest/interaction_env_handler_process_ready.go 71.15% <50.00%> (-6.12%) ⬇️
raft/log.go 75.29% <55.55%> (-1.11%) ⬇️
raft/raft.go 89.68% <89.47%> (-0.07%) ⬇️
raft/rafttest/interaction_env.go 100.00% <100.00%> (ø)
... and 178 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1302e1e...49078c6. Read the comment docs.

@gyuho
Copy link
Contributor

gyuho commented Feb 23, 2021

@hexfusion Any updates? Would like to get this in release 3.5

@pires
Copy link

pires commented Feb 25, 2021

cc @jdef

@hexfusion
Copy link
Contributor Author

@hexfusion Any updates? Would like to get this in release 3.5

@gyuho I will have an update early next week latest

Signed-off-by: Sam Batschelet <[email protected]>
@hexfusion
Copy link
Contributor Author

@gyuho PTAL added fixup commit will rebase once the review is done.

@gyuho gyuho added this to the etcd-v3.5 milestone Mar 9, 2021
@gyuho
Copy link
Contributor

gyuho commented Mar 9, 2021

@hexfusion Thanks! Can we also update our 3.5 CHANGELOG?

@Quentin-M
Copy link
Contributor

This is amazing, thanks for doing this!

@vivekpatani
Copy link
Contributor

Hi, the integration test cluster seems to not have access to this option, was that deliberate or is it something that we should consider adding? @serathius @ahrtr @gyuho

@ahrtr
Copy link
Member

ahrtr commented Aug 19, 2022

Hi, the integration test cluster seems to not have access to this option, was that deliberate or is it something that we should consider adding? @serathius @ahrtr @gyuho

Please try to add common config items so that they can be reused by both integration and e2e test cases.

@vivekpatani
Copy link
Contributor

Sounds good, I'll work on this, since when I tried working on tests for my other patch (#14355), and I was restarting the cluster, I ran into this issue, would be useful to have. Thanks, will take this up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants