-
Notifications
You must be signed in to change notification settings - Fork 881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix 'failed to get network during CreateEndpoint' #2554
Conversation
@xinfengliu thanks for solving this hairy issue ! would appreciate it if you could add a test case here or in https://github.com/moby/moby/blob/master/integration/network/network_test.go to recreate it |
@arkodg |
you could mimic docker/for-linux#888 (comment) via an integration test |
Thanks for the test case (docker/for-linux#888 (comment)). Sorry I didn't notice that. Your use case (restarting a container which is on an attachable overlay network and the container was started via 'docker restart' on an attachable overlay network will cause swarmkit I'll look into this to see if there's any solutions. |
I have fixed the error caused by your use case. Please see my updated notes in moby/moby#41011 about scenario 2. I will try adding tests in moby integration later time. |
network.go
Outdated
@@ -1181,7 +1181,8 @@ func (n *network) createEndpoint(name string, options ...EndpointOption) (Endpoi | |||
ep.locator = n.getController().clusterHostID() | |||
ep.network, err = ep.getNetworkFromStore() | |||
if err != nil { | |||
return nil, fmt.Errorf("failed to get network during CreateEndpoint: %v", err) | |||
logrus.Errorf("failed to get network during CreateEndpoint: %v", err) | |||
return nil, ErrNoSuchNetwork(n.Name()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(left as a comment on the Moby PR as well)
I'd prefer ep.getNetworkFromStore()
to return this error-type.
While it may be true currently that this is always due to the network not being found, it may not be in future. I don't think we should make the assumption here that this is a ErrNoSuchNetwork
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While looking for possible other errors it could return, I noticed there some unused error-handling; opened #2556
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed return err in store.go
instead to make it a typed error.
@@ -1181,7 +1181,8 @@ func (n *network) createEndpoint(name string, options ...EndpointOption) (Endpoi | |||
ep.locator = n.getController().clusterHostID() | |||
ep.network, err = ep.getNetworkFromStore() | |||
if err != nil { | |||
return nil, fmt.Errorf("failed to get network during CreateEndpoint: %v", err) | |||
logrus.Errorf("failed to get network during CreateEndpoint: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(also left as a comment on the moby PR);
I'm a bit on the fence if we should log this here, given that we're already "handling" the error by returning it. If We want the additional information, perhaps errors.Wrap(err, "failed to get network during createEndpoint")
(or something along those lines?).
Or perhaps more consistent with the other errors, and actually put this in the exported (CreateEndpoint
)
Open to suggestions though
Fix 'failed to get network during CreateEndpoint' during container starting. Change the error type to `libnetwork.ErrNoSuchNetwork`, so `Start()` in `daemon/cluster/executor/container/controller.go` will recreate the network. Signed-off-by: Xinfeng Liu <[email protected]>
86e8640
to
1df7f7e
Compare
@arkodg I have added test case for docker/for-linux#888 (comment) in moby/moby#41011 integration test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks !
I have reservations about #2554 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
full diff: moby/libnetwork@2e24aed...9e99af2 - moby/libnetwork#2548 Add docker interfaces to firewalld docker zone - fixes docker/for-linux#957 DNS Not Resolving under Network [CentOS8] - fixes moby/libnetwork#2496 Port Forwarding does not work on RHEL 8 with Firewalld running with FirewallBackend=nftables - store.getNetworksFromStore() remove unused error return - moby/libnetwork#2554 Fix 'failed to get network during CreateEndpoint' - fixes/addresses docker/for-linux#888 failed to get network during CreateEndpoint - moby/libnetwork#2558 [master] bridge: disable IPv6 router advertisements - moby/libnetwork#2563 log error instead if disabling IPv6 router advertisement failed - fixes docker/for-linux#1033 Shouldn't be fatal: Unable to disable IPv6 router advertisement: open /proc/sys/net/ipv6/conf/docker0/accept_ra: read-only file system Signed-off-by: Sebastiaan van Stijn <[email protected]>
full diff: moby/libnetwork@2e24aed...9e99af2 - moby/libnetwork#2548 Add docker interfaces to firewalld docker zone - fixes docker/for-linux#957 DNS Not Resolving under Network [CentOS8] - fixes moby/libnetwork#2496 Port Forwarding does not work on RHEL 8 with Firewalld running with FirewallBackend=nftables - store.getNetworksFromStore() remove unused error return - moby/libnetwork#2554 Fix 'failed to get network during CreateEndpoint' - fixes/addresses docker/for-linux#888 failed to get network during CreateEndpoint - moby/libnetwork#2558 [master] bridge: disable IPv6 router advertisements - moby/libnetwork#2563 log error instead if disabling IPv6 router advertisement failed - fixes docker/for-linux#1033 Shouldn't be fatal: Unable to disable IPv6 router advertisement: open /proc/sys/net/ipv6/conf/docker0/accept_ra: read-only file system Signed-off-by: Sebastiaan van Stijn <[email protected]> Upstream-commit: 219e7e7ddcf5f0314578d2a517fc0832f03622c1 Component: engine
Fixes docker/for-linux#888
Fix 'failed to get network during CreateEndpoint' during container starting.
Change the error type to
libnetwork.ErrNoSuchNetwork
, soStart()
in docker enginedaemon/cluster/executor/container/controller.go
will recreate the network.Signed-off-by: Xinfeng Liu [email protected]