-
Notifications
You must be signed in to change notification settings - Fork 617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
updates to containerd executor, including basic CNI based network support #2299
Conversation
manager/network/model.go
Outdated
Allocator() (networkallocator.NetworkAllocator, error) | ||
SetDefaults(spec *api.NetworkSpec) error | ||
ValidateNetworkSpec(spec *api.NetworkSpec) error | ||
PredefinedNetworks() []networkallocator.PredefinedNetworkData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the definition of PredefinedNetworkData
belongs in this package rather than networkallocator
.
We can't move it on its own because that would cause a circular dependency.
I wonder if it would also make sense to move the NetworkAllocator
interface definition here. Then this package would contain the interface definitions, and networkallocator
would have the implementations.
I'm not completely sure either way, just throwing this out as an idea. Something doesn't feel quite right about this package depending on the allocator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point/idea. I'll try it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually PredefinedNetworkData
isn't used anywhere in the network allocator package after the changes in this PR so moving it makes absolute sense and doesn't cause a circular dependency.
I'm still considering possibly moving networkallocator.NetworkAllocator
to network.Allocator
, I think I like it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaronlehmann done. I also refactored the rest of manager/allocator/networkallocator
and a few bits of manager/allocator/network.go
out into various more suitable places, so the packages which depend on the allocator are now (IMHO at least) much cleaner.
manager/network/model.go
Outdated
SetDefaults(spec *api.NetworkSpec) error | ||
ValidateNetworkSpec(spec *api.NetworkSpec) error | ||
PredefinedNetworks() []networkallocator.PredefinedNetworkData | ||
SupportIngressNetwork() bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SupportsIngressNetwork
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way to do this is only define an IngressNetwork
method on implementations that support the ingress network, and type-assert to check whether a particular implementation has this method or not. This keeps implementation-specific details from creeping into the main interface definition. It can be more cumbersome to use, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try this too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaronlehmann I made the name change as an interim measure. but WRT the IngressNetwork
method I think it might make sense to move the newIngressNetwork
function into the cnm version of that and use the type check WDYT?
I'm also eyeing up newPredefinedNetwork
in a similar light, in essence making PredefinedNetworks
return an api.Network
instead of the PredefinedNetworkData
and so embedding the function into CNM where it is used.
manager/network/model.go
Outdated
|
||
// Model is an abstraction over the Network Model to be used. | ||
type Model interface { | ||
Allocator() (networkallocator.NetworkAllocator, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call this NewAllocator
.
agent/exec/containerd/controller.go
Outdated
continue | ||
} | ||
for _, ip := range ns.cni.IPs { | ||
log.G(ctx).Debugf("Iface# %d, num interfaces %d\n", ip.Interface, len(ns.cni.Interfaces)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no \n
.
manager/allocator/allocator_test.go
Outdated
netWatch, cancel := state.Watch(s.WatchQueue(), api.EventUpdateNetwork{}, api.EventDeleteNetwork{}) | ||
defer cancel() | ||
taskWatch, cancel := state.Watch(s.WatchQueue(), api.EventUpdateTask{}, api.EventDeleteTask{}) | ||
defer cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this works as intended, but it might clarify the code to avoid reusing the name cancel
.
Codecov Report
@@ Coverage Diff @@
## master #2299 +/- ##
==========================================
- Coverage 60.38% 60.22% -0.17%
==========================================
Files 128 130 +2
Lines 26056 26122 +66
==========================================
- Hits 15735 15731 -4
- Misses 8934 9006 +72
+ Partials 1387 1385 -2 |
395c9a4
to
d01cd54
Compare
Addressed all comments except #2299 (review) and #2299 (review) which GH has helpfully decided are old due to my touching those lines for other reasons. Leaving this comment as a reminder to self. |
f6eaabb
to
7f80d62
Compare
api/network.go
Outdated
package api | ||
|
||
// IsIngressNetwork check if the network is an ingress network | ||
func IsIngressNetwork(nw *Network) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you put these helper functions in an api/network
package? We've tried to reserve api
for autogenerated code, though there are some exceptions now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think github.com/docker/swarmkit/manager/network
would be a fine place to put these. In particular, IsIngressNetworkNeeded
is basically business logic, so the manager/network
package might make more sense.
Mostly LGTM. Left a few comments. |
api/types.proto
Outdated
string interface = 2; | ||
|
||
// Address is the IP address in the form ipv4:XXX or ipv6:YYY | ||
string address = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if it would make sense to break apart the protocol and address into separate fields.
If this format is borrowed from CNI and is well-defined there, this is probably fine, but in general I like to avoid microformats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I did indeed make this up myself.
api.EndPoint.VirtuallIP.addr
is a string. Looking at the usage it appears to be defined as one which is compatible with net.ParseCIDR
/net.ParseIP
. This seems like a reasonable enough precedent to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to this I don't really like what I've done wrt error reporting at https://github.com/ijc/swarmkit/blob/9c619b516e62f103df460dfc8ffd9652634733ac/agent/exec/containerd/controller.go#L66, essentially it just uses an "error:" prefix to distinguish.
I think maybe it would be better to have a separate error field, probably as a oneof
.
manager/network/cnm/cnm.go
Outdated
pg plugingetter.PluginGetter | ||
} | ||
|
||
// New produces a fresh CNI Network Model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CNM
manager/network/cnm/cnm.go
Outdated
if cnmallocator.IsBuiltInDriver(driver.Name) { | ||
return nil | ||
} | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the empty default
case.
I've now addressed the outstanding review (sorry for the delay), I intend to fold those changes back into the right place in the course of the rebase which I need to do also. I'm also going to pull in an update containerd v1.0.0-alpha1 (or later if there's another release in the meantime) at the same time. |
240e50a
to
8a09c03
Compare
Rebased. Only minor hiccup was the need to plumb a |
The PR has been rebased to containerd v1.0.0-alpha1. Signed-off-by: Ian Campbell <[email protected]>
agent/exec/containerd/adapter.go
Outdated
@@ -193,6 +214,29 @@ func (c *containerAdapter) isPrepared() bool { | |||
return c.container != nil && c.task != nil | |||
} | |||
|
|||
func cniConfig(n *api.Network) (*libcni.NetworkConfig, error) { | |||
if n.DriverState.Name != "cni" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could DriverState
be nil here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cni.ValidateNetworkSpec
would have rejected it if it were, and cniNetworkAllocator.Allocate
ensures (redundantly) that it is not, but I think I shouldn't rely on the behaviour of such distant components so I will add a check here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, cni.ValidateNetworkSpec
is looking at DriverConfig
not DriverState
, so cniNetworkAllocator.Allocate
isn't redundant, I should still add the check here though.
manager/network/cni/cni.go
Outdated
} | ||
|
||
if spec.Annotations.Name == "" { | ||
spec.Annotations.Name = cni.Network.Name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed this bit. It's sort of problematic - controlapi shouldn't modify the spec, which should only be user data. I can't think of any horrible implications from this particular statement, but we've worked pretty hard to keep a clear separation between user-controlled data and fields that are modifiable by swarmkit, and I wouldn't want to introduce an exception without a very good reason. Among other things, we're thinking of adding a mechanism for signing specs before they're submitted to swarmkit, in the future.
Could we remove SetDefaults
and instead dynamically resolve an emptyspec.Annotations.Name
field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could have far reaching implications, for example places like validateAnnotations
would need a special case which would depend on the type of spec the annotation was embedded in and we would need to arrange that uses of the network spec always got their name via a wrapper function instead of going to it directly.
There's also the overhead of having to parse the JSON each time (although I have an open plan to try caching that).
I was hoping to avoid requiring the user to specify both spec.Annotations.Name
and the cni name identically, but perhaps I should flip this around and instead allow the cni name to be empty and fill it in from the (required) annotation name. That would "just" require me to figure out how to round trip an arbitrary JSON object through a parser to add a field without knowing in advance all of the fields which are present. I'm not actually sure if the go json library can do that, but I'll investigate.
Or another alternative would be to add a Name
field to the api.Network
object which contains the canonical name as determined by looking at the spec and the cni. AIUI that object is, unlike the spec, internal and fair game for mutating.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping to avoid requiring the user to specify both spec.Annotations.Name and the cni name identically
I think it's not so bad to require the user to specify spec.Annotations.Name
. I think we require non-empty values for some object types.
I'm not sure I understand why spec.Annotations.Name
would have to match the CNI name, but if they do, enforcing this constraint in controlapi validation sounds okay to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand why spec.Annotations.Name would have to match the CNI name
Just to avoid confusion since the CNI name might get used where the admin might see it e.g. the bridge name (I think) and having that match swarmd's idea of the network's name seems useful.
Maybe it's not worth enforcing it though, if an admin wants to confuse themself that's up to them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect client software (i.e. Docker) to use the same value in each field in practice. I don't think it's necessary to enforce it at the swarmkit level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okey dokey, it certainly simplifies my life here ;-)
@@ -465,12 +465,31 @@ enum TaskState { | |||
// Remove this message when the states are deemed perfect. | |||
} | |||
|
|||
// State of Networks attached to a container | |||
message ContainerNetworkStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that this would be used in flat list, where the same values of network
and interface
may be repeated to provide multiple addresses.
If one of the networks is showing an error, there would be a single entry for it with an error instead of an IP.
This seems different from the way networks are represented internally in the executor. There's a set of networks at the top level, and each network can have zero or more IPs underneath it. Would it make sense to replicate that hierarchy here? My first thought is it would be a little less confusing than the flattened list here with the oneof
choosing between an IP or an error condition, but I'm not really familiar with CNI, so it's very possible I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have literally no idea why I chose to flatten this in the way I did, I probably started down one path before I had fully groked the CNI data structures and never reevaluated the base decision.
I'll do as you suggest and unflatten it making it a (non-oneof
) string error
and repeated string ip
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this is a bit more complicated than I had remembered. The result of a single CNI add call (corresponding to a single api.Tasks.Network
) is a Result struct which contains (among other things) an array of Interfaces
(name, sandbox, max) and an array of IPs (4 vs 6, reference to the Interfaces array, network and gateway). There's also routes and dns in there. There's a (brief) example under the "Result" heading in the spec.
There isn't a requirement that len(Interfaces) == 1
, i.e. a single CNI add call might produce multiple interfaces. It's also possible that len(Interfaces) < len(IPs)
, i.e. that a single interface might be given multiple IP addresses. There's also more information here (gateways, routes etc) than I have exposed so far.
I'm not really sure how best to approach this. I'm leaning towards a more hierarchical representation of the raw CNI (in particular dereferencing the interface array index) and leaving out the other fields I mentioned for now (they can be added later easily enough). So something of the form:
message ContainerNetworkStatus {
// Network is the Network ID
string network = 1;
// Error, if any.
string error = 2;
message NetworkInterface {
string name = ??;
// Each IP is in dotted decimal ("192.0.2.1") or IPv6 ("2001:db8::68") form.
repeated string ip = ?? [(gogoproto.customname) = "IPs"];
};
repeated NetworkInterface interfaces = ??;
}
While I was here I did spot that my construction of an interface name (eth%d
) to pass to the CNI add call was unnecessary (and probably wrong). The spec requires the plugin to use that name if it is given, but if not can choose its own name(s), which seems like the correct approach to use (and is reflected in the state above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposal makes sense to me. Again, I'm not really familiar with CNI, but this is probably what I'd come up with if asked to abstractly describe network interfaces and their IPs.
Review has been addressed. |
api/types.proto
Outdated
string error = 4; | ||
} | ||
// Each IP is in dotted decimal ("192.0.2.1") or IPv6 ("2001:db8::68") form. | ||
repeated string ip = 2 [(gogoproto.customname) = "IPs"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The custom name isn't consistent with the protobuf name - only one is plural.
I think I'd just go with IP
.
Signed-off-by: Ian Campbell <[email protected]>
Currently the implementation is the existing CNM provider. Since plugingetter is in effect CNM specific encapsulate it into the CNM implementation of network.Model instead of passing it around. This means moving validate driver (which was only used for networking purposes) down into the interface too. This also allows moves the caller of allocator.IsBuiltInNetworkDriver to the CNM model, which can therefore call the cnmallocator directly. One downside of this approach is that while the plugingetter was almost always optional the network model is not, meaning one needs to be added to numerous test cases. The uses of plugingetter are now confined to the CNM model and allocator code rather than being exposed to various parts of the manager and node, which seems cleaner. Signed-off-by: Ian Campbell <[email protected]>
and move the CNM implementation into the CNM model. Signed-off-by: Ian Campbell <[email protected]>
Add a single simplistic test case for the allocator, when in CNI we expect no allocations other than the population of the NetworkAttachment fields. The allocator is essentially a nop, except for populating n.DriverState. Select this model when in containerd mode. Signed-off-by: Ian Campbell <[email protected]>
Ingress networking is (AFAIK) not covered by CNI and would be provider/plugin specific (or involve chained plugins), so there is no sane configuration we can inject by default. Signed-off-by: Ian Campbell <[email protected]>
Signed-off-by: Ian Campbell <[email protected]>
Signed-off-by: Ian Campbell <[email protected]>
This synchronises a few other things with containerd's vendor.conf, image-spec is containerd executor specific. But github.com/opencontainers/runc and golang.org/x/sys are used elsewhere. For reference: opencontainers/runc@b6b70e5...429a538 golang/sys@5eaf0df...fb4cac3 No changes were needed to the non-vendored code. Signed-off-by: Ian Campbell <[email protected]>
Also add a SetDefaults which avoids updating the spec in the validation function in the CNI case. With this all uses of libnetwork are in manager/allocator/cnmallocator and manager/network/cnm/cnm.go while all uses of CNI are in manager/network/cni/cni.go and agent/exec/containerd/adapter.go. I'd still like to abstract the containerd/adaptor.go case further. Signed-off-by: Ian Campbell <[email protected]>
Although the current code is not wrong it is clearer to use a specific name. Signed-off-by: Ian Campbell <[email protected]>
Going to put some non-model stuff in here, not worth a second file. Signed-off-by: Ian Campbell <[email protected]>
So manager/allocator/networkallocator.NetworkAllocator becomes manager/network.Allocator. Signed-off-by: Ian Campbell <[email protected]>
Signed-off-by: Ian Campbell <[email protected]>
This makes manager/allocator/networkallocator obsolete. Remove the thin wrappers around these functions from allocator Signed-off-by: Ian Campbell <[email protected]>
Signed-off-by: Ian Campbell <[email protected]>
Only a small changes to error handling required. Also bump some dependent vendorings. golang.org/x/sys update may not be strictly necessary. Signed-off-by: Ian Campbell <[email protected]>
No meaningful difference to either, but good to be on the real releases! Signed-off-by: Ian Campbell <[email protected]>
cniNetworkAllocator.Allocate should have ensured there is always a DriverState, but in case not (e.g. we end up unexpectedly running with a different network allocator) check here too. Signed-off-by: Ian Campbell <[email protected]>
Structure as Network->[]Interfaces->[]IPs, which more closely reflects reality and is also closer to the CNI model. While reworking this I observed the CNI_IFNAME environment variable (AKA libcni.RuntimeConf.IfName) is optional, however with at least the weave-net plugin (and looking at the CNI repo anything based on the skeleton there) insists on it. Relatedly I also see that the Result lacks an Interfaces array and that the IP addresses reference Interface==-1. Handle this in the code by creating a fake Interface if none is returned (using the network if name) and using it if the index is out of bounds. Signed-off-by: Ian Campbell <[email protected]>
The spec should be readonly to swarmkit. Instead push this requirement onto the user. This is already checked by `cni.ValidateNetworkSpec` but relax this into just a warning, so admins have the flexibility to have a different CNI name to swarm name. However log the mismatch and pass a context to `network.Model.ValidateNetworkSpec()` to facilitate that. Signed-off-by: Ian Campbell <[email protected]>
Force pushed to resolve conflict with 82c36fe. |
Ah, I remember why I hadn't done alpha2 (or now alpha3), it's the logrus change. I'll revisit that aspect once #2334 has landed. I would expect that merging this PR would make #2334 a bit easier since the delta from alpha1 (here) to alpha2,3,etc should be small since this code uses the client library. I'll do a fake/local rebase of this onto #2334 to check that theory. /cc @dmcgowan. |
@ijc What's current status and plan? |
I'm afraid I've moved onto other things and have no current plans to pursue this further. |
It targets an old pre v1.0 alpha version of containerd and is not currently maintained. closes moby#2219, closes moby#2258, closes moby#2259 and closes moby#2299. The vendoring of libtrust did not become obsolete here but vndr noted that it was unused so I dropped it while I was there. Signed-off-by: Ian Campbell <[email protected]>
It targets an old pre v1.0 alpha version of containerd and is not currently maintained. closes moby#2219, closes moby#2258, closes moby#2259 and closes moby#2299. The vendoring of libtrust did not become obsolete here but vndr noted that it was unused so I dropped it while I was there. Signed-off-by: Ian Campbell <[email protected]>
Apart from actually adding basic CNI support to the containerd executor the largest non-vendoring change here is to refactor some of the network operations into an interface (initially proposed in #2230 (comment)) in order to provide both CNM and CNI integration.
One nice IMHO result of that refactoring is that all imports of
libnetwork
are now inmanager/allocator/cnmallocator
andmanager/network/cnm
(previously some were also incontrolapi
, which seemed a bit like a layering violation) while all uses of CNI are inmanager/network/cni
andagent/exec/containerd/adapter.go
. I'd still like to abstract theadaptor.go
uses further to remove those.I think there are likely still various bits of
manager/allocator/cnmallocator
which might more properly belong inmanager/network/cnm
(I'm thinking the use of the driver registry for example) but I haven't pulled on that thread more than the amount minimally required here.The revendoring in
Update containerd to xxxxx
updates a few things which are used outside of the containerd plugin, specificallyjackfan.us.kg/opencontainers/runc
andgolang.org/x/sys
. Both look OK to me but I'm not sure if there is some interaction with e.g. vendoring into moby/moby to worry about (especially runc).The result of this that basic IP level connectivity is made to containers, I've not sorted out thinkgs like service discovery (e.g. DNS) yet.
In terms of testing I've been using the swarmd project in LinuxKit to fire up a cluster of VMs running Weave Net based CNI networking. A network is created with
swarmctl network create --cni '{ "name": "weave", "type": "weave-net" }'
and thenservice create ... --network weave
can be used in the normal way.