Kubectl can apply config files in bulk #1905

Closed
smarterclayton opened this issue Oct 20, 2014 · 32 comments
Labels
area/app-lifecycle area/client-libraries area/kubectl area/usability priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@smarterclayton
Contributor

Extracted from #1325

  • Postponed submit [<filename>...] - This command exists to satisfy the second use case above. Reconcile and submit any changes from a given set of config(s), from files or from stdin. If you had an entire directory tree of files that had configs that represented your cluster state, you could use this command to submit them all to either create or update your cluster.
  • Postponed diff [<filename>...] - A dry-run version of submit.

Also relevant #1007, #113, #1702, #1704, OpenShift config and OpenShift apply command, and #987 which was the first attempt at this. Also see issues with config-deployment label.

I'd like to be able to represent a set of potentially unknown config objects inside a JSON file that can be applied en masse - we currently have almost all the necessary support in the encode/decode paths to take an API-versioned list, attempt to extract unknown objects as JSON, and if necessary post them to an endpoint directly.

The client-flexible version of this is dependent on #1355 in order to be able to find the endpoints for resources that are not pod, controller, or service.
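
To make the idea of carrying unknown objects through the decode path concrete, here is a minimal sketch in Python (kubectl itself is not written in Python; this is illustration only). The set of known kinds, the "items" and "id" field names, and the BuildConfig example kind are assumptions made for the example, not the actual wire format.

```python
import json

# Kinds this hypothetical client can decode into typed objects (assumed set).
KNOWN_KINDS = {"Pod", "Service", "ReplicationController"}


def split_list(payload):
    """Split a versioned list into (known, unknown) items.

    Unknown items are kept as raw JSON text so they could be posted to an
    endpoint without the client understanding their schema.
    """
    doc = json.loads(payload)
    known, unknown = [], []
    for item in doc.get("items", []):
        if item.get("kind") in KNOWN_KINDS:
            known.append(item)
        else:
            unknown.append(json.dumps(item, sort_keys=True))
    return known, unknown


example = json.dumps({
    "kind": "List",
    "apiVersion": "v1beta1",
    "items": [
        {"kind": "Pod", "id": "web"},
        {"kind": "BuildConfig", "id": "web-build"},  # not in KNOWN_KINDS
    ],
})
known, unknown = split_list(example)
```

Keeping the unknown items as opaque text is what lets the client stay ignorant of their schema; finding the right endpoint for them is the part that depends on #1355.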

@smarterclayton
Contributor Author

@ghodss @mfojtik @VojtechVitek @bgrant0607 wanted to pull this out into a follow-on issue from #1325 - we have a somewhat working implementation of this in OpenShift and we'd like to continue the discussion here now that we've sorted through most of the painful issues.

@bgrant0607
Member

see also #113 , #1702, #1704, and probably other things labeled https://github.com/GoogleCloudPlatform/kubernetes/labels/area/config-deployment

@brendandburns
Contributor

I would really like to see this get built, and possibly

kubectl submit <directory>

as well.

@bgrant0607
Member

In #113, Clayton wrote: Namespace is inferred on creation - should it also be inferred on deletion? We don't have auto-naming yet, so we don't have unnamed resources.

@bgrant0607
Member

Let's start simple and iterate. Since the API plugin support isn't ready yet, let's start without it. Let's also start without client-side plugin configuration generators (e.g., #1695) and Dockerfile deployment (#1294).

Let's use #1702 for basic diff and update, #1353 for rolling update, #1704 for more sophisticated deployment workflow features, and this one for basic up/down.

We can start with one file and/or stdin, and then extend to directories, file selection globs, object-kind selection (e.g., just services), and other operations (e.g., update, restart).

Note that while I think we should make kubectl capable of performing more sophisticated deployment with a single command, we also need to ensure that the individual functionality we integrate can be used a la carte in a composable fashion driven by a build process (#1694), workflows/scripts including non-Kubernetes-specific deployment steps (#1704), and higher-level systems (e.g., OpenShift), either as libraries or by invoking the kubectl tool. Monolithic all-in-one configuration/deployment tools have caused us big problems internally.

Regarding inferring namespace, yes, it would be inferred on deletion, also. I could also imagine a command-line flag to auto-populate namespace. #1698 addresses the broader issue of name/label/selector scoping to facilitate configuration template reuse across multiple deployments.

@smarterclayton
Contributor Author

A proposed set of verbs that clients might perform on a list of resources (henceforth referred to as a config, which might be a directory of JSON files, a single JSON object which has a simple array of resources, etc.):

  • Create all of the objects described in the config, update them if they exist, report any failures, and potentially retry failures, in a way that fits with the described api conventions. Above this is called submit (another word that might work is apply)
  • Show the difference between the config and the actual status/state on the server
    • called --dry-run above
  • Display the current status of all of the objects in a config
    • either describe each object, or get each object
    • List any missing objects?
    • May be close enough to --dry-run to omit
  • Delete a config in bulk, retrying any failures - should this be hooked to delete or withdraw or something different? Or submit --rm?
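
The first verb above (create, update if it exists, report and retry failures) could be sketched roughly as follows. This is a Python illustration against an in-memory stand-in for the API server; FakeServer, Conflict, and submit are hypothetical names, and the retry count is an arbitrary assumption.

```python
class Conflict(Exception):
    """Raised by the fake server when an object already exists."""


class FakeServer:
    """In-memory stand-in for the API server (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def create(self, obj):
        key = (obj["kind"], obj["id"])
        if key in self.objects:
            raise Conflict(key)
        self.objects[key] = obj

    def update(self, obj):
        self.objects[(obj["kind"], obj["id"])] = obj


def submit(server, config, attempts=3):
    """Create each object, fall back to update, and collect failures."""
    failures = []
    for obj in config:
        last_error = None
        for _ in range(attempts):
            try:
                try:
                    server.create(obj)
                except Conflict:
                    server.update(obj)
                last_error = None
                break
            except Exception as err:  # remember the error and retry
                last_error = err
        if last_error is not None:
            failures.append((obj, last_error))
    return failures
```

Returning the failures instead of raising on the first one lets the caller decide whether to report, retry later, or abort.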

@smarterclayton
Contributor Author

@mfojtik do you want to take a second stab at this (#987 and port the openshift config object) and try to strip them down to the bare minimum?

@bgrant0607
Member

#1743 has a good description of the OpenShift objects.

@mfojtik
Contributor

mfojtik commented Oct 21, 2014

@smarterclayton yes, I will send initial PR tomorrow morning.

@smarterclayton
Contributor Author

Some other types of input: we should accept a stream of objects via STDIN and convert them, so folks can pipe the output of find and cat into the command. This can be a secondary input, but it should use a streaming line decoder.

@bgrant0607
Member

@smarterclayton IIRC, #987 was putting all the individual objects into a Config object. Do you plan to pursue a similar approach this time? I can understand why you'd want that for OpenShift.

However, I'd also like to retain the ability to run the object reconciliation directly from the client. Otherwise, 2 stages of reconciliation are needed, client Config to server-side Config and server-side Config to primitive objects.

Basic functionality we'd need:

  1. Collect selected objects and put them all into a simple JSON list
  2. Wrap a list of primitive/plugin objects with a Config object
  3. Unpack a list of primitive/plugin objects from a Config object
  4. Reconcile a list of objects with the API

In the case of client-side reconciliation, steps 2 and 3 would be skipped and step 4 would run in the client. In the case of server-side reconciliation, steps 1 and 2 would run in the client and step 4 would reconcile the Config object, then the server would perform step 3 and step 4 on the objects in the payload of the Config object.
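
A rough Python sketch of the four steps and the two ways of composing them, under the assumption that a Config is just an envelope with an "items" list. All function names, the "my-app" name, and the dict-based store are hypothetical.

```python
def collect(*sources):
    """Step 1: flatten the selected objects into one simple list."""
    return [obj for source in sources for obj in source]


def wrap(objects, name):
    """Step 2: wrap a list of objects in a Config envelope."""
    return {"kind": "Config", "id": name, "items": list(objects)}


def unpack(config):
    """Step 3: recover the list of objects from a Config envelope."""
    if config.get("kind") != "Config":
        raise ValueError("not a Config object")
    return list(config["items"])


def reconcile(objects, store):
    """Step 4: make the store match the given objects (create or replace)."""
    for obj in objects:
        store[(obj["kind"], obj["id"])] = obj
    return store


def client_side(sources, store):
    # Steps 2 and 3 are skipped; step 4 runs in the client.
    return reconcile(collect(*sources), store)


def server_side(sources, store):
    config = wrap(collect(*sources), "my-app")  # steps 1 and 2, in the client
    reconcile([config], store)                  # step 4 on the Config itself
    return reconcile(unpack(config), store)     # steps 3 and 4, on the server
```

In this sketch both paths end with the same primitive objects in the store; the server-side path additionally keeps the Config object, which is the second stage of reconciliation mentioned above.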

WDYT?

@bgrant0607
Member

Re. pluggability/composability leading to more support issues, as raised in #1695: We should definitely support a well tested default configuration pipeline that works for most simple scenarios out of the box. However, our experience is that not providing composable building blocks leads to large numbers of support issues, too, since (a) users submit lots of feature requests for things they can't easily add themselves, (b) the monolithic tool grows until it is unmaintainable, (c) users work around the lack of extensibility in unmaintainable ways, such as by using private/internal interfaces or accidental behavior, and (d) users use the monolithic tool in use cases for which it wasn't intended in order to take advantage of some narrow functionality it provides.

@bgrant0607
Member

For operations like describe and delete:

The primitive operation we need is to generate a list of API URLs from the list of full object configurations. Then we can apply the appropriate verb or macro operation to them, map-style. It should also be possible to dump this list to stdout, for scriptability.
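
As an illustration of that primitive, a small Python sketch that maps objects to URLs and then maps a verb over them. The kind-to-resource table, the "/api/v1beta1" prefix, and the URL layout are assumptions for the example only; a real client would need to discover them.

```python
# Hypothetical kind -> REST resource mapping (assumed for this sketch).
RESOURCES = {
    "Pod": "pods",
    "Service": "services",
    "ReplicationController": "replicationControllers",
}


def object_urls(objects, prefix="/api/v1beta1"):
    """Turn full object configurations into a list of API URLs."""
    return [f"{prefix}/{RESOURCES[obj['kind']]}/{obj['id']}" for obj in objects]


def apply_verb(verb, urls):
    """Map a verb over the URLs; here we only describe the requests."""
    return [f"{verb} {url}" for url in urls]


urls = object_urls([
    {"kind": "Service", "id": "frontend"},
    {"kind": "Pod", "id": "web"},
])
for line in apply_verb("DELETE", urls):
    print(line)  # dumping the list to stdout keeps it scriptable
```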

@bgrant0607
Member

In terms of basic verbs, internally we use up and down (like Fig, I think) and update. It should be possible to repeatedly invoke kubectl up and down until they report success: so, they require idempotence and need to surface error conditions. We may also want to (optionally) be able to continue despite errors.

The reason to distinguish between up and update: If there are preexisting objects, there are at least 4 choices: abort, clobber (delete and recreate), disruptive update, and rolling update. The choice could also be made via flags. More complex update configuration probably should be done in JSON.
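
A minimal Python sketch of an idempotent up with a flag-selected policy for preexisting objects. The policy names mirror the choices listed above; rolling update is left out, and treating an identical existing object as "unchanged" is an assumption of this sketch, not something the thread specifies.

```python
def up(store, obj, on_existing="abort"):
    """Bring one object up; safe to call repeatedly until it succeeds."""
    key = (obj["kind"], obj["id"])
    if key not in store:
        store[key] = obj
        return "created"
    if store[key] == obj:
        return "unchanged"  # re-running with the same config is a no-op
    if on_existing == "abort":
        raise RuntimeError(f"{key} already exists with different content")
    if on_existing == "clobber":
        del store[key]      # delete and recreate
        store[key] = obj
        return "recreated"
    if on_existing == "update":
        store[key] = obj    # in-place, disruptive update
        return "updated"
    raise ValueError(f"unknown policy: {on_existing}")
```

Surfacing the conflict as an error in the default case is what lets a caller loop on up safely without silently clobbering anything.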

@smarterclayton
Contributor Author

Our current Config is client side only - so no server reconciliation proposed. Config resource was simplest possible API object that could combine a bunch of unknown objects into one json object - it does not need to be given special treatment here for sure. For now an array of opaque objects is just as good (for reconciliation code)

@smarterclayton
Contributor Author

Down seems to overlap with Stop to naive users in unpleasant ways. Are you guys content with that internally? Is down common in script setups, or is it more situational? If a config includes a persistent resource (like proposed durable data or volume) is there a risk of misinterpreting down?

@bgrant0607
Member

I'm not wedded to the up/down terminology. I was mainly explaining why we distinguish up from update. We could use submit or apply with qualifiers.

I think down is mainly used situationally.

There is a risk of misinterpreting it. Usually people use it fairly selectively, by object type.

Deletion can be performed as a side effect during creation, in the case of create -force, which deletes prior objects if they exist (not a good idea to use in a loop until success).

@bgrant0607
Member

Further thoughts after working on #1980 and talking to @lavalamp.

We could also create a simple transformation framework that would make it easy to register new transformation passes, vaguely similar to the approach of the generic scheduler predicate registration:
https://github.com/GoogleCloudPlatform/kubernetes/blob/f0f4092fc54b19dc3828e97307c11b7f1b7e8d86/plugin/pkg/scheduler/factory/factory.go#L69

kubectl would do something like the following:

  1. Collect selected objects and put them all into a simple JSON list
  2. Perform plugin macro-style string transformation passes on the list
  3. Recursively expand objects
  4. Perform plugin domain-specific transformation passes on the list, such as name/label/selector munging and image resolution
  5. Stable sort of objects in dependency order (e.g., services before replication controllers and pods, pod templates before replication controllers that reference them)
  6. Validate objects (early validation)
  7. Perform selected command
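
A sketch of what pass registration plus a stable dependency sort might look like, in Python for brevity. The registry, the example pass, the kind ordering table, and the validation rule are all hypothetical; steps 2 and 3 (macro expansion and recursive expansion) are omitted.

```python
PASSES = []  # registered transformation passes, run in registration order


def register_pass(fn):
    """Register a transformation pass (usable as a decorator)."""
    PASSES.append(fn)
    return fn


@register_pass
def add_app_label(objects):
    # Example domain-specific pass: make sure every object has an app label.
    for obj in objects:
        obj.setdefault("labels", {}).setdefault("app", "demo")
    return objects


# Assumed dependency order: lower numbers are created first.
KIND_ORDER = {"Service": 0, "PodTemplate": 1, "ReplicationController": 2, "Pod": 3}


def sort_by_dependency(objects):
    # sorted() is stable, so objects of the same kind keep their input order.
    return sorted(objects, key=lambda o: KIND_ORDER.get(o["kind"], len(KIND_ORDER)))


def validate(objects):
    for obj in objects:
        if not obj.get("id"):
            raise ValueError(f"object of kind {obj.get('kind')} has no id")
    return objects


def run_pipeline(objects, command):
    for transformation in PASSES:
        objects = transformation(objects)
    objects = validate(sort_by_dependency(objects))
    return [command(obj) for obj in objects]
```

Because validation runs before the command, a bad object stops the run before anything has been sent to the server.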

@smarterclayton
Contributor Author

Some next steps now that #1958 and #2000 are in

  • Define a generic mechanism for mapping input sources to a stream of runtime.Objects (including unknown types) (call this an Object stream)
    • Includes directories, JSON files with arrays on disk, JSON files that are a standard List type, and JSON object streams (from HTTP / file etc)
    • Integrate this into createall
    • Ensure we have a good generic error reporting pattern
  • Determine whether we can simply move that function into create (just by relaxing our create arguments?) and remove create all
  • Should we also add that to kubectl update
    • Determine whether kubectl update should have create-or-update behavior (make it an option?)
  • Allow listing of items from an Object stream
  • Allow deletion of objects from an Object stream
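
One possible shape for the Object stream, sketched in Python: it accepts text containing concatenated JSON values and flattens plain arrays and List-kind wrappers into a single stream of objects. The "List" kind and "items" field are assumptions; reading directories and URLs is left out.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value from concatenated JSON text."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value


def object_stream(text):
    """Flatten arrays and List wrappers into a single stream of objects."""
    for value in iter_json_values(text):
        if isinstance(value, list):
            yield from value
        elif isinstance(value, dict) and value.get("kind") == "List":
            yield from value.get("items", [])
        else:
            yield value


sample = """
{"kind": "Pod", "id": "a"}
[{"kind": "Service", "id": "b"}]
{"kind": "List", "items": [{"kind": "Pod", "id": "c"}]}
"""
ids = [obj["id"] for obj in object_stream(sample)]
```

Since everything downstream sees only a stream of objects, create, update, list, and delete would not need to know which kind of source the objects came from.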

It seems like these can be applied to the existing operations for Kubectl in some cases.

  • get takes command line parameters for a resource
  • delete takes a file or command line parameters
  • update and create take only a file

The simplest option is to allow multiple -f options to be passed to create/update/delete, and support -f on get. The next option might be to allow optional arguments to delete / get that can set up a stream of resources (kubectl get -- <file1> <url2> <directory3>). It might be best to start with -f though and use the standard path separator to handle that.

@bgrant0607
Member

The above SGTM.

Note that deleting objects will require more client code unless we implement stop as discussed in #1535. I find myself doing the following quite often (e.g., after launching broken images, pods with the wrong restart policy or port spec, pods that depend on other services that I haven't yet started, etc.):

cluster/kubectl.sh get pods --no-headers | cut -d " " -f 1 | xargs -n 1 cluster/kubectl.sh delete pod 

Delete by label selector would be super-useful, also.

Will respond re. update behavior in a bit.

@bgrant0607
Member

Updates:

I believe create currently fails with AlreadyExists if the object already exists, and that update fails if the object either doesn't exist or the correct resourceVersion isn't specified. I think both operations should accept a --force flag. On create, any objects that already exist should be deleted. On update, the object should be created if it doesn't exist and the resourceVersion should be ignored by the apiserver or simply extracted from the current object.

Extended update/reconciliation functionality:

  • Update preconditions based on the values of arbitrary object fields. One could think of this as adding some safety to --force, or as a more user-friendly form of precondition than resourceVersion.
  • An option to just change the specified fields, by GETing the current object and copying over the specified fields. Useful for surgical updates. Would be nice to be able to combine it with the precondition option.
  • Record configured fields in an annotation and merge the union of those fields and fields currently specified. Similar to the option above, except robust to fields being dropped from the config and falling back on default values.
  • Diff whole object, specified fields, or configured fields. One could think of this as update dry run. Dry run would be useful for create and delete, also.
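
The first, second, and fourth options above (field preconditions, changing only the specified fields, and diff as a dry run) can be illustrated with a short Python sketch. The function names and the "desiredState" example fields are hypothetical, and recording configured fields in an annotation is not shown.

```python
def check_preconditions(current, preconditions):
    """True if every named top-level field has the expected value."""
    return all(current.get(field) == want for field, want in preconditions.items())


def merge_fields(current, specified):
    """Copy only the specified fields over the current object, recursively."""
    merged = dict(current)
    for key, value in specified.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_fields(merged[key], value)
        else:
            merged[key] = value
    return merged


def diff(current, desired, prefix=""):
    """Return {field path: (old, new)} for every field that differs."""
    changes = {}
    for key in sorted(set(current) | set(desired)):
        path = f"{prefix}{key}"
        old, new = current.get(key), desired.get(key)
        if isinstance(old, dict) and isinstance(new, dict):
            changes.update(diff(old, new, path + "."))
        elif old != new:
            changes[path] = (old, new)
    return changes


current = {"id": "web", "desiredState": {"replicas": 2, "image": "nginx"}}
specified = {"desiredState": {"replicas": 3}}
merged = merge_fields(current, specified)
```

Running diff(current, merged) before sending the update gives the dry-run view: only the replica count changes, and the image field that the config did not mention is left alone.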

Rolling update is another beast altogether. We can discuss that in #1353.

@bgrant0607
Member

It would also be useful for kubectl create -f file.yaml objectkind to not barf in the case that there are other kinds of objects in the file. The other kinds of objects should simply be ignored. Think of the objectkind as a filter. This allows users to put all their configs into one file, but select just some objects if they need to do something unusual. For the same reason, kubectl create -f file.yaml objectkind objectname would be useful as well.

We could either make this flag-controlled, or barf in the case that no objects are selected from the object stream.
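
Treating objectkind and objectname as filters over the object stream could look like the following Python sketch. The select function and its require_match flag are hypothetical; the flag stands in for the "barf if nothing is selected" choice.

```python
def select(objects, kind=None, name=None, require_match=False):
    """Keep only objects matching the optional kind and name filters."""
    chosen = [
        obj for obj in objects
        if (kind is None or obj["kind"].lower() == kind.lower())
        and (name is None or obj["id"] == name)
    ]
    if require_match and not chosen:
        raise LookupError("no objects selected from the object stream")
    return chosen


config = [
    {"kind": "Service", "id": "frontend"},
    {"kind": "Pod", "id": "web"},
    {"kind": "Pod", "id": "db"},
]
only_pods = select(config, kind="pod")
only_db = select(config, kind="pod", name="db")
```

Objects of other kinds are simply skipped, so a single file can hold the whole config while a command operates on a subset of it.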

@smarterclayton
Contributor Author

I'd also like to be able to specify multiple resource types on a get or delete label selector

kubectl delete pods,services -l foo=bar
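
A rough Python sketch of how the two arguments in that command might be interpreted: a comma-separated list of resource types and an equality-only label selector. The "resource" and "labels" field names and both function names are assumptions, and selector terms without "=" are not handled.

```python
def parse_selector(text):
    """Parse 'k1=v1,k2=v2' into a dict (equality terms only)."""
    pairs = (term.split("=", 1) for term in text.split(",") if term)
    return {key.strip(): value.strip() for key, value in pairs}


def select_for_delete(objects, resource_arg, selector_arg):
    """Pick objects whose resource type and labels both match."""
    resources = set(resource_arg.split(","))
    selector = parse_selector(selector_arg)
    return [
        obj for obj in objects
        if obj["resource"] in resources
        and all(obj.get("labels", {}).get(k) == v for k, v in selector.items())
    ]


cluster = [
    {"resource": "pods", "id": "web", "labels": {"foo": "bar"}},
    {"resource": "services", "id": "frontend", "labels": {"foo": "bar"}},
    {"resource": "pods", "id": "db", "labels": {"foo": "baz"}},
    {"resource": "replicationControllers", "id": "rc", "labels": {"foo": "bar"}},
]
doomed = select_for_delete(cluster, "pods,services", "foo=bar")
```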


@smarterclayton
Contributor Author

Create currently doesn't allow name or namespace to be overridden, but should. Starts to get closer to templating but name is very useful.

You shouldn't need to use objectkind on create - RESTMapper makes that automatic.


@bgrant0607
Member

I agree you should not need objectkind or objectname on create or any other operation if you want to apply the operation to all objects in the file/stream. I was proposing that these be allowed to be specified as optional filters on the set of objects affected. This functionality is very widely used internally to narrow the scope of operations and to exert external control over operation order.

@ghodss
Contributor

ghodss commented Nov 11, 2014

@smarterclayton What are the immediate next steps on #1905 (comment)? I have a couple days free that I can devote to a next task (e.g. creating the stream-of-object abstraction) if that makes sense.

@smarterclayton
Contributor Author

Yeah, probably. The pipeline probably needs to return not only valid objects, but also deal with errors at any step in the pipeline (so a consumer can report them or make a decision), the raw object data if necessary (for cases where we can't decode), the source (filename, line of file, etc), and potentially be composable to allow filters / transforms of the underlying object. For example, on an update stream we want to try and update the resource version if our MetadataAccessor and Decoder are able to transform the object, but if not, then we just pass the object through.

Most of that complexity probably isn't needed now, but we'll need to know errors and source location for sure.
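
A minimal Python sketch of a stream item that carries the source location, the raw data, and either a decoded object or an error, plus a composable transform step. The Item type, the line-per-object input format, and set_resource_version are hypothetical and only loosely mirror the MetadataAccessor/Decoder idea.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    source: str                       # e.g. "pods.json:3"
    raw: str                          # original text, kept even if decoding fails
    obj: Optional[dict] = None
    error: Optional[Exception] = None


def decode_lines(filename, lines):
    """Decode one JSON object per line, recording where each came from."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        source = f"{filename}:{number}"
        try:
            yield Item(source, line, obj=json.loads(line))
        except ValueError as err:     # json.JSONDecodeError is a ValueError
            yield Item(source, line, error=err)


def transform(items, fn):
    """Apply fn to decoded objects; pass failed items through untouched."""
    for item in items:
        if item.obj is not None:
            try:
                item = Item(item.source, item.raw, obj=fn(item.obj))
            except Exception as err:
                item = Item(item.source, item.raw, error=err)
        yield item


def set_resource_version(version):
    """Build a transform that stamps a resourceVersion onto each object."""
    return lambda obj: {**obj, "resourceVersion": version}


lines = ['{"kind": "Pod", "id": "a"}', "not json", ""]
items = list(transform(decode_lines("pods.json", lines), set_resource_version(7)))
```

The consumer can then report items[1].source and items[1].error for the bad line while still acting on the object that decoded cleanly.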

@bgrant0607 added the priority/backlog and priority/important-soon labels and removed the priority/backlog label on Dec 3, 2014.
@smarterclayton
Contributor Author

Object stream support is implemented in #3152, and will then be rolled out to all of the generic methods as needed.

@brendandburns
Contributor

@smarterclayton @ghodss @bgrant0607

This is completed, correct?

@bgrant0607
Member

Not entirely, but remaining items are covered by cli-roadmap.md.

@JeanMertz

I couldn't find this in the cli-roadmap.md document (or any of its linked GH issues), and wasn't able to see any mention of it in this issue, so I'll just put this here, but probably this has been mentioned somewhere already:

$ kubectl create -f rc-*.json
error: Unexpected args: [rc-2.json rc-3.json rc-4.json rc-5.json]
see 'kubectl create -h' for help.
$ kubectl create -f rc-1.json,rc-2.json,rc-3.json,rc-4.json,rc-5.json
replicationcontrollers/consul-ivake
replicationcontrollers/consul-8mtxa
replicationcontrollers/consul-wzb43
replicationcontrollers/consul-fppmw
replicationcontrollers/consul-au3wp

It would be nice if the first syntax (i.e., an array of files) were supported.

@smarterclayton
Contributor Author

Due to shell expansion, it's tricky the way we currently have it. Originally we wanted to allow args to the server and files in the same command, so -f took an arg rather than modifying the command. In the future we might want to allow you to create from a template on the server (pod template, generic template, RC from a deployment, etc.), so we should probably be careful about expanding -f. We could support another flag, -F, that turns all arguments into files.
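
The interaction between shell expansion and a flag that takes a single argument can be reproduced with a small Python sketch. It uses argparse purely as a stand-in for kubectl's own flag parsing; parse_create is a hypothetical helper, and the file names are the ones from the report above.

```python
import argparse


def parse_create(argv):
    """Parse a create-style command line where -f takes exactly one value."""
    parser = argparse.ArgumentParser(prog="create", add_help=False)
    parser.add_argument("-f", dest="filename")
    opts, extra = parser.parse_known_args(argv)
    files = opts.filename.split(",") if opts.filename else []
    return files, extra


# What the shell passes for `create -f rc-*.json` after expanding the glob:
# only the first file is bound to -f, the rest arrive as plain arguments.
files, extra = parse_create(["-f", "rc-1.json", "rc-2.json", "rc-3.json"])

# The comma-separated form reaches the command as a single -f value.
joined, leftover = parse_create(["-f", "rc-1.json,rc-2.json,rc-3.json"])
```

In the first call the remaining file names end up in extra, which corresponds to the "Unexpected args" error; a flag that switches all arguments to files would remove that ambiguity.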


6 participants