Skip to content
This repository has been archived by the owner on Apr 17, 2019. It is now read-only.

[WIP] DNS nanny - Autoscale replication controller based on number of nodes #1427

Closed

Conversation

girishkalele
Copy link
Contributor

Adaption of the addon-resizer nanny worked out pretty well.

@girishkalele
Copy link
Contributor Author

cc @thockin @roberthbailey

@girishkalele girishkalele self-assigned this Jul 23, 2016
@girishkalele girishkalele force-pushed the dns_nanny branch 4 times, most recently from 917d8c9 to cd4c32b Compare July 23, 2016 04:03
@roberthbailey roberthbailey assigned Q-Lee and unassigned girishkalele Jul 23, 2016
@roberthbailey
Copy link
Contributor

@Q-Lee since you wrote the addon-resizer can you take a look at this?

@girishkalele
Copy link
Contributor Author

@Q-Lee

I think the failsafe checks need a little change, instead of continuing with the old params, I think it will be more correct that if we fail to parse the config, we log.Fatal and just bail out of the container.
This will immediately surface to the operator that the configmap is not formatted correctly.

@girishkalele girishkalele force-pushed the dns_nanny branch 3 times, most recently from 862276c to a17f110 Compare July 26, 2016 00:40
@fgrzadkowski
Copy link
Contributor

Why aren't we using addon-resizer for this?

@fgrzadkowski
Copy link
Contributor

If we decide to write something new, please don't use a generic name like pod-autoscaler. It suggests that it's the recommended way for doing things, and I believe we all agree it's not. If this is intended to only solve dns problem, then maybe dns-nanny or dns-resizer?

@girishkalele
Copy link
Contributor Author

@fgrzadkowski

The existing addon-resizer works so differently - it doesn't resize replicas, but rather the cpu and memory limits of the pod itself and it works on pods and deployments, not on RCs - that it would have been pretty much two different functionalities in the same binary with more complicated flags than to make a new one for DNS RC replica resizing.

I will change the name to be DNS specific (although the new one is written as a generic sidecar container that can resize any RC it belongs to, so it can be dropped into any RC and it will auto-scale it based on the configmap params)

@girishkalele girishkalele force-pushed the dns_nanny branch 2 times, most recently from 0dc925c to fc7f812 Compare July 27, 2016 22:02
# pod-autoscaler

This container image watches over the number of schedulable nodes in the cluster and resizes
the number of replicas in its parent object. Works only for pods which are children of RCs or RSs objects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a short explanation of HOW would be good.

@thockin
Copy link
Contributor

thockin commented Aug 2, 2016

You're going to have to break this into multiple commits. One for your
code and another for deps, at least. Github basically can't load this.

As a name, can I propose "horizontal-self-scaler" ?

On Wed, Jul 27, 2016 at 10:05 AM, Girish Kalele [email protected]
wrote:

@fgrzadkowski https://github.com/fgrzadkowski

The existing addon-resizer works so differently - it doesn't resize
replicas, but rather the cpu and memory limits of the pod itself and it
works on pods and deployments, not on RCs - that it would have been pretty
much two different functionalities in the same binary with more complicated
flags than to make a new one for DNS RC replica resizing.

I will change the name to be DNS specific (although the new one is written
as a generic sidecar container that can resize any RC it belongs to, so it
can be dropped into any RC and it will auto-scale it based on the configmap
params)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1427 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AFVgVNrTJrJcX_hxEWK3untAEJAzSE9Wks5qZ4_JgaJpZM4JTRpr
.

// "resourceVersion":"20339"}}
//

func (k *kubernetesClient) GetParentRc(namespace, podname string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General note: if you give a function a public name (leading capital letter) you had better have a GoDoc comment explaining it. Given that this is all one package (which is fine) I doubt this needs a public name. Same elsewhere.

log.Fatal(err)
}
k8s := nanny.NewKubernetesClient(*podNamespace, *podName, clientset)
log.Printf("Looking for parent/owner of pod %s/%s\n", *podNamespace, *podName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a log.Infof?

@girishkalele
Copy link
Contributor Author

@Q-Lee

Will do...I am in the process of moving it to a new repo, will send out the PR for that one.

func (s Scaler) scaleWithNodes(numCurrentNodes uint64) uint64 {
newMap, err := ParseScalerParamsFile(s.ConfigFile)
if err != nil {
log.Fatalf("Parse failure: The configmap volume file is malformed (%s)\n", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you validate this at start up?

@erictune
Copy link
Contributor

erictune commented Aug 2, 2016

@mml this PR is trying to solve a problem that you have thought about: given a pod, find its controller (you in the context of disruption budget). You might have useful comments.

@erictune
Copy link
Contributor

erictune commented Aug 2, 2016

@bgrant0607 said not to use created-by, and to prefer to use Owner References. (which I think is GetOwnerReferences from pkg/api/meta/meta.go. @lavalamp and @caesarxuchao should know about Owner References.

@Q-Lee
Copy link
Contributor

Q-Lee commented Aug 2, 2016

The addon-resizer was written with the intent to be extended in this fashion if the use case ever arose. It should be straight forward to extend it in this fashion.

@lavalamp
Copy link
Contributor

lavalamp commented Aug 2, 2016

I'm not a fan of doing this from every DNS pod in the cluster.

@lavalamp
Copy link
Contributor

lavalamp commented Aug 2, 2016

We hope owner references will be in beta and on-by-default in 1.4 but it's not certain at this point.

@girishkalele
Copy link
Contributor Author

I created kubernetes/kubernetes#29922 to request addition of the Created-By annotation to ReplicaSets but we can close that if the OwnerReferences will work for going from ReplicaSet -> parent Deployments.

@thockin
Copy link
Contributor

thockin commented Aug 2, 2016

Have a simpler solution?

On Tue, Aug 2, 2016 at 3:46 PM, Daniel Smith [email protected]
wrote:

I'm not a fan of doing this from every DNS pod in the cluster.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1427 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AFVgVCfhT1_CoYaLHZGBpAV1-4uyfxwTks5qb8i4gaJpZM4JTRpr
.

@lavalamp
Copy link
Contributor

lavalamp commented Aug 2, 2016

Run one sizing thing on the master node. Alternatively, run one of these
pods in its own RS.

On Tue, Aug 2, 2016 at 4:31 PM, Tim Hockin [email protected] wrote:

Have a simpler solution?

On Tue, Aug 2, 2016 at 3:46 PM, Daniel Smith [email protected]
wrote:

I'm not a fan of doing this from every DNS pod in the cluster.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1427 (comment)
,
or mute the thread
<
https://github.com/notifications/unsubscribe-auth/AFVgVCfhT1_CoYaLHZGBpAV1-4uyfxwTks5qb8i4gaJpZM4JTRpr

.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1427 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglgBAE44o_nOBYsZ6cZOasHyzXj58ks5qb9NRgaJpZM4JTRpr
.

@Q-Lee
Copy link
Contributor

Q-Lee commented Aug 3, 2016

There should be no assumptions in the code that this container must scale its own pod. Inferring defaults is fine, but I need to be able to explicitly override that behavior. And if owner references are not a sure thing for 1.4, then we may consider dropping the inference altogether in this release.

@thockin - I think dan has a good point. If we extend the addon-resizer to another type of scaling, then we could also change it to scale multiple pods. Then it can consume a yaml object per scale, and a single pod could watch all of our addons.

The only down side is that this pod would require a deployment every time an addon is enabled or disabled.

@thockin
Copy link
Contributor

thockin commented Aug 3, 2016

I figured running as a sidecar was cute, but I guess it could just be a
standalone pod that takes a string argument like
"replicationcontrollers/kube-dns-v19" or "deployments/kube-dns".

What do you think Girish?

For the sake of simplicity, do we really want to jam this together with the
existing vertical sizer? They are pretty different things.

On Tue, Aug 2, 2016 at 5:53 PM, Q-Lee [email protected] wrote:

There should be no assumptions in the code that this container must scale
its own pod. Inferring defaults is fine, but I need to be able to
explicitly override that behavior. And if owner references are not a sure
thing for 1.4, then we may consider dropping the inference altogether in
this release.

@thockin https://github.com/thockin - I think dan has a good point. If
we extend the addon-resizer to another type of scaling, then we could also
change it to scale multiple pods. Then it can consume a yaml object per
scale, and a single pod could watch all of our addons.

The only down side is that this pod would require a deployment every time
an addon is enabled or disabled.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1427 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AFVgVMuVPhhEMbG70ebGEPCr47TX4mU7ks5qb-Z7gaJpZM4JTRpr
.

@girishkalele
Copy link
Contributor Author

@thockin That works too - we can add another RC that provides one replica of the horizontal resizer and point it towards the rc/deployment it should be scaling.

The addon-resizer scales the fluentd pod vertically and the DNS use-case is horizontal scaling. My first look at it, I figured combining them would make it overly complex for a single component, and would be better as two independent modules.
@Q-Lee suggested we can combine them into a single all-purpose scaler. Perhaps we can look at the complexity of the configuration to combine them and decide if combining is simpler.

@thockin
Copy link
Contributor

thockin commented Aug 3, 2016

I'd say - IF we want to combine them, we do it later. Let's get something
implemented. If this just takes an arg, it's even simpler.

On Tue, Aug 2, 2016 at 9:48 PM, Girish Kalele [email protected]
wrote:

@thockin https://github.com/thockin That works too - we can add another
RC that provides one replica of the horizontal resizer and point it towards
the rc/deployment it should be scaling.

The addon-resizer scales the fluentd pod vertically and the DNS use-case
is horizontal scaling. My first look at it, I figured combining them would
make it overly complex for a single component, and would be better as two
independent modules.
@Q-Lee https://github.com/Q-Lee suggested we can combine them into a
single all-purpose scaler. Perhaps we can look at the complexity of the
configuration to combine them and decide if combining is simpler.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1427 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AFVgVCB4XZRmVgxCHXWATPapdOl8wL0Fks5qcB2SgaJpZM4JTRpr
.

@girishkalele
Copy link
Contributor Author

To summarize a few of the points:

  1. Current config file format is ugly, switch to a standard format - switching to JSON. (Embedding yaml strings inside the configmap yaml seems to give me indentation headaches).
  2. Use ConfigMap ResourceVersion to optimize refresh logic.
  3. Remove created-by and/or OwnerReference magic and point to the target RC/RS/Deployment via cmd line args. This scaler will run as an RC independent from the controller it is scaling.
  4. Implement horizontal scaler independently for now, combine them later if possible.
  5. Close this PR and create a PR from the new repo.

@girishkalele
Copy link
Contributor Author

Moved out of contrib repo.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants