
Add ability to assign custom name to NEG via annotation. #919

Closed · jaceq opened this issue Oct 29, 2019 · 36 comments
Labels: kind/feature, lifecycle/rotten

Comments

@jaceq
Contributor

jaceq commented Oct 29, 2019

As per the title, I'd like to be able to assign a custom, user-provided name to a NEG via annotation (NegAttributes).

It seems there is a mention of this in code: https://godoc.org/k8s.io/ingress-gce/pkg/annotations#NegAttributes
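For context, here is roughly what the existing annotation looks like today; it only chooses which ports get standalone NEGs, and the resulting NEG names are auto-generated (the service name, selector, and ports below are illustrative):

```yaml
# Today's behaviour: the annotation only selects which Service ports get
# standalone NEGs; the NEG names themselves are auto-generated by the controller.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80": {}}}'
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```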

@rramkumar1 rramkumar1 added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 19, 2019
@jaceq
Contributor Author

jaceq commented Feb 7, 2020

@freehan @rramkumar1 Any update on this?

@zachdaniel

This would be really great!

@deedf

deedf commented Feb 23, 2020

If it let the user specify a stable, pre-existing NEG name and then kept that NEG in sync, it would be really helpful.

@bowei
Member

bowei commented Feb 23, 2020

@freehan -- what do you think? This seems like low-hanging fruit.

@deedf

deedf commented Feb 25, 2020

The intended usage for me is to register a bunch of NEGs as backends to a GCLB backend service, and have the membership of those known NEGs be managed automatically.

I think I need to do this because there currently does not seem to be a pure k8s solution to multi-cluster ingress.
There is currently GoogleCloudPlatform/gke-autoneg-controller, which can do something similar, but you need to run a custom controller, and I think this could be handled entirely at the NEG level without caring about the Backend part.

There is also a nice writeup at https://blog.jetstack.io/blog/container-native-multi-cluster-glb/ where the optimal setup ends up being a single global LB with k8s services as backends, and they have to jump through hoops to inject the automatically generated NEG names into their Terraform config.

If the name could be specified in advance for an existing NEG, there would be a really clean solution to multi-cluster deployments on GCP using global container native load balancing, pending a pure k8s solution.

Of course there are numerous implementation details to discuss if there is further interest.

@spencerhance
Contributor

@deedf https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-for-anthos might also work for you

@deedf

deedf commented Feb 28, 2020

@deedf https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-for-anthos might also work for you

Looks like it might, thanks !

Also, thinking about it more, for the lightweight use case where you just want your deployments to register to an externally configured load balancer, I think the approach used by GoogleCloudPlatform/gke-autoneg-controller is actually exactly right, since using pre-defined NEGs would force the user to create them in every possible zone beforehand.

@zachdaniel

I didn’t know about https://github.com/GoogleCloudPlatform/gke-autoneg-controller, and I’ll probably use it to solve for this 😍

@bowei
Member

bowei commented Mar 4, 2020

Note: we are looking at the autoneg use case -- so there will likely be something more "official" integrated into ingress/service.

@zachdaniel

I've tried out the gke-autoneg-controller and I wasn't able to get it working in the time I had to spend on it. Someone with more ops chops than me might be able to though. But my ideal would absolutely be just allowing us to name it in the annotation :)

@jaceq
Contributor Author

jaceq commented Mar 5, 2020

I tried with autoneg but didn't get it to work... it seems complicated, and the documentation isn't great...
Overall, as I understand it, the steps needed are:
-> enable the cloud-platform oauth scope
-> install the autoneg deployment, roles etc.
-> create LB backend services (without groups)
-> create compatible healthchecks
-> create services with matching autoneg / neg annotations (see the sketch below)
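
For that last step, a rough sketch of what such a Service could look like. The exact autoneg annotation key and payload below are from memory and may differ between controller versions (check the gke-autoneg-controller README); the service, port, and backend-service names are placeholders.

```yaml
# Hypothetical sketch: a Service exposing a standalone NEG on port 80 and asking
# the autoneg controller to attach it to a pre-existing backend service.
# The autoneg annotation key/payload and all names are illustrative -- verify
# against the gke-autoneg-controller documentation for your version.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80": {}}}'
    anthos.cft.dev/autoneg: '{"name": "my-backend-service", "max_rate_per_endpoint": 100}'
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```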

Also, in my specific case (I use Terraform a lot), using autoneg introduces 'hidden' (from Terraform's point of view) dependencies.
In my case, just being able to name a NEG via annotation would resolve my issues and allow me to get this to work in no time.

ps. I gave up on autoneg when I noticed how many components I'd have to rewrite from .yaml to Terraform just to get it to run (again, I use Terraform a lot)

@bowei
Member

bowei commented Mar 5, 2020

cc: @mark-church

@mark-church

@deedf @jaceq @zachdaniel I'm interested to know more about autoneg vs NEG naming and how they impact your use-cases. We are looking at solutions for implementation of this in the GKE Svc/NEG controller. Here are the pros and cons in my mind. Would love to know your thoughts:

  • NEG naming (allowing the NEG name to be specified in service.yaml)
    • Pros: much faster to implement and ship; a flexible and straightforward solution
    • Cons: puts the burden of unique name generation on the user; conflicts would fail, so checking for existing NEGs might be required; still requires the step of adding the NEG to the backend service
  • Autoneg (allowing the BES to be specified in service.yaml)
    • Pros: automatically connects the NEG to the BES without an extra step; requires fewer manual steps from the user
    • Cons: some corner cases need to be figured out, such as how to handle backend addition/removal if the BES or NEG is deleted, so there is some additional complexity that will take more time (@freehan can probably add more detail).

Also, what are the specific use-cases and which LBs (internal/external) do you have for standalone NEGs - multi-cluster, using LB features not supported in GKE, or doing manual deployment just for fun?

@deedf

deedf commented Mar 8, 2020

Mark, thanks for your interest.

My use case is configuring a global backend service in front of multiple regional backends. I did not know about Ingress for Anthos despite doing quite a bit of searching; the S/N ratio wasn't that good. Still, it seems a bit top-heavy from skimming through the docs, but I need to look at it more, so I'll leave it aside.

The original idea was to have a single global backend service configured outside of k8s, and a way to automatically register deployments as backends when and where they were scheduled manually or automatically. It certainly was not for fun, just to work around the k8s cluster split brain syndrome in the face of global resources, and to use features not supported by k8s.

Trying to expand on your points:

  • NEG naming (allowing NEG name to be specified in service.yaml)

    • Cons: requires pre-registering the NEGs in every possible zone in advance, since NEGs are zone-bound AIUI. Then you have to keep them up to date with zone turnups or teardowns, etc.
  • Autoneg (allowing BES to be specified in service.yaml)

    • Pros: acts at schedule/deschedule time instead of requiring preparation in advance.

@jaceq
Contributor Author

jaceq commented Mar 9, 2020

@mark-church I am using GCLB with features unsupported by ingress-gce (Cloud Armor for IP whitelisting, and custom health checks, since basic auth is in place and some of my endpoints do not return 200 without Authorization headers).

Honestly, I think it would make sense to have both options (named NEGs and autoneg) so there would be an option to choose the best solution.

Also, in my specific case, I use IaC (Terraform) extensively, and using named NEGs would simply make my life easier, given that the dependency graph would be handled by TF only and not mixed. With autoneg I introduce 'silent' dependencies: at the time of service creation my backend has to already be in place (otherwise I get no registration), and that is not guaranteed in Terraform, as it is not aware of that dependency.

@samschlegel

+1 to NEG naming

We also use Terraform extensively, and are currently looking into moving services from MIGs into Kubernetes. Not having a way to know the name of the NEG in advance means we run into similar issues of having to create the Service, pull the auto-created NEG name down, and then manually pass in the name to our Terraform config. I'd also feel much safer having control over this NEG, as I'm not sure what would cause a potential recreation of the NEG, leading to a name change and service downtime.

@samschlegel

Perhaps what we'd need is less custom naming and more the ability to provide a self-link to an externally created NEG that the controller should manage. We're currently running into issues trying to use the Internal HTTP Load Balancer in a Shared VPC, as all the resources for that must live in the host project, but the NEG this controller creates lives in the service project.

@jaceq
Contributor Author

jaceq commented Mar 17, 2020

@samschlegel Basically there are 2 options in our case.

  1. Pass a custom name to the NEG
    OR
  2. Be able to read the randomly generated name back after NEG creation; in fact I opened a ticket for that too: Add service attribute that would return name of a NEG hashicorp/terraform-provider-kubernetes#668

@deedf

deedf commented Mar 17, 2020

I think the naming problem is also solved by having an annotation in a Service that tells it which Backend Service to register to, just like gke-autoneg does. Then you don't care if the NEG name is generated or not.

I also don't understand how people who advocate using pre-registered NEGs plan to keep them in sync with zone turnups/turndowns (NEGs are per-zone).

@zachdaniel

zachdaniel commented Mar 18, 2020

We’re using kustomize + cloud config connector, so we can easily generate resources with unique names (using a prefix/suffix). Being able to name the NEG would be perfect for us.

We want to register that NEG with a compute URL map.

@samschlegel

samschlegel commented Mar 20, 2020

@deedf We define services in Bazel and generate both our k8s manifests and our terraform configs there as well, so spinning up or tearing down a zone is just modifying a list we pass in when generating configs

I currently don't like the idea of having the backend service not managed by Terraform as that's how we manage all of our MIG-related infrastructure, so something like gke-autoneg is out of the picture for us

@deedf

deedf commented Mar 21, 2020

@samschlegel the "how" to spin up or tear down a zone is trivial, the problem is the "when". How do you know that a GCP zone has been spun up or torn down ? How do you hook your update process to these changes ?
Also I don't really understand the argument about not wanting the backend service not managed by Terraform since what gke-autoneg does IIUC is to keep the NEGs in sync with a preexisting backend service that can very well be created by Terraform. What gke-autoneg manages is backends not backend services.

@samschlegel

samschlegel commented Mar 26, 2020

Ah, looking at it more, autoneg would probably work for us; it just means we need to make Terraform non-authoritative over which backends a backend service has.

re: "when" to spin up or tear down a zone, perhaps I'm misunderstanding what you mean there. All of our zone spin-ups and tear-downs are manual, so the "when" is just part of our deployment scripts.

@tonybenchsci

We’re using kustomize + cloud config connector, so we can easily generate resources with unique names (using a prefix/suffix). Being able to name the NEG would be perfect for us.

We want to register that NEG with a compute URL map.

Same for our company: we use kustomize + KCC, but we have to copy the autogenerated NEG name after creating application services, which is kind of a pain.

@bert-laverman

@deedf https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-for-anthos might also work for you

This controller contains several errors, the worst of which is that the example uses a Backend Service name with an underscore in it, which is not allowed. I tried this before realising the mistake, and now I have a Service that cannot be deleted because the controller won't progress past the error in the name, and it blocks.

I am no Go (or k8s internals) developer, so I do not know how to solve this. I cannot even throw away the test namespace because it waits for that service to disappear.

@bert-laverman

Update: It appears you can actually delete stuff by manually editing the state of the Service and removing the finalizer pointing at the failing code.
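
A minimal sketch of that workaround, assuming the finalizer sits in the Service's metadata (the finalizer string and service name here are placeholders, not the controller's actual identifiers):

```yaml
# Roughly what "kubectl edit svc my-service" shows for the stuck Service.
# Deleting the offending entry from metadata.finalizers and saving lets the
# Service (and the namespace) finish deleting. The finalizer value is a placeholder.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  finalizers:
    - example.com/stuck-finalizer   # remove this entry, then save
```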

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2020
@sho-abe

sho-abe commented Aug 11, 2020

I think the feature of header/query-based routing without addons (e.g. Istio) is useful.
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 11, 2020
@mark-church

Hi all, custom NEG naming will be coming fairly soon. We'll be releasing it to GKE 1.18 in the late Aug-Sept timeframe.

@tonybenchsci

@mark-church Any updates?

@mark-church

Yup - it's currently targeted to roll out to GKE 1.18 as Beta functionality in the first week of October. Please don't be surprised if things are off by a week or two :)

Because there are some major changes to the ingress controller in this rollout it's unlikely that we will be able to safely backport to older GKE versions. It is likely that this will just be available in GKE 1.18 and newer releases.

@deedf

deedf commented Oct 19, 2020

Yup - it's currently targeted to roll out to GKE 1.18 as Beta functionality in the first week of October. Please don't be surprised if things are off by a week or two :)

@mark-church Any news?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 17, 2021
@jaceq
Contributor Author

jaceq commented Jan 18, 2021

Let's not close that, dear @fejta-bot

@freehan
Contributor

freehan commented Jan 29, 2021

This feature is in public preview. https://cloud.google.com/kubernetes-engine/docs/how-to/standalone-neg#create_a_service

Feel free to try it out and report any issues. Thanks!
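
For reference, the released syntax adds a per-port name field to the NEG annotation, roughly as below (the port, service name, and NEG name are illustrative; the linked docs are authoritative):

```yaml
# Custom NEG naming as shipped in GKE 1.18+: a per-port "name" field in the
# NEG annotation. Port, service name, and NEG name below are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80": {"name": "my-custom-neg"}}}'
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```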

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 28, 2021
@freehan freehan closed this as completed Mar 1, 2021