Use placement groups for low latency networking between nodes in an instance group #367

Closed
erutherford opened this issue Aug 24, 2016 · 28 comments

@erutherford

Placement groups can lower latency and increase bandwidth for network traffic between nodes. kops should offer an option that lets people use placement groups in AWS when spinning up instance groups.
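As a rough illustration of what such an option would map to on the AWS side (this is not kops's actual API; the group, ASG, launch configuration, and subnet names below are hypothetical), a cluster placement group would be created first and then referenced by the Auto Scaling group that backs the instance group, e.g. with boto3:

```python
import boto3

# Hypothetical names for illustration only; kops would derive its own.
PLACEMENT_GROUP = "nodes-us-east-1a-pg"

ec2 = boto3.client("ec2", region_name="us-east-1")
autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# A "cluster" placement group packs instances close together within one AZ
# for low-latency, high-bandwidth networking between them.
ec2.create_placement_group(GroupName=PLACEMENT_GROUP, Strategy="cluster")

# The ASG backing the instance group then launches into that group.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="nodes-us-east-1a",
    LaunchConfigurationName="nodes-lc",   # assumed to already exist
    MinSize=3,
    MaxSize=3,
    DesiredCapacity=3,
    PlacementGroup=PLACEMENT_GROUP,
    VPCZoneIdentifier="subnet-aaaa1111",  # single-AZ subnet (hypothetical ID)
)
```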

@erutherford erutherford changed the title Use placement groups for low latency networking in the cluster Use placement groups for low latency networking between nodes in an instance group Aug 24, 2016
@zapman449

  1. Can you ASG a placement group?
  2. This basically forces you into a single-AZ cluster
  3. As noted in Slack, this does increase the blast radius of a single AWS hardware system failing catastrophically
  4. Making this a 'non-default' option is reasonable.

@erutherford
Author

There are definitely some limitations, but to Justin's point (in Slack), people should at least be allowed to specify it as an option when spinning up an instance group (which is what you're saying in point 4).

  1. Can you ASG a placement group?

Based on the documentation, it appears to be supported. They just mention that it's ideal to add all members to the placement group when you initialize it, or you may run into an insufficient-capacity error; you can also hit that error when stopping and starting a node in a placement group.
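For context on that capacity caveat: launches into a cluster placement group can fail with an InsufficientInstanceCapacity error, which callers typically retry. A minimal sketch with boto3 (the helper name and retry policy are illustrative assumptions, not anything kops does today):

```python
import time

import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="us-east-1")


def run_in_placement_group(group_name, retries=5, **run_args):
    """Launch instances into a placement group, retrying on capacity errors."""
    for attempt in range(retries):
        try:
            return ec2.run_instances(Placement={"GroupName": group_name}, **run_args)
        except ClientError as err:
            if err.response["Error"]["Code"] != "InsufficientInstanceCapacity":
                raise
            # Capacity may free up shortly; back off and try again.
            time.sleep(30 * (attempt + 1))
    raise RuntimeError(f"no capacity available in placement group {group_name!r}")
```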

@justinsb justinsb modified the milestone: 1.3.1 Sep 24, 2016
@justinsb justinsb modified the milestones: 1.5.0, 1.5, 1.5.1 Dec 28, 2016
@dlutsch

dlutsch commented Mar 23, 2017

Is there any update on this feature?

@svperfecta

Note: I think the idea is to create a single placement group per availability zone.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 14, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 13, 2018
@layer3switch

As per @zapman449's questions:

  1. Can you ASG a placement group?

Yes. An ASG via a Launch Template or CloudFormation supports this.

  2. This basically forces you into a single-AZ cluster

This is no longer true. AWS now supports Spread Placement Groups, which allow a maximum of 7 instances per AZ per placement group. At this time, this appears to be a hard limit.

  3. As noted in Slack, this does increase the blast radius of a single AWS hardware system failing catastrophically

Spread Placement Groups address this concern.
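To make the spread variant concrete, here is a sketch of what this could look like against the EC2 and Auto Scaling APIs (the group, ASG, launch configuration, and subnet names are hypothetical; the ASG is sized to stay within the 7-instances-per-AZ spread limit mentioned above):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# A "spread" placement group puts each instance on distinct underlying
# hardware, reducing the blast radius of a single hardware failure.
ec2.create_placement_group(GroupName="masters-spread-pg", Strategy="spread")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="masters-us-east-1a",
    LaunchConfigurationName="masters-lc",  # assumed to already exist
    MinSize=3,
    MaxSize=3,                             # well under the 7-per-AZ spread limit
    DesiredCapacity=3,
    PlacementGroup="masters-spread-pg",
    VPCZoneIdentifier="subnet-bbbb2222",   # hypothetical subnet ID
)
```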

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@jnicholls

jnicholls commented Apr 13, 2018

/reopen Can we reopen this? I think it's pertinent for kops to support placement groups for node groups (both cluster and spread variants).

@suchakjani

suchakjani commented May 4, 2018

We need placement groups for things like NFT tests for certain applications.
We would appreciate it if this could be done.

@k8s-ci-robot
Contributor

@suchakjani: you can't re-open an issue/PR unless you authored it or you are assigned to it.

In response to this:

/reopen
We need placement groups for things like NFT tests for certain applications.
Would appreciate if this is done

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@erutherford
Author

/reopen
I don't need it anymore, but it seems others do and I still think it was a good idea.

@k8s-ci-robot k8s-ci-robot reopened this May 4, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@phs

phs commented Sep 17, 2018

My use case is looking at multiple clusters, each sandboxed to a single AZ.

A spread placement group would therefore be useful specifically for the masters.

@erutherford
Author

/remove-lifecycle rotten
/reopen

@k8s-ci-robot
Contributor

@erutherford: Reopening this issue.

In response to this:

/remove-lifecycle rotten
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Sep 17, 2018
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 17, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 16, 2018
@qrevel

qrevel commented Jan 13, 2019

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 13, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 13, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 13, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@geekofalltrades
Contributor

We would like this as well. Multiple folks have expressed interest in this over time. Can it be reopened and /lifecycle frozen so it doesn't rot again?

@MahaGamal

MahaGamal commented Jan 14, 2020

I think it's pertinent for kops to support placement groups for node instance groups (both cluster and spread).
/reopen

@k8s-ci-robot
Contributor

@MahaGamal: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/remove-lifecycle rotten
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 14, 2020
justinsb added a commit to justinsb/kops that referenced this issue Dec 2, 2020
The important PR we want to pick up is 369, fixing a bug when
ListenMetricsURLS is set as an env var.

Full changelist:

* Release notes for 3.0.20201117 [kubernetes#364](kopeio/etcd-manager#364)
* Fix gofmt [kubernetes#365](kopeio/etcd-manager#365)
* Add gofmt check to github actions [kubernetes#366](kopeio/etcd-manager#366)
* Add boilerplate to tools/deb-tools/main.go [kubernetes#367](kopeio/etcd-manager#367)
* Do not set ListenMetricsURLS [kubernetes#369](kopeio/etcd-manager#369)
* Fix bazel formatting [kubernetes#370](kopeio/etcd-manager#370)
hakman pushed a commit to hakman/kops that referenced this issue Dec 2, 2020
@minkimipt
Contributor

I think it's still relevant. Can we reopen?
/reopen

@k8s-ci-robot
Contributor

@minkimipt: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

I think it's still relevant. Can we reopen?
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@olemarkus
Member

I suggest opening a new issue that details what you want to achieve and how. I'd say there's a fairly high chance you'll need to contribute the PR as well.
