Create manual network instead of automatic network for PR jobs. #4472

Closed
MrHohn opened this issue Sep 8, 2017 · 12 comments

@MrHohn
Member

MrHohn commented Sep 8, 2017

Ref quota issues on k/k:
kubernetes/kubernetes#46713
kubernetes/kubernetes#47362
kubernetes/kubernetes#48688
kubernetes/kubernetes#51646
kubernetes/kubernetes#52140

Lately we've been periodically hitting subnetwork quota issues whenever GCP adds regions, because each PR job creates an auto-mode network, which contains one subnetwork per region. When a quota issue occurs on PR jobs, it is likely to block the submit queue (by preventing PRs from getting into the SQ). This hurts our development velocity to a certain degree.

Per the discussion on kubernetes/kubernetes#51136 (comment) and kubernetes/kubernetes#51646 (comment), I think we should figure out a way to use a manual network for PR jobs to avoid hitting the subnetwork quota issue again.
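
To make the quota pressure described above concrete, here is a rough back-of-the-envelope sketch in Python. The region count, job count, and quota value are made-up illustrative numbers, not figures from the project; the only fact taken from the description is that an auto-mode network holds one subnetwork per region.

```python
def subnetworks_used(concurrent_jobs: int, subnets_per_network: int) -> int:
    """Total subnetworks consumed when every job creates its own network."""
    return concurrent_jobs * subnets_per_network


# Hypothetical numbers, for illustration only.
REGIONS = 12            # assumed number of GCP regions
CONCURRENT_JOBS = 20    # assumed number of PR jobs running at once
SUBNET_QUOTA = 250      # assumed per-project subnetwork quota

auto_mode = subnetworks_used(CONCURRENT_JOBS, REGIONS)  # one subnet per region
manual_mode = subnetworks_used(CONCURRENT_JOBS, 1)      # one subnet per job

print(f"auto: {auto_mode}/{SUBNET_QUOTA}, manual: {manual_mode}/{SUBNET_QUOTA}")

# Adding one region raises auto-mode usage by CONCURRENT_JOBS subnetworks,
# while manual-mode usage stays the same.
extra_after_new_region = subnetworks_used(CONCURRENT_JOBS, REGIONS + 1) - auto_mode
print(f"extra subnetworks after one new region: {extra_after_new_region}")
```

Under these assumed numbers, a single new region is enough to push auto-mode usage past the quota, whereas manual-mode usage depends only on the number of jobs.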

Not sure who the right person to assign is. I will try to take a look at whether it is doable and how much work it will be.

/assign
cc @krzyzacy @yujuhong @nicksardo @cblecker @xiangpengzhao FYI

@krzyzacy
Member

krzyzacy commented Sep 8, 2017

I'm happy to help with this issue! (if I know what to do ❀◟(ó ̯ ò, ))

@cblecker
Member

cblecker commented Sep 8, 2017

Wholeheartedly support this. :)

@yujuhong
Contributor

yujuhong commented Sep 8, 2017

I will try to take a look at whether it is doable and how much work it will be.

Thanks!

@nicksardo
Contributor

If everything uses a manual network, should we create a new suite of networking tests that continue to run on an automatic network (and limit the # of jobs in that project to something low)?

@yujuhong
Contributor

yujuhong commented Sep 8, 2017

If this is only a problem for pre-submit jobs, maybe we can keep the post-submit jobs on auto networks? Ignore me if that's not the case.

@MrHohn
Member Author

MrHohn commented Sep 8, 2017

If this is only a problem for pre-submit jobs, maybe we can keep the post-submit jobs on auto networks? Ignore me if that's not the case.

Thanks, I was about to say the same thing: we could keep post-submit jobs on auto networks :)

@MrHohn
Member Author

MrHohn commented Sep 13, 2017

Did a bit of investigation. Two high-level tasks come to mind:

  • Change the GCE kube-up scripts to support creating a custom network.
  • Change PR jobs to use that feature.
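
For reference, creating a custom-mode network with a single subnetwork roughly corresponds to the two gcloud invocations built below. This is a hedged sketch, not the change proposed for the kube-up scripts: the helper name, the network and subnetwork names, the region, and the CIDR range are all assumptions, and the function only builds the argument lists without running them.

```python
from typing import List


def custom_network_commands(
    network: str, region: str, ip_range: str
) -> List[List[str]]:
    """Build (but do not run) gcloud commands for a custom-mode network.

    Hypothetical helper for illustration; the names and CIDR range passed in
    are assumptions, not values used by the kube-up scripts.
    """
    subnet = f"{network}-subnet"  # assumed naming convention
    return [
        # Custom mode: no subnetworks are created automatically.
        ["gcloud", "compute", "networks", "create", network,
         "--subnet-mode=custom"],
        # Exactly one subnetwork, in the region the cluster runs in.
        ["gcloud", "compute", "networks", "subnets", "create", subnet,
         f"--network={network}", f"--region={region}", f"--range={ip_range}"],
    ]


for cmd in custom_network_commands("pr-job-net", "us-central1", "10.40.0.0/22"):
    print(" ".join(cmd))
```

Because only one subnetwork is created per network, the subnetwork count no longer grows when new regions are added.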

Sent kubernetes/kubernetes#52377 for the first part.

cc @bowei

@krzyzacy
Member

Any updates? :-)

@MrHohn
Member Author

MrHohn commented Sep 29, 2017

kubernetes/kubernetes#52377 is still pending review; sending a ping.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Oct 12, 2017
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

[GCE kube-up] Allow creating/deleting custom network

**What this PR does / why we need it**:
From kubernetes/test-infra#4472.

This is the first step toward making PR jobs use a custom network instead of an auto network (so that we will be less likely to hit subnetwork quota issues).

The last commit is purely for testing out the changes on PR jobs. It will be removed after review.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE.

**Special notes for your reviewer**:
/assign @bowei @nicksardo 

**Release note**:

```release-note
NONE
```
@MrHohn
Member Author

MrHohn commented Oct 12, 2017

kubernetes/kubernetes#52377 is finally merged. I'd like to work on the second part (see #4472 (comment)) and start making PR jobs create a custom network.

For rolling out, what about four phases:

  1. One non-blocking PR job.
  2. All non-blocking PR jobs.
  3. One blocking PR job.
  4. All blocking PR jobs.
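
The four phases can be sketched as a small selection function. This is only an illustration under stated assumptions: the job names and blocking flags are placeholders, the phases are assumed to be cumulative, and this is not how the job configs are actually structured.

```python
from typing import Dict, List


def jobs_for_phase(phase: int, blocking: Dict[str, bool]) -> List[str]:
    """Return the jobs that use a custom network once `phase` is rolled out.

    `blocking` maps job name -> whether the job blocks merges. Phases are
    treated as cumulative: 1 = one non-blocking job, 2 = all non-blocking
    jobs, 3 = plus one blocking job, 4 = all jobs.
    """
    if phase not in (1, 2, 3, 4):
        raise ValueError(f"unknown phase: {phase}")
    non_blocking = sorted(j for j, b in blocking.items() if not b)
    blocking_jobs = sorted(j for j, b in blocking.items() if b)
    if phase == 1:
        return non_blocking[:1]
    if phase == 2:
        return non_blocking
    if phase == 3:
        return non_blocking + blocking_jobs[:1]
    return non_blocking + blocking_jobs


# Placeholder job names and flags, for illustration only.
JOBS = {
    "pull-example-gpu": False,
    "pull-example-kops": False,
    "pull-example-e2e": True,
    "pull-example-unit": True,
}

for phase in range(1, 5):
    print(phase, jobs_for_phase(phase, JOBS))
```

Starting with non-blocking jobs limits the damage if the custom-network path breaks, since a failure there does not stop PRs from merging.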

For phase one, I think pull-kubernetes-e2e-gce-gpu may be a good candidate. Any other suggestions?

cc @mindprince

@krzyzacy
Member

Maybe pull-kubernetes-e2e-gke-gpu to start with, since that's manually triggered and would cause less damage if it breaks.

@MrHohn
Member Author

MrHohn commented Oct 12, 2017

Maybe pull-kubernetes-e2e-gke-gpu to start with, since that's manually triggered and would cause less damage if it breaks.

Ack, sent #4988.
