Bug 2096376: [openstack] Remove limitation on single node deployments #5997

Closed

stephenfin
Contributor

There should no longer be any issues running router pods on control plane nodes (i.e. kubernetes/kubernetes#65618 which was resolved in kubernetes/enhancements#1144). Remove this limitation from the docs.

@openshift-ci openshift-ci bot added the bugzilla/severity-low (Referenced Bugzilla bug's severity is low for the branch this PR is targeting.) and bugzilla/invalid-bug (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.) labels Jun 13, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jun 13, 2022

@stephenfin: This pull request references Bugzilla bug 2096376, which is invalid:

  • expected the bug to target the "4.11.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 2096376: Remove limitation on single node deployments

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested review from gryf and mdbooth June 13, 2022 16:34
@stephenfin
Contributor Author

/bugzilla refresh

@openshift-ci openshift-ci bot added the bugzilla/valid-bug (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.) label and removed the bugzilla/invalid-bug (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.) label Jun 13, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jun 13, 2022

@stephenfin: This pull request references Bugzilla bug 2096376, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @eurijon

In response to this:

/bugzilla refresh

@openshift-ci openshift-ci bot requested a review from eurijon June 13, 2022 16:34
@rna-afk
Contributor

rna-afk commented Jun 15, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 15, 2022
Member

@mandre mandre left a comment

/hold

Let's make the UPI job work and exercise this change before merging.

We should at least have:

@openshift-ci openshift-ci bot added the do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) and needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) labels Jun 27, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jun 29, 2022

@stephenfin: PR needs rebase.

@sadasu
Contributor

sadasu commented Sep 22, 2022

@mandre it appears that we can remove the hold?
@stephenfin could you please rebase this PR?

@sadasu
Contributor

sadasu commented Sep 22, 2022

/retitle Bug 2096376: [openstack] Remove limitation on single node deployments

@openshift-ci openshift-ci bot changed the title from "Bug 2096376: Remove limitation on single node deployments" to "Bug 2096376: [openstack] Remove limitation on single node deployments" Sep 22, 2022
@mandre
Member

mandre commented Sep 23, 2022

/hold cancel
We now have a proper UPI job to test the change.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 23, 2022
There should no longer be any issues running router pods on control
plane nodes (i.e. kubernetes/kubernetes#65618
which was resolved in
kubernetes/enhancements#1144). Remove this
limitation from the docs.

Signed-off-by: Stephen Finucane <[email protected]>
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 23, 2022
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 23, 2022
@openshift-ci
Contributor

openshift-ci bot commented Sep 23, 2022

New changes are detected. LGTM label has been removed.

@openshift-ci
Contributor

openshift-ci bot commented Sep 23, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from rna-afk by writing /assign @rna-afk in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sadasu
Contributor

sadasu commented Sep 23, 2022

/test e2e-openstack-upi

@mandre
Member

mandre commented Sep 27, 2022

Not sure why, but we're missing the must-gather for the UPI job, making it difficult to debug. Let's try one more time.
/test e2e-openstack-upi

@mandre
Member

mandre commented Sep 28, 2022

We should now be gathering the logs thanks to openshift/release#32676.
/test e2e-openstack-upi

@mandre
Member

mandre commented Sep 29, 2022

From the logs, it looks like a timeout.

From the authentication cluster operator, one of the pods was not ready:

    Message:               APIServerDeploymentDegraded: 2 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver (container is not ready in apiserver-6cd4bddfc9-x2t67 pod)
                           OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()
                           OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.wm7r4jwq-3409b.shiftstack.devcluster.openshift.com/healthz": dial tcp 10.0.0.7:443: i/o timeout (Client.Timeout exceeded while awaiting headers)
    Reason:                APIServerDeployment_UnavailablePod::OAuthServerDeployment_UnavailablePod::OAuthServerRouteEndpointAccessibleController_SyncError
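
When comparing several timed-out runs, it can help to break a condition message like the one above into its per-controller parts. The following is a rough Python sketch, not part of the PR or of any OpenShift tooling: the helper names `split_degraded_message` and `unavailable_counts` are hypothetical, and the parsing assumes each entry starts with a `...Degraded:` prefix on its own line, as it does in this message.

```python
import re

# Condition message copied from the authentication cluster operator status above.
MESSAGE = "\n".join([
    "APIServerDeploymentDegraded: 2 of 3 requested instances are unavailable "
    "for apiserver.openshift-oauth-apiserver "
    "(container is not ready in apiserver-6cd4bddfc9-x2t67 pod)",
    "OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable "
    "for oauth-openshift.openshift-authentication ()",
    'OAuthServerRouteEndpointAccessibleControllerDegraded: Get '
    '"https://oauth-openshift.apps.wm7r4jwq-3409b.shiftstack.devcluster.openshift.com/healthz": '
    "dial tcp 10.0.0.7:443: i/o timeout "
    "(Client.Timeout exceeded while awaiting headers)",
])


def split_degraded_message(message):
    """Map each '<Name>Degraded' prefix to the detail text that follows it.

    Hypothetical helper: assumes one entry per line, as in the message above.
    """
    entries = {}
    for line in message.splitlines():
        match = re.match(r"^(\w+Degraded): (.*)$", line.strip())
        if match:
            entries[match.group(1)] = match.group(2)
    return entries


def unavailable_counts(entries):
    """Extract (unavailable, requested) pairs from entries that report them."""
    counts = {}
    for name, detail in entries.items():
        found = re.search(
            r"(\d+) of (\d+) requested instances are unavailable", detail
        )
        if found:
            counts[name] = (int(found.group(1)), int(found.group(2)))
    return counts


if __name__ == "__main__":
    parsed = split_degraded_message(MESSAGE)
    for name, (down, total) in unavailable_counts(parsed).items():
        print(f"{name}: {down}/{total} unavailable")
```

Entries without an instance count, such as the route endpoint timeout, are kept by `split_degraded_message` but skipped by `unavailable_counts`.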

/test e2e-openstack-upi

@mandre
Member

mandre commented Sep 30, 2022

Another timeout, but this time the installation got further: only the authentication cluster operator (co) was unavailable. Last time, console was also unavailable and a handful of other cluster operators were still progressing.
This might be related to masters being schedulable, where the backoff causes the system to take longer to stabilize?
Note that in both cases, the ingress co reported degraded.

    Message:               The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
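
For context on the "masters being schedulable" question above: OpenShift exposes this setting as `spec.mastersSchedulable` on the cluster `Scheduler` resource (`config.openshift.io/v1`). Below is a minimal Python sketch of flipping that field on a manifest that has already been loaded into a dict. The manifest contents and the helper name `set_masters_schedulable` are illustrative assumptions, not taken from this PR, and the sketch does not show how the UPI job or the installer docs actually apply the setting.

```python
import copy

# Illustrative Scheduler manifest as a plain dict (assumed values, not from the PR).
scheduler_manifest = {
    "apiVersion": "config.openshift.io/v1",
    "kind": "Scheduler",
    "metadata": {"name": "cluster"},
    "spec": {"mastersSchedulable": True},
}


def set_masters_schedulable(manifest, schedulable):
    """Return a copy of the manifest with spec.mastersSchedulable set.

    Hypothetical helper; the input dict is left untouched so two variants
    (schedulable and unschedulable masters) can be compared side by side.
    """
    updated = copy.deepcopy(manifest)
    updated.setdefault("spec", {})["mastersSchedulable"] = bool(schedulable)
    return updated


if __name__ == "__main__":
    unschedulable = set_masters_schedulable(scheduler_manifest, False)
    print(unschedulable["spec"])
```

Producing both variants this way would be one approach to running the same job with and without schedulable masters and comparing how long the cluster takes to stabilize.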

/test e2e-openstack-upi

@mandre
Member

mandre commented Sep 30, 2022

This time, authentication co is available, but console co is not.
Ingress co is healthy, which indicates the installation would likely have succeeded if we gave this cluster more time.

Before merging, we should understand whether the long time the cluster takes to converge to a healthy state is caused by the patch.

@sadasu
Contributor

sadasu commented Nov 21, 2022

/retest-required

@sadasu
Contributor

sadasu commented Nov 21, 2022

/test e2e-openstack-upi

@openshift-ci
Contributor

openshift-ci bot commented Nov 22, 2022

@stephenfin: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/e2e-gcp-upi | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-gcp-upi
ci/prow/e2e-azure-upi | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-azure-upi
ci/prow/e2e-aws-upi | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-aws-upi
ci/prow/e2e-aws | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-aws
ci/prow/e2e-azure | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-azure
ci/prow/e2e-gcp | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-gcp
ci/prow/e2e-vsphere | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-vsphere
ci/prow/e2e-aws-ovn | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-aws-ovn
ci/prow/e2e-azure-ovn | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-azure-ovn
ci/prow/e2e-vsphere-ovn | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-vsphere-ovn
ci/prow/e2e-gcp-ovn | 93f704631d139f6aa07d842a35552961957db0a4 | link | true | /test e2e-gcp-ovn
ci/prow/e2e-openstack-proxy | 6f1add7 | link | false | /test e2e-openstack-proxy
ci/prow/okd-e2e-gcp-ovn-upgrade | 6f1add7 | link | false | /test okd-e2e-gcp-ovn-upgrade
ci/prow/e2e-openstack-parallel | 6f1add7 | link | false | /test e2e-openstack-parallel
ci/prow/okd-scos-unit | 6f1add7 | link | true | /test okd-scos-unit
ci/prow/okd-scos-verify-codegen | 6f1add7 | link | true | /test okd-scos-verify-codegen
ci/prow/e2e-openstack-upi | 6f1add7 | link | false | /test e2e-openstack-upi
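
To see which of the failed jobs are mandatory without rerunning everything, the rows can be filtered on the Required column. This is only a sketch in Python: it assumes the rows have been captured as whitespace-separated strings in the column order shown above (a few are copied here as samples), and the helper names `parse_row` and `required_reruns` are hypothetical rather than part of Prow.

```python
# A few rows from the table above, captured as whitespace-separated strings.
SAMPLE_ROWS = [
    "ci/prow/e2e-aws 93f704631d139f6aa07d842a35552961957db0a4 link true /test e2e-aws",
    "ci/prow/okd-scos-unit 6f1add7 link true /test okd-scos-unit",
    "ci/prow/e2e-openstack-proxy 6f1add7 link false /test e2e-openstack-proxy",
    "ci/prow/e2e-openstack-upi 6f1add7 link false /test e2e-openstack-upi",
]


def parse_row(row):
    """Split one row into its five columns.

    Column order (test name, commit, details, required, rerun command) is
    taken from the table header; the rerun command may contain spaces.
    """
    name, commit, _details, required, *rerun = row.split()
    return {
        "name": name,
        "commit": commit,
        "required": required == "true",
        "rerun": " ".join(rerun),
    }


def required_reruns(rows):
    """Return the rerun commands for rows marked as required."""
    return [parsed["rerun"] for parsed in map(parse_row, rows) if parsed["required"]]


if __name__ == "__main__":
    for command in required_reruns(SAMPLE_ROWS):
        print(command)
```

Note that `e2e-openstack-upi`, the job exercised throughout this conversation, is listed as not required, so it does not appear in the filtered result.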

Full PR test history. Your PR dashboard.

@mandre
Member

mandre commented Nov 24, 2022

/hold
We don't have conclusive evidence that removing the step works.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 24, 2022
@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 14, 2023
@mdbooth
Contributor

mdbooth commented Apr 17, 2023

@stephenfin is this still a thing?

@stephenfin
Contributor Author

Probably not. I've lost the context here.

@stephenfin stephenfin closed this Apr 21, 2023
@stephenfin stephenfin deleted the bug/2096376 branch April 21, 2023 10:40
@openshift-ci
Contributor

openshift-ci bot commented Apr 21, 2023

@stephenfin: This pull request references Bugzilla bug 2096376. The bug has been updated to no longer refer to the pull request using the external bug tracker.

In response to this:

Bug 2096376: [openstack] Remove limitation on single node deployments

Labels
  • bugzilla/severity-low: Referenced Bugzilla bug's severity is low for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.

7 participants