Columbia GKE VMs can't have external IPs. Use NAT instead? #29

Closed
cisaacstern opened this issue Jan 28, 2022 · 3 comments · Fixed by #30

@cisaacstern

In working on #19, I ran into the following error upon running make deploy (pretty sure it's triggered by this line):

│ Error: Error waiting for creating GKE cluster: 
│       (1) Not all instances running in IGM after 27.91542561s. Expected 1, running 0, transitioning 1. Current errors: [CONDITION_NOT_MET]: Instance 'gke-pfcsb-cluster-default-pool-83972429-mhbp' creation failed: Constraint constraints/compute.vmExternalIpAccess violated for project 13667658525. Add instance projects/pangeo-forge-4967/zones/us-central1-f/instances/gke-pfcsb-cluster-default-pool-83972429-mhbp to the constraint to use external IP with it
│       (2) Not all instances running in IGM after 29.8823857s. Expected 1, running 0, transitioning 1. Current errors: [CONDITION_NOT_MET]: Instance 'gke-pfcsb-cluster-default-pool-5f0b4f10-v4xk' creation failed: Constraint constraints/compute.vmExternalIpAccess violated for project 13667658525. Add instance projects/pangeo-forge-4967/zones/us-central1-a/instances/gke-pfcsb-cluster-default-pool-5f0b4f10-v4xk to the constraint to use external IP with it
│       (3) Not all instances running in IGM after 33.618018184s. Expected 1, running 0, transitioning 1. Current errors: [CONDITION_NOT_MET]: Instance 'gke-pfcsb-cluster-default-pool-ce7e23ec-87w8' creation failed: Constraint constraints/compute.vmExternalIpAccess violated for project 13667658525. Add instance projects/pangeo-forge-4967/zones/us-central1-b/instances/gke-pfcsb-cluster-default-pool-ce7e23ec-87w8 to the constraint to use external IP with it.
│
│   with google_container_cluster.primary,
│   on cluster.tf line 9, in resource "google_container_cluster" "primary":
│    9: resource "google_container_cluster" "primary" {

I'll confess that I initially thought (and by initially I mean for the past 3 hours or so 🤦) that this had something to do with my laptop's IP address, and tried various ill-conceived workarounds for that: running make deploy from GCP's in-browser shell, using GCP's graphical in-browser GKE cluster creation tool, and retrying locally with Columbia's VPN enabled. All methods produced the same error.

I now see (thanks, reddit) that this is a matter of the networking config for the VMs themselves, not the IP address from which their creation is requested. I have checked with @rabernat, who reports that changing constraints/compute.vmExternalIpAccess settings will be a non-starter for Columbia. Therefore, I believe we'll need an alternative wherein VMs do not use external IPs.

@tracetechnical @sharkinsspatial am I correct in assuming that network address translation (NAT) is the way forward with this? If so, what are next steps for re-configuring our terraform to use this approach? If not, what other options do we have?

@rabernat

rabernat commented Jan 29, 2022

This is a problem that 2i2c had to deal with when setting up a JupyterHub for a different project. It was a real pain to work around. Pinging @yuvipanda and @sgibson91 for any tips they might have. Maybe there is an easy solution.

What they will probably tell us is that we should try to liberate our project from Columbia's built-in constraints. I will try to escalate the issue with the university to see if they will turn off these constraints, but I am not optimistic.

This really highlights the challenges of trying to build cloud stuff quickly under the umbrella of the university.

@sgibson91

This is the PR I made to enable private nodes and use Cloud NAT for Pangeo's JupyterHub: 2i2c-org/infrastructure#538. Hope it's helpful!
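
For readers landing here later, a minimal Terraform sketch of the general "private nodes plus Cloud NAT" pattern follows. It is not the contents of the linked PR and has not been run against this project. The cluster name and region are inferred from the instance names and zones in the error output above; the router and NAT names, the `default` network, the node count, and the control-plane CIDR are all assumptions for illustration.

```hcl
# Sketch only. Values marked "assumption" or "hypothetical" are not from this thread.

# A Cloud Router is required to host the NAT gateway.
resource "google_compute_router" "nat_router" {
  name    = "pfcsb-nat-router" # hypothetical name
  region  = "us-central1"      # inferred from the zones in the error output
  network = "default"          # assumption: the cluster uses the default VPC
}

# Cloud NAT gives nodes without external IPs a path for outbound traffic
# (for example, pulling container images).
resource "google_compute_router_nat" "nat" {
  name                               = "pfcsb-nat" # hypothetical name
  router                             = google_compute_router.nat_router.name
  region                             = google_compute_router.nat_router.region
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}

resource "google_container_cluster" "primary" {
  name               = "pfcsb-cluster" # inferred from the instance names above
  location           = "us-central1"   # inferred from the zones above
  network            = "default"       # assumption
  initial_node_count = 1               # assumption

  # Private clusters must be VPC-native; an empty block lets GKE pick ranges.
  ip_allocation_policy {}

  private_cluster_config {
    enable_private_nodes    = true            # nodes get internal IPs only
    enable_private_endpoint = false           # keep the public control-plane endpoint
    master_ipv4_cidr_block  = "172.16.0.0/28" # assumption: any unused /28 range
  }
}
```

Because the nodes no longer request external IPs, instance creation should no longer trip constraints/compute.vmExternalIpAccess. Leaving `enable_private_endpoint` set to `false` is intended to keep the cluster API reachable from outside the VPC, so a deploy run from a laptop would presumably still work.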

@cisaacstern

@sgibson91 thank you so much for sharing this. This looks like exactly what we need. Trying it out now and will report back with the results! 🙌 Open source collaboration ftw
