Configure Cassandra nodes to lookup their IP address by their fully qualified domain name #334

Conversation
When a pod is deleted, in 99% of cases it should have a new IP address. Can we make a test that deletes the pod and verifies the IP has changed? (Perhaps we delete the pod again if it hasn't changed the first time?) We definitely need an e2e test for this - ideally the test would be labelled "Resiliency" too, as the first of a suite of resiliency tests.
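A minimal sketch of such a resiliency check, assuming a client-go clientset and a Cassandra pod that is recreated under the same name by its StatefulSet; the namespace, timeouts, and error handling here are illustrative and not taken from this PR:

    package resiliency

    import (
        "context"
        "fmt"
        "time"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // VerifyPodIPChangesOnDelete deletes the named pod and waits for the
    // replacement pod to appear under the same name with a different UID and IP.
    func VerifyPodIPChangesOnDelete(ctx context.Context, cs kubernetes.Interface, namespace, podName string) error {
        before, err := cs.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
        if err != nil {
            return err
        }
        oldIP, oldUID := before.Status.PodIP, before.UID

        if err := cs.CoreV1().Pods(namespace).Delete(ctx, podName, metav1.DeleteOptions{}); err != nil {
            return err
        }

        deadline := time.Now().Add(5 * time.Minute)
        for time.Now().Before(deadline) {
            after, err := cs.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
            if err == nil && after.UID != oldUID && after.Status.PodIP != "" {
                if after.Status.PodIP == oldIP {
                    // The rare case mentioned above: the new pod got the same IP.
                    // A real test could delete the pod again here and retry.
                    return fmt.Errorf("replacement pod %q was assigned the same IP %q", podName, oldIP)
                }
                return nil
            }
            time.Sleep(5 * time.Second)
        }
        return fmt.Errorf("timed out waiting for pod %q to be recreated with a new IP", podName)
    }

The wrapping e2e test could then carry the suggested "Resiliency" label so later resiliency tests join the same suite.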
pkg/cassandra/nodetool/nodetool.go
Outdated
leavingNodes, joiningNodes, movingNodes, mappedNodes,
)
}
I'm confused - isn't this meant to be a part of another PR?
If yes and this is just stacked, it'd be great if we can stop stacking PRs too much, as it makes it far harder to know which to review first. If we keep them separate, PRs that depend on [some other] PR will fail and/or we can add /hold and comment to inform a reviewer that "this PR depends on PR xyz being merged first".
{
    Name: "CASSANDRA_RPC_ADDRESS",
    Value: " ",
},
Do we use the Cassandra RPC interface, and do we want to listen on the default pod IP for RPC connections? It seems to me like we'd only want to listen on 127.0.0.1?
We don't use it, but I'm not sure whether Cassandra nodes talk to each other with this protocol.
The default is to listen on all addresses, so setting it here may not be ideal, but I can improve or remove it later.
👍
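For reference, if RPC only ever needs to be reachable from inside the pod, the loopback-only option discussed above could look like the sketch below. This assumes the image entrypoint copies CASSANDRA_RPC_ADDRESS straight into rpc_address; it is not the change made in this PR.

    package cassandra

    import corev1 "k8s.io/api/core/v1"

    // rpcLoopbackEnv is an illustrative EnvVar that would bind Cassandra's RPC
    // interface to loopback only, rather than the image default of all addresses.
    var rpcLoopbackEnv = corev1.EnvVar{
        Name:  "CASSANDRA_RPC_ADDRESS",
        Value: "127.0.0.1",
    }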
// Set a non-existent default seed.
// The Kubernetes Seed Provider will fall back to a default seed host if it can't look up seeds via the CASSANDRA_SERVICE.
// And if the CASSANDRA_SEEDS environment variable is not set, it defaults to localhost.
// Which could cause confusion if a non-seed node is temporarily unable to lookup the seed nodes from the service.
What would happen if we were to set this to localhost instead? Would the node fail to boot, or would it cause a split brain of some sort?
I think it would mean that new nodes might consider themselves seeds in the event that the KubernetesSeedProvider plugin encounters an error.
So yeah, a split brain situation I think.
// https://github.com/kubernetes/examples/blob/cabf8b8e4739e576837111e156763d19a64a3591/cassandra/go/main.go#L51
{
    Name: "CASSANDRA_SEEDS",
    Value: "black-hole-dns-name",
I'm slightly concerned this could become an attack vector. What if someone creates a service named black-hole-dns-name? Could an attacker gain access to the contents of the Cassandra DB as a result?
If possible, I'd rather set this to a value that causes Cassandra to fail hard/crash (as this value should only be referenced in the event of a DNS resolution failure)
I think the answer is to remove the fallback code from the Kubernetes Seed Provider.
If it can't look up the seeds from the Kubernetes API, it should log an error and return an empty list,
rather than returning a default.
I'll do this in a follow up branch, where I remove the reliance on the Docker image entrypoint and instead create a Navigator default config.
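A minimal sketch of that proposed behaviour, in the spirit of the Go seed lookup linked from the diff: on any failure, log and return an empty seed list instead of a default. The real KubernetesSeedProvider resolves seeds differently (via the Kubernetes API rather than plain DNS), so the CASSANDRA_SERVICE handling and function name here are illustrative only.

    package seeds

    import (
        "log"
        "net"
        "os"
    )

    // lookupSeeds returns the seed addresses resolved from the seed Service, or an
    // empty list if the lookup fails. There is deliberately no fallback value: an
    // empty seed list is safer than a node silently electing itself a seed.
    func lookupSeeds() []string {
        service := os.Getenv("CASSANDRA_SERVICE")
        if service == "" {
            log.Println("CASSANDRA_SERVICE is not set; returning no seeds")
            return nil
        }
        ips, err := net.LookupHost(service)
        if err != nil {
            log.Printf("failed to look up seeds via %q: %v; returning no seeds", service, err)
            return nil
        }
        return ips
    }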
…ualified domain name
* Also ensure that they only ever use the Kubernetes seed provider service
* And ensure that that service publishes the seed IP addresses as soon as they are known,
* so that the seeds themselves can find themselves when they are starting up.
Fixes: jetstack#319
Force-pushed from 726cd02 to 310bc07
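The squashed commit message above also says the seed service should publish the seed IP addresses as soon as they are known, so that starting seeds can find themselves. A minimal sketch of a headless Service doing that via PublishNotReadyAddresses; whether this PR uses that field or the older tolerate-unready-endpoints annotation is not shown here, and the name and labels are placeholders.

    package cassandra

    import (
        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // seedService is a headless Service whose endpoints are published even while
    // the seed pods are not yet Ready, so a booting seed can resolve its own address.
    var seedService = &corev1.Service{
        ObjectMeta: metav1.ObjectMeta{Name: "cassandra-seeds"},
        Spec: corev1.ServiceSpec{
            ClusterIP:                corev1.ClusterIPNone, // headless: DNS returns pod IPs directly
            Selector:                 map[string]string{"app": "cassandra"},
            PublishNotReadyAddresses: true,
        },
    }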
/retest
I rebased and answered your comments @munnerz
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: munnerz

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
This is a cut-down version of #330.
Fixes: #319
Release note: