Set Cassandra connect timeout, not just regular timeout #12903

ncabatoff · 2021-10-22T14:17:07Z

It looks like Cassandra sessions have both a "connect timeout" that governs timeouts when initiating connections, and a regular timeout used post-initialization. Our connection producer also has a "connect timeout", but only uses it for setting the latter. I think we should use it for both timeouts.

I think this will help our recurring test failures: from what I can see in the Cassandra code, if you don't set a connect timeout explicitly, it will use 600ms as the default, even if you have a regular timeout configured. Consider this set of test runs from https://app.circleci.com/pipelines/github/hashicorp/vault/22034/workflows/f814993d-7326-4ab5-8ced-a088ab2087f2/jobs/313877/parallel-runs/2

PASS plugins/database/cassandra.TestSelfSignedCA/pem_json/bad_ca (0.13s)
PASS plugins/database/cassandra.TestSelfSignedCA/pem_json/missing_ca (0.01s)
PASS plugins/database/cassandra.TestSelfSignedCA/no_cert_data/tls=true (0.01s)
=== RUN   TestSelfSignedCA/no_cert_data/insecure_tls
2021/10/19 20:41:20 gocql: unable to dial control conn 172.18.0.3:9042: gocql: no response to connection startup within timeout
    connection_producer_test.go:170: no error expected, got: failed to initialize: error verifying connection: error creating session: gocql: unable to create session: control: unable to connect to initial hosts: gocql: no response to connection startup within timeout
    --- FAIL: TestSelfSignedCA/no_cert_data/insecure_tls (0.76s)
FAIL plugins/database/cassandra.TestSelfSignedCA/no_cert_data/insecure_tls (0.76s)
=== RUN   TestSelfSignedCA/pem_json/ca_only
2021/10/19 20:41:21 gocql: unable to dial control conn 172.18.0.3:9042: gocql: no response to connection startup within timeout
    connection_producer_test.go:170: no error expected, got: failed to initialize: error verifying connection: error creating session: gocql: unable to create session: control: unable to connect to initial hosts: gocql: no response to connection startup within timeout
    --- FAIL: TestSelfSignedCA/pem_json/ca_only (0.77s)
FAIL plugins/database/cassandra.TestSelfSignedCA/pem_json/ca_only (0.77s)
PASS plugins/database/cassandra.TestSelfSignedCA/pem_bundle/ca_only (2.12s)

These tests aren't using t.Parallel, so I believe this is all happening in sequence, all against the same Cassandra docker instance. We have failures bracketed by successes, suggesting that the container came up fine, we just weren't always able to connect to it within 600ms.

…overns timeouts when initiating connections, and a regular timeout used post-initialization. Our connection producer also has a "connect timeout", but only uses it for setting the latter. I think we should use it for both timeouts.

benashz · 2021-10-22T14:26:27Z

plugins/database/cassandra/connection_producer.go

@@ -189,6 +189,7 @@ func (c *cassandraConnectionProducer) createSession(ctx context.Context) (*gocql
 	}

 	clusterConfig.Timeout = c.connectTimeout
+	clusterConfig.ConnectTimeout = c.connectTimeout


Seems reasonable to me. I see that ClusterConfig.Timeout and ClusterConfig.ConnectTimeout are set to the same value in NewCluster().

I suppose we could easily decouple this in the future.

ncabatoff requested a review from a team October 22, 2021 14:17

benashz reviewed Oct 22, 2021

View reviewed changes

benashz approved these changes Oct 22, 2021

View reviewed changes

Add CL and update docs.

66e6b67

ncabatoff requested a review from taoism4504 as a code owner October 22, 2021 14:34

vercel bot deployed to Preview – vault October 22, 2021 14:34 View deployment

vercel bot temporarily deployed to Preview – vault-storybook October 22, 2021 14:34 Inactive

ncabatoff merged commit 8f7dafe into main Oct 22, 2021

ncabatoff deleted the fix-cassandra-test-timeouts branch October 22, 2021 15:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set Cassandra connect timeout, not just regular timeout #12903

Set Cassandra connect timeout, not just regular timeout #12903

ncabatoff commented Oct 22, 2021 •

edited

Loading

benashz Oct 22, 2021

Set Cassandra connect timeout, not just regular timeout #12903

Set Cassandra connect timeout, not just regular timeout #12903

Conversation

ncabatoff commented Oct 22, 2021 • edited Loading

benashz Oct 22, 2021

Choose a reason for hiding this comment

ncabatoff commented Oct 22, 2021 •

edited

Loading