From 33aaf6d78cfa9743ed7848f6b71f1a4bb25b0afb Mon Sep 17 00:00:00 2001
From: David Turner <david.turner@elastic.co>
Date: Wed, 20 Mar 2019 15:49:50 +0000
Subject: [PATCH] Document the reduction in fault detection timeouts (#40200)

The new cluster coordination subsystem introduced in 7.0 will only keep an
unresponsive node in the cluster for 30 seconds, whereas in earlier versions it
might have remained in the cluster for 90 seconds. This commit adds a note to
the migration documentation to that effect.
---
 .../reference/migration/migrate_7_0/discovery.asciidoc | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/reference/migration/migrate_7_0/discovery.asciidoc b/docs/reference/migration/migrate_7_0/discovery.asciidoc
index 56449625246cd..a187081117e2a 100644
--- a/docs/reference/migration/migrate_7_0/discovery.asciidoc
+++ b/docs/reference/migration/migrate_7_0/discovery.asciidoc
@@ -46,3 +46,13 @@ The `discovery.zen.no_master_block` setting is now known as
 `cluster.no_master_block`. Any value set for `discovery.zen.no_master_block` is
 now ignored. You should remove this setting and, if needed, set
 `cluster.no_master_block` appropriately after the upgrade.
+
+[float]
+==== Reduced default timeouts for fault detection
+
+By default the <<cluster-fault-detection,cluster fault detection>> subsystem
+now considers a node to be faulty if it fails to respond to 3 consecutive
+pings, each of which times out after 10 seconds. Thus a node that is
+unresponsive for longer than 30 seconds is liable to be removed from the
+cluster. Previously the default timeout for each ping was 30 seconds, so that
+an unresponsive node might be kept in the cluster for over 90 seconds.
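
To show where the 30-second figure in the added note comes from, the new defaults can be written out as explicit settings. This is only a sketch. The setting names below do not appear in this patch; they are assumed to match the `cluster.fault_detection.*` settings in the 7.0 discovery settings reference and should be checked against the version in use.

[source,yaml]
----
# elasticsearch.yml (illustrative only; these values restate the 7.0 defaults)
# Assumption: setting names are taken from the 7.0 reference, not from this patch.

# Checks sent by the elected master to each other node
cluster.fault_detection.follower_check.timeout: 10s     # each check times out after 10 seconds
cluster.fault_detection.follower_check.retry_count: 3   # 3 consecutive failures mark the node faulty

# Checks sent by each node to the elected master
cluster.fault_detection.leader_check.timeout: 10s
cluster.fault_detection.leader_check.retry_count: 3
----

With these values, a completely unresponsive node fails 3 checks of 10 seconds each, giving the roughly 30 seconds quoted in the note. The earlier default of 30 seconds per ping with the same 3 retries gives the 90 seconds quoted for previous versions.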