Add minimal docs around upgrading clusters with ccr enabled #38037
Pinging @elastic/es-distributed
This test starts two clusters, each with three nodes. First the leader cluster is started and tests are run against it; then the follower cluster is started and tests are run against both clusters. Then the follower cluster is upgraded, one node at a time. After that the leader cluster is upgraded, one node at a time. Every time a node is upgraded, tests are run while both clusters are online (and either the leader cluster or the follower cluster has mixed node versions). This commit only tests CCR index following, but could be used for CCS tests as well. In particular for CCR, unidirectional index following is tested during a rolling upgrade. During the test several indices are created in the leader cluster and followed before or while the follower cluster is being upgraded. This test also verifies that attempting to follow an index in the upgraded cluster from the non-upgraded cluster fails. After both clusters are upgraded, following the index that previously failed should succeed. Relates to elastic#37231 and elastic#38037
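The upgrade order described in this commit message can be sketched roughly as follows. This is a minimal Python illustration, not the actual test code: the names `rolling_upgrade_steps` and `run_upgrade`, and the `upgrade_node` and `run_tests` callbacks, are hypothetical and exist only for this example.

```python
from typing import Callable, Iterator, List, Tuple


def rolling_upgrade_steps(
    follower_nodes: List[str], leader_nodes: List[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (cluster, node) pairs in the order described above."""
    # The follower cluster is upgraded first, one node at a time.
    for node in follower_nodes:
        yield ("follower", node)
    # Only then is the leader cluster upgraded, again one node at a time.
    for node in leader_nodes:
        yield ("leader", node)


def run_upgrade(
    follower_nodes: List[str],
    leader_nodes: List[str],
    upgrade_node: Callable[[str, str], None],
    run_tests: Callable[[], None],
) -> None:
    """Upgrade every node in order, running the tests after each step."""
    for cluster, node in rolling_upgrade_steps(follower_nodes, leader_nodes):
        upgrade_node(cluster, node)  # hypothetical callback: upgrade one node
        run_tests()  # hypothetical callback: both clusters are online here
```

With three nodes per cluster this yields six upgrade steps, and the tests run once after each of them.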
* Add rolling upgrade multi cluster test module (#38277) This test starts two clusters, each with three nodes. First the leader cluster is started and tests are run against it; then the follower cluster is started and tests are run against both clusters. Then the follower cluster is upgraded, one node at a time. After that the leader cluster is upgraded, one node at a time. Every time a node is upgraded, tests are run while both clusters are online (and either the leader cluster or the follower cluster has mixed node versions). This commit only tests CCR index following, but could be used for CCS tests as well. In particular for CCR, unidirectional index following is tested during a rolling upgrade. During the test several indices are created in the leader cluster and followed before or while the follower cluster is being upgraded. This test also verifies that attempting to follow an index in the upgraded cluster from the non-upgraded cluster fails. After both clusters are upgraded, following the index that previously failed should succeed. Relates to #37231 and #38037
* Filter out upgraded version index settings when starting index following (#38838) The `index.version.upgraded` and `index.version.upgraded_string` settings are likely to differ between the leader and follower index when a follower index is restored on an upgraded node while the leader index is still on non-upgraded nodes. Closes #38835
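The settings filter mentioned in the second commit can be illustrated with a small sketch. This is Python for illustration only and is not the Elasticsearch implementation; the function name `filter_leader_settings` and the sample settings are assumptions. Only the two setting names come from the commit message.

```python
from typing import Dict

# The two settings named in the commit message as likely to differ
# between the leader and the follower index.
UPGRADED_VERSION_SETTINGS = frozenset(
    {"index.version.upgraded", "index.version.upgraded_string"}
)


def filter_leader_settings(leader_settings: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the leader settings without the upgraded-version keys.

    Hypothetical helper: it shows the idea of dropping these settings before
    they are compared with, or copied to, the follower index.
    """
    return {
        key: value
        for key, value in leader_settings.items()
        if key not in UPGRADED_VERSION_SETTINGS
    }
```

Every other setting passes through unchanged, and the input dictionary is not modified.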
A follower index in the follower cluster follows an index in the leader cluster, and another follower index in the leader cluster follows that index in the follower cluster. Index following is paused during the upgrade and resumed after the upgrade, and the test then verifies that index following works as expected. Relates to elastic#38037
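The pause, upgrade, resume sequence for the bi-directional case can be sketched as an ordered plan. This is a hedged Python illustration that only builds a list of steps and sends no requests. The helper name `bidirectional_upgrade_plan`, the cluster names, and the index names are hypothetical; the `_ccr/pause_follow` and `_ccr/resume_follow` paths are those of the pause follower and resume follower APIs.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, str]  # (cluster name, action to perform on that cluster)


def bidirectional_upgrade_plan(follower_indices: Dict[str, List[str]]) -> List[Step]:
    """Build the ordered steps for upgrading clusters that follow each other.

    follower_indices maps a cluster name to the follower indices it hosts.
    """
    clusters = sorted(follower_indices)
    plan: List[Step] = []
    # 1. Pause all index following before any cluster is upgraded.
    for cluster in clusters:
        for index in follower_indices[cluster]:
            plan.append((cluster, f"POST /{index}/_ccr/pause_follow"))
    # 2. Upgrade each cluster while following is paused.
    for cluster in clusters:
        plan.append((cluster, "rolling upgrade"))
    # 3. Resume index following once every cluster has been upgraded.
    for cluster in clusters:
        for index in follower_indices[cluster]:
            plan.append((cluster, f"POST /{index}/_ccr/resume_follow"))
    return plan
```

For example, `bidirectional_upgrade_plan({"a": ["copy-of-b"], "b": ["copy-of-a"]})` lists both pauses first, then both upgrades, then both resumes.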
Thanks for writing this up, and sorry for the delay in reviewing @martijnvg. I left some comments.
docs/reference/ccr/overview.asciidoc
Outdated
finally the first cluster.

[float]
==== Bidirectional index following
s/Bidirectional/Bi-directional
docs/reference/ccr/overview.asciidoc
Outdated
[float]
==== Bidirectional index following

In this kind of setup clusters index following happens in multiple clusters and
In a bi-directional setup between two clusters, each cluster contains both leader and follower indices.
docs/reference/ccr/overview.asciidoc
Outdated
In this kind of setup clusters index following happens in multiple clusters and
so each cluster contains both leader and follower indices.

When upgrading clusters in this setup, index following needs to be paused prior
s/index following/all index following
Perhaps link to the pause API?
docs/reference/ccr/overview.asciidoc
Outdated
so each cluster contains both leader and follower indices.

When upgrading clusters in this setup, index following needs to be paused prior
to upgrading all the clusters. After all clusters have been upgraded then
s/all the/both
s/After all/After both
docs/reference/ccr/overview.asciidoc
Outdated
When upgrading clusters in this setup, index following needs to be paused prior
to upgrading all the clusters. After all clusters have been upgraded then
index following can be resumed. Pausing index following is required, otherwise
Perhaps link to the resume API?
I think that the last sentence can be dropped.
docs/reference/ccr/overview.asciidoc
Outdated
Otherwise index following may fail during a rolling upgrade, because of the following reasons:

* If a new index setting or mapping type is replicated from an upgraded cluster
to a not upgraded cluster then the upgraded cluster will reject that and will
s/not upgraded/non-upgraded
s/the upgraded cluster/the non-upgraded cluster
docs/reference/ccr/overview.asciidoc
Outdated
to a not upgraded cluster then the upgraded cluster will reject that and will
fail index following.
* Lucene is not forwards compatible and when index following is falling back to
file based recovery then a node in a not upgraded cluster will reject index files
s/not upgraded/non-upgraded
docs/reference/ccr/overview.asciidoc
Outdated
file based recovery then a node in a not upgraded cluster will reject index files
from a newer Lucene version compared to what it is using.

Rolling upgrading clusters with CCR is different in case of unidirectional index following and
s/unidirectional/uni-directional
docs/reference/ccr/overview.asciidoc
Outdated
from a newer Lucene version compared to what it is using.

Rolling upgrading clusters with CCR is different in case of unidirectional index following and
bidirectional index following.
s/bidirectional/bi-directional
docs/reference/ccr/overview.asciidoc
Outdated
bidirectional index following.

[float]
==== Unidirectional index following
s/Unidirectional/Uni-directional
Thanks for reviewing @jasontedor. I've made the smaller changes in the first commit and moved the upgrade docs into a separate page in the second commit.
I left one more comment.
For example if there is a first cluster that contains all leader indices,
a second cluster that follows indices in the first cluster and a third
cluster that follows indices in the second clusters. In this case the
third cluster should be upgraded first, then the second cluster and
Maybe we should avoid referring to the clusters as "first", "second", and "third" only to turn around and say "third cluster [...] first", that's confusing. How about A, B, and C?
pushed: a91bbda
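The ordering rule discussed in this thread (upgrade a cluster only after every cluster that follows it has been upgraded) can be sketched as a small dependency ordering. This is an illustrative Python sketch under that assumption, not part of the PR; the function name `upgrade_order` and the shape of the `follows` mapping are hypothetical.

```python
from typing import Dict, List, Set


def upgrade_order(follows: Dict[str, Set[str]]) -> List[str]:
    """Return clusters in an order where followers come before their leaders.

    follows maps a cluster to the set of clusters whose indices it follows.
    """
    clusters: Set[str] = set(follows)
    for leaders in follows.values():
        clusters |= leaders
    # pending[x] holds the clusters that follow x and are not yet upgraded.
    pending: Dict[str, Set[str]] = {cluster: set() for cluster in clusters}
    for follower, leaders in follows.items():
        for leader in leaders:
            pending[leader].add(follower)
    order: List[str] = []
    while pending:
        ready = sorted(c for c, waiting in pending.items() if not waiting)
        if not ready:
            # Clusters follow each other; ordering alone cannot help here.
            raise ValueError("cyclic following: pause index following instead")
        for cluster in ready:
            order.append(cluster)
            del pending[cluster]
        for waiting in pending.values():
            waiting.difference_update(ready)
    return order
```

With B following A and C following B, `upgrade_order({"B": {"A"}, "C": {"B"}})` returns `["C", "B", "A"]`. A bi-directional setup raises `ValueError`, which corresponds to the case where following has to be paused before the upgrade.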
Left two more comments, sorry I should have noticed this on the first round.
[[ccr-upgrading]]
== Upgrading clusters

In case of upgrading clusters, clusters that are actively using CCR, require a
Sorry, I should have noticed this before. Let's use {ccr}. Let's reword this first sentence to: Clusters that are actively using {ccr} require a careful approach to upgrades.
pushed: e7c6f01
file based recovery then a node in a non-upgraded cluster will reject index
files from a newer Lucene version compared to what it is using.

Rolling upgrading clusters with CCR is different in case of uni-directional
Same here, {ccr}.
pushed: e7c6f01
LGTM. Left a minor nit, no need for another round.
using the {ref}/ccr-post-pause-follow.html[pause follower API] prior to
upgrading both clusters. After both clusters have been upgraded then index
following can be resumed using the
{ref}/ccr-post-resume-follow.html[resume follower API]].
Can you end with a newline here?
Thanks @martijnvg!