Add rolling upgrade multi cluster test module #38277
This test starts 2 clusters, each with 3 nodes. First the leader cluster is started and tests are run against it, and then the follower cluster is started and tests are run against both clusters.

Then the follower cluster is upgraded, one node at a time. After that the leader cluster is upgraded, one node at a time. Every time a node is upgraded, tests are run while both clusters are online, with either the leader cluster or the follower cluster having mixed node versions.

This commit only tests CCR index following, but could be used for CCS tests as well. In particular for CCR, unidirectional index following is tested during a rolling upgrade. During the test several indices are created in the leader cluster and followed before or while the follower cluster is being upgraded.

This test also verifies that attempting to follow an index in the upgraded cluster from the cluster that has not yet been upgraded fails. After both clusters are upgraded, following the index that previously failed should succeed.

Relates to elastic#37231 and elastic#38037
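The upgrade order described above can be sketched as a small, standalone Java program. This is only an illustration of the sequence in the description: the class, enum, and method names are hypothetical and are not taken from the test module itself.

```java
import java.util.ArrayList;
import java.util.List;

public class RollingUpgradeOrder {
    // Hypothetical names for illustration; the real test module may model this differently.
    public enum Cluster { LEADER, FOLLOWER }

    /** Returns the test phases in the order the PR description gives them. */
    public static List<String> steps(int nodesPerCluster) {
        List<String> steps = new ArrayList<>();
        steps.add("start LEADER (old version), run tests");
        steps.add("start FOLLOWER (old version), run tests");
        // The follower cluster is upgraded first, then the leader cluster,
        // one node at a time, with tests run after every node upgrade.
        for (Cluster cluster : new Cluster[] { Cluster.FOLLOWER, Cluster.LEADER }) {
            for (int node = 1; node <= nodesPerCluster; node++) {
                steps.add("upgrade " + cluster + " node " + node + "/" + nodesPerCluster + ", run tests");
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        steps(3).forEach(System.out::println);
    }
}
```

With 3 nodes per cluster this yields two start-up phases followed by six upgrade phases, and at every upgrade phase exactly one of the two clusters is in a mixed-version state or has just finished upgrading.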
Pinging @elastic/es-distributed
Pinging @elastic/es-core-infra
This PR is the first step towards properly testing CCR during a rolling upgrade. Currently there is only a test that verifies that unidirectional index following works while doing a rolling upgrade. As a follow-up, auto-follow patterns and bidirectional index following should also be tested during a rolling upgrade (after we have decided how bidirectional index following should work during a rolling upgrade).
Thanks for these tests @martijnvg. I've left some smaller comments, looking very good already.
...ti-cluster/src/test/java/org/elasticsearch/upgrades/AbstractMultiClusterUpgradeTestCase.java
...ling-upgrade-multi-cluster/src/test/java/org/elasticsearch/upgrades/CcrRollingUpgradeIT.java
// At this point all nodes in both clusters have been updated and
// the leader cluster can now follow leader_index4 in the follower cluster:
followIndex(leaderClient(), "follower", "leader_index4", "follower_index4");
assertBusy(() -> verifyTotalHitCount("follower_index4", 64, leaderClient()));
perhaps increase timeout on the assertBusy (same for the other ones in this class).
Do you think the timeouts need to be increased because CI workers may be slow?
Locally I have not seen this fail for lack of time to replicate the documents from the leader to the follower index.
Yes, I was worrying about slower CI workers. The assertBusy also waits on timed events e.g. on the internal auto-refresh on the follower index. Perhaps we can do an explicit refresh to speed things up?
👍 yes, that makes sense.
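To make the timeout concern in this thread concrete, here is a minimal sketch of a poll-until-deadline helper. It is a generic stand-in, not the test framework's `assertBusy`; the class name `PollUntil` and the method `pollUntil` are hypothetical. It shows why a check that depends on timed events (such as a periodic refresh) passes only if the deadline is longer than the slowest such event.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class PollUntil {
    /**
     * Re-evaluates the condition with a growing back-off until it holds
     * or the timeout elapses. Returns true if the condition held in time.
     * Hypothetical helper for illustration only.
     */
    public static boolean pollUntil(BooleanSupplier condition, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long sleepMillis = 1;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline by more than a millisecond.
                long remainingMillis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
                Thread.sleep(Math.min(sleepMillis, remainingMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Simulates data that only becomes visible after roughly 50 ms,
        // e.g. because it depends on a periodic background refresh.
        boolean ok = pollUntil(
            () -> System.nanoTime() - start > TimeUnit.MILLISECONDS.toNanos(50),
            5, TimeUnit.SECONDS);
        System.out.println("condition met: " + ok);
    }
}
```

An explicit refresh before each check, as suggested above, removes the dependency on the background timer, so the loop usually succeeds on an early iteration instead of relying on a generous deadline.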
...ling-upgrade-multi-cluster/src/test/java/org/elasticsearch/upgrades/CcrRollingUpgradeIT.java
ResponseException e = expectThrows(ResponseException.class,
() -> followIndex(leaderClient(), "follower", "leader_index4", "follower_index4"));
assertThat(e.getMessage(), containsString("the snapshot was created with Elasticsearch version ["));
assertThat(e.getMessage(), containsString("] which is higher than the version of this node ["));
we should improve the exception message here /cc @tbrooks8
@ywelsch I've updated the PR.
checking total hitcount inside assertBusy(...)
I will let this bake on master for a while before backporting.
* Add rolling upgrade multi cluster test module (#38277) This test starts 2 clusters, each with 3 nodes. First the leader cluster is started and tests are run against it, and then the follower cluster is started and tests are run against both clusters. Then the follower cluster is upgraded, one node at a time. After that the leader cluster is upgraded, one node at a time. Every time a node is upgraded, tests are run while both clusters are online, with either the leader cluster or the follower cluster having mixed node versions. This commit only tests CCR index following, but could be used for CCS tests as well. In particular for CCR, unidirectional index following is tested during a rolling upgrade. During the test several indices are created in the leader cluster and followed before or while the follower cluster is being upgraded. This test also verifies that attempting to follow an index in the upgraded cluster from the cluster that has not yet been upgraded fails. After both clusters are upgraded, following the index that previously failed should succeed. Relates to #37231 and #38037
* Filter out upgraded version index settings when starting index following (#38838) The `index.version.upgraded` and `index.version.upgraded_string` settings are likely to differ between leader and follower index when a follower index gets restored on an upgraded node while the leader index is still on non-upgraded nodes. Closes #38835
This has been backported to the 7.x, 7.0 and 6.7 branches.