-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand following
documentation in ccr overview
#39936
Expand following
documentation in ccr overview
#39936
Conversation
Pinging @elastic/es-distributed |
Personally I prefer this approach (expanding |
I like this approach indeed. |
I am closing the other PR in favor of this PR. The primary objection to this approach is that it makes the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. I left a few nits, and one comment to think about.
docs/reference/ccr/overview.asciidoc
Outdated
When a follower initiates the index following, it acquires a retention lease from | ||
the leader. This informs the leader that it should not allow a soft delete to be | ||
pruned until either the follower indicates that it has received the operation or | ||
the lease expires after `12 hours`. It is valuable to have monitoring in place to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the formatting on "12 hours" odd. Is there a reason that you chose to format it this way, as opposed to "twelve hours" (without being formatted as code) or 12h
. Another concern I have is if we change the defaults from 12h
to some other value and we would have to maintain all the pages in the docs. Maybe the best option is to add docs for the index.soft_deletes.retention_lease.period
setting (or link to the CCR requirements page) and let the default be only specified there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I linked the requirements page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
docs/reference/ccr/overview.asciidoc
Outdated
@@ -17,7 +17,7 @@ follower index. This simplifies state management on the leader index and means | |||
that {ccr} does not interfere with indexing on the leader index. | |||
|
|||
[float] | |||
=== Configuring replication | |||
=== Initiating replication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about the intended change in meaning here. Is it meant to cover both configuring and starting CCR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess so. The actual create follower
and auto follower
actions seem more associated with initiating replication (although you can configure certain things using parameters). That is why I made that change. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find "initiating" a bit unclear, so my vote would be simpler verbs like "configuring" or "setting up", but it's not a deal breaker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it back to configuring
@@ -29,6 +29,41 @@ Replication can be configured in two ways: | |||
|
|||
NOTE: You must also <<ccr-requirements,configure the leader index>>. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be helpful to point out that all of those tasks can be done via Kibana too and add a link to {kibana-ref}/working-remote-clusters.html#managing-cross-cluster-replication[Managing {ccr}]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added an additional note. Let me know if that looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new commit that mentions Kibana directly in the bullet points, since I couldn't add a suggestion on that section.
docs/reference/ccr/overview.asciidoc
Outdated
=== Remedying a follower that has fallen behind | ||
|
||
If a follower falls sufficiently behind a leader that it can no longer replicate | ||
operations this can be detected using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume this exception would be visible in Kibana too?:
operations this can be detected using the | |
operations this can be detected in {kib} or by using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made this change. @cjcenizal a leader falling behind if reported as a fatal exception in the stats api. Is that shown on the Kibana ui?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to tag in @jen-huang and @sebelga to answer your question, since I'm on PTO today and focusing on stack upgrade testing for the rest of the week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tbrooks8 It doesn't appear we show indicies[].fatal_exception
in the UI. I've created a bug ticket for this: elastic/kibana#33628
before the follower falls fatally behind. | ||
|
||
[float] | ||
=== Remedying a follower that has fallen behind |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this content could actually be helpful in the Troubleshooting section (i.e. https://www.elastic.co/guide/en/elastic-stack-overview/master/troubleshooting.html)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a task to the meta issue #35975
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and builds successfully
thanks @lcawl |
This commit expands the ccr overview page to include more information about the lifecycle of following an index. It adds information linking to the remote recovery documentation. And describes how an index can fall-behind and how to fix it when this happens.
This commit expands the ccr overview page to include more information about the lifecycle of following an index. It adds information linking to the remote recovery documentation. And describes how an index can fall-behind and how to fix it when this happens.
This commit expands the ccr overview page to include more information about the lifecycle of following an index. It adds information linking to the remote recovery documentation. And describes how an index can fall-behind and how to fix it when this happens.
This commit expands the ccr
overview
page to include more informationabout the lifecycle of following an index. It adds information linking
to the remote recovery documentation. And describes how an index can
fall-behind and how to fix it when this happens.