Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node repurpose tool docs #40525

Conversation

henningandersen
Copy link
Contributor

Added documentation for node repurpose tool.

Follow-up to #39403

Added documentation for node repurpose tool.
@henningandersen henningandersen added >docs General docs changes v7.0.0 :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v8.0.0 v7.2.0 labels Mar 27, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@andrershov andrershov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned earlier, there would be problems with combining documentation for unsafe-bootstrap/detach-cluster and repurpose. For example, when reading the description section there are several paragraphs that talk about unsafe cluster recover, but after that, there is a paragraph about repurpose and just one question arises - what is it all about?
I still think that the best idea is to explicitly have elasticsearch-repurpose tool separately.
But if we don't want to do it, we need to think carefully about grouping description/examples in such a way that do not confuse the reader.
The only idea that comes to my mind is to make too child pages for elasticsearch-node tool - one for unsafe-bootstrap/detach-cluster and another for repurpose. On the parent page, we can leave short general description and synopsis, but everything else comes to separate pages. But still, it looks artificial.
@ywelsch @DaveCTurner do you have better ideas?

@henningandersen
Copy link
Contributor Author

henningandersen commented Mar 28, 2019

Thanks for looking into this @andrershov . I tend to think the tool is quite coherent in that all 3 commands relate to the node's role in the cluster and tries to avoid someone messing directly with files in the data path.
I could probably make that clearer in the documentation, but I was feeling a little constrained by not wanting to rewrite the entire Description section since you all spent a good deal of time ensuring the correctness of it, paying attention to many details in the original node-tool docs PR #37812.
Just my cents, will await feedback from @ywelsch and @DaveCTurner .

Copy link
Contributor

@andrershov andrershov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henningandersen By intro I mean the paragraph that is already in the documentation

The `elasticsearch-node` command enables you to perform unsafe operations that
risk data loss but which may help to recover some data in a disaster.	

This and synopsis can stay in the parent page - everything else will go to the child pages. And child page for unsafe-bootstrap/detach-cluster will look the same way it was before your changes. All your additions will go to the corresponding sections of repurpose child page.

@henningandersen
Copy link
Contributor Author

@andrershov by intro section, I really meant the Description section, I have updated my comment above to clarify that.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to agree with @andrershov that this change makes these docs feel a bit disjointed. See also this comment with which I agree: eventually I think we want to move the conceptual overview of these things to a place that isn't buried so deeply. I do not think we should split the docs for each sub-command out into separate pages, although it's possible we would want to do this for the conceptual overview.

Yet I think it's important we have this tool documented here for now, and we can defer this larger refactoring until later.

I left some suggestions, mostly about structure/order, and also noted that I think there's something missing on the subject from docs/reference/modules/node.asciidoc.

@@ -2,14 +2,15 @@
== elasticsearch-node

The `elasticsearch-node` command enables you to perform unsafe operations that
risk data loss but which may help to recover some data in a disaster.
risk data loss but which may help to recover some data in a disaster and
repurpose nodes into new roles.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest breaking into two simpler sentences:

The `elasticsearch-node` command enables you to perform certain unsafe
operations on a node that are only possible while it is shut down. This command
allows you to adjust the <<modules-node,role>> of a node and may be able to
recover some data after a disaster.

Which reminds me, the docs about roles are at docs/reference/modules/node.asciidoc and I think we should have docs there to say that you now have to take special steps to change the role of a node.


[float]
=== Synopsis

[source,shell]
--------------------------------------------------
bin/elasticsearch-node unsafe-bootstrap|detach-cluster
bin/elasticsearch-node unsafe-bootstrap|detach-cluster|repurpose
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to put the repurpose command first here (and elsewhere) because of the three commands it's the one that we hope is the most used.

@@ -53,19 +54,28 @@ way forward that does not risk data loss, but it may be possible to use the
`elasticsearch-node` tool to construct a new cluster that contains some of the
data from the failed cluster.

This tool has two modes:
If a node that previously held data or were master-eligible is repurposed into
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest:

  1. moving this to the top of the Description section,
  2. making new subheadings e.g. ==== Changing the roles of a node and ==== Recovering data after a disaster to separate the sections more clearly, and
  3. expanding this paragraph into a longer description of the concepts (and/or describing the concepts in docs/reference/modules/node.asciidoc and linking to them).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About item 3: I think master-eligible node and data node are already described in docs/reference/modules/node.asciidoc. I can certainly link to those descriptions. But it sounds like you wanted something more than that, can you elaborate on that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sorry, I'll clarify. Today the Node docs describe the node.data and node.master settings but do not say anything about how to go about changing them. This makes sense in 6.x because you don't have to do anything particularly special to change them, but in 7.0 this is no longer the case. I think that page needs to mention that you should only really set these settings before starting a node for the first time, and also talk about (or at least link to) the extra steps you need to do if you want to change them later.

remaining master-eligible node. It forces one of the remaining nodes to form
a brand-new cluster on its own, using its local copy of the cluster metadata.
This is known as _unsafe cluster bootstrapping_.

* `elastisearch-node detach-cluster` enables you to move nodes from one cluster
* `elasticsearch-node detach-cluster` enables you to move nodes from one cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh :)

to another. This can be used to move nodes into the new cluster created with
the `elastisearch-node unsafe-bootstap` command. If unsafe cluster bootstrapping was not
possible, it also enables you to
move nodes into a brand-new cluster.

* `elasticsearch-node repurpose` reports on and performs necessary index meta-data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put this mode first.

@@ -206,6 +216,44 @@ The message `Node was successfully detached from the cluster` does not mean
that there has been no data loss, it just means that tool was able to complete
its job.


[[node-tool-repurpose]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put this section above the other two.

[float]
==== Repurposing nodes

A master-eligible and data carrying node holds on-disk index data:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to try and split this whole section into two parallel descriptions about node.data and node.master respectively, because here it's a bit unclear what this says about a master-ineligible data node or (conversely) a dedicated master node. It might be a little repetitive, but I'm ok with that.

@@ -215,6 +263,8 @@ one-node cluster.
`detach-cluster`:: Specifies to unsafely detach this node from its cluster so
it can join a different cluster.

`repurpose`:: Report on and cleanup excess persisted data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put this first.

@@ -332,3 +382,45 @@ Confirm [y/N] y
Node was successfully detached from the cluster
----

[float]
==== Repurposing node as master, no-data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put this first.

@henningandersen
Copy link
Contributor Author

Thanks @andrershov and @DaveCTurner for your reviews. I tried to adhere to all comments in one commit: 3879859bafcada9cc42ad56095b7c673dc8e0e02 (felt unnatural to split into multiple commits). I did not split the page, but went for two subsections in the Description section.

Please have another look.

Improved documentation for node repurpose tool.
@henningandersen henningandersen force-pushed the feature_node_repurpose_tool_docs branch from 3879859 to a09c826 Compare March 29, 2019 17:01
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few more nits but the structure feels good to me now. I would like to do a bit of wordsmithery with the longer sections of words but will defer that until after the weekend.

@@ -230,6 +284,49 @@ to `0`, meaning to use the first node in the data path.
[float]
=== Examples

[float]
==== Repurposing node as master, no-data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
==== Repurposing node as master, no-data
==== Repurposing a data node as a dedicated master node

----

[float]
==== Repurposing node as no-master, no-data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
==== Repurposing node as no-master, no-data
==== Repurposing a data node as a coordinating-only node

[float]
==== Repurposing node as no-master, no-data

In this example, a node that previously held data is repurposed as no-master, no-data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In this example, a node that previously held data is repurposed as no-master, no-data.
In this example, a node that previously held data is repurposed as a coordinating-only node.

docs/reference/commands/node-tool.asciidoc Outdated Show resolved Hide resolved
[float]
==== Repurposing node as master, no-data

In this example, a node that previously held data is repurposed as master, no-data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In this example, a node that previously held data is repurposed as master, no-data.
In this example, a node that previously held data is repurposed as a dedicated master node.

docs/reference/commands/node-tool.asciidoc Outdated Show resolved Hide resolved
DaveCTurner and others added 3 commits March 31, 2019 20:26
Clarifty that changes happen in .yml file.

Co-Authored-By: henningandersen <[email protected]>
Clarify that changes happen in .yml file.

Co-Authored-By: henningandersen <[email protected]>
Changed terminology a bit to use coordinating-only and dedicated master.
Though not entirely true, it gives a slightly easier read.
@henningandersen
Copy link
Contributor Author

Thanks @DaveCTurner , I have addressed the 4 changes, though not identically to what you requested. Let me know if you have additional comments.

@henningandersen
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/1

@DaveCTurner
Copy link
Contributor

@henningandersen I have done a bit of work on this on my own branch (on top of this one): master...DaveCTurner:2019-04-08-feature_node_repurpose_tool_docs

I added more detail about repurposing to the Node page which simplified the tool docs a bit. Could you compare and bring in any changes you like?

@DaveCTurner DaveCTurner dismissed their stale review April 8, 2019 15:11

Comments all addressed, but more changes incoming...

DaveCTurner and others added 2 commits April 9, 2019 10:03
Removed todo and changed metadata to data to also capture shard data.
@henningandersen
Copy link
Contributor Author

I added more detail about repurposing to the Node page which simplified the tool docs a bit. Could you compare and bring in any changes you like?

Thanks @DaveCTurner , I took your changes as is and made two small changes only in 514831d

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@andrershov andrershov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM2

@henningandersen
Copy link
Contributor Author

Thanks @DaveCTurner and @andrershov

@henningandersen henningandersen merged commit 0783efd into elastic:master Apr 9, 2019
henningandersen added a commit that referenced this pull request Apr 9, 2019
Added documentation for node repurpose tool and included documentation on how to repurpose nodes safely. Adjusted order of tools in `elasticsearch-node` tool since the repurpose tool is most likely to be used.

Co-Authored-By: David Turner <[email protected]>
henningandersen added a commit that referenced this pull request Apr 9, 2019
Added documentation for node repurpose tool and included documentation on how to repurpose nodes safely. Adjusted order of tools in `elasticsearch-node` tool since the repurpose tool is most likely to be used.

Co-Authored-By: David Turner <[email protected]>
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 9, 2019
…forced-unsafe-publication

* elastic/master:
  Improve Watcher test framework resiliency (elastic#40658)
  Fix order of request body search parameter names in documentation (elastic#40777)
  Node repurpose tool docs (elastic#40525)
  [Docs] Delete explanation for completion suggester default analyzer choice (elastic#36720)
  Revert "Revert "Change HLRC CCR response tests to use AbstractResponseTestCase base class. (elastic#40257)"" (elastic#40971)
  Short-circuit rebalancing when disabled (elastic#40966)
@polyfractal polyfractal added v7.0.1 and removed v7.0.0 labels Apr 9, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Added documentation for node repurpose tool and included documentation on how to repurpose nodes safely. Adjusted order of tools in `elasticsearch-node` tool since the repurpose tool is most likely to be used.

Co-Authored-By: David Turner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >docs General docs changes v7.0.1 v7.2.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants