deploy using tiup: align three PRs (pingcap#8603)
shichun-0415 authored and ti-chi-bot committed Jun 15, 2022
1 parent 03ae6fc commit 18d2d41
Showing 7 changed files with 212 additions and 163 deletions.
2 changes: 1 addition & 1 deletion clinic/clinic-user-guide-for-tiup.md
@@ -48,7 +48,7 @@ Before using PingCAP Clinic, you need to install Diag (a component to collect da

> **Note:**
>
> - For clusters without an internet connection, you need to deploy Diag offline. For details, refer to [Deploy TiUP offline: Method 2](/production-deployment-using-tiup.md#method-2-deploy-tiup-offline).
> - For clusters without an internet connection, you need to deploy Diag offline. For details, refer to [Deploy TiUP offline: Method 2](/production-deployment-using-tiup.md#deploy-tiup-offline).
> - Diag is **only** provided in the TiDB Server offline mirror package of v5.4.0 or later.

2. Get and set an access token (token) to upload data.
2 changes: 1 addition & 1 deletion hardware-and-software-requirements.md
@@ -39,7 +39,7 @@ Other Linux OS versions such as Debian Linux and Fedora Linux might work but are

> **Note:**
>
> It is required that you [deploy TiUP on the control machine](/production-deployment-using-tiup.md#step-2-install-tiup-on-the-control-machine) to operate and manage TiDB clusters.
> It is required that you [deploy TiUP on the control machine](/production-deployment-using-tiup.md#step-2-deploy-tiup-on-the-control-machine) to operate and manage TiDB clusters.
### Target machines

2 changes: 1 addition & 1 deletion migration-tools.md
@@ -101,5 +101,5 @@ tiup update --self && tiup update dm

## See also

- [Deploy TiUP offline](/production-deployment-using-tiup.md#method-2-deploy-tiup-offline)
- [Deploy TiUP offline](/production-deployment-using-tiup.md#deploy-tiup-offline)
- [Download and install tools in binary](/download-ecosystem-tools.md)
212 changes: 116 additions & 96 deletions production-deployment-using-tiup.md

Large diffs are not rendered by default.

151 changes: 90 additions & 61 deletions scale-tidb-using-tiup.md
@@ -1,13 +1,13 @@
---
title: Scale the TiDB Cluster Using TiUP
title: Scale a TiDB Cluster Using TiUP
summary: Learn how to scale the TiDB cluster using TiUP.
---

# Scale the TiDB Cluster Using TiUP
# Scale a TiDB Cluster Using TiUP

The capacity of a TiDB cluster can be increased or decreased without interrupting the online services.

This document describes how to scale the TiDB, TiKV, PD, TiCDC, or TiFlash cluster using TiUP. If you have not installed TiUP, refer to the steps in [Install TiUP on the control machine](/production-deployment-using-tiup.md#step-2-install-tiup-on-the-control-machine).
This document describes how to scale the TiDB, TiKV, PD, TiCDC, or TiFlash cluster using TiUP. If you have not installed TiUP, refer to the steps in [Step 2. Deploy TiUP on the control machine](/production-deployment-using-tiup.md#step-2-deploy-tiup-on-the-control-machine).

To view the current cluster name list, run `tiup cluster list`.

@@ -23,19 +23,19 @@ For example, if the original topology of the cluster is as follows:

## Scale out a TiDB/PD/TiKV cluster

If you want to add a TiDB node to the `10.0.1.5` host, take the following steps.
This section exemplifies how to add a TiDB node to the `10.0.1.5` host.

> **Note:**
>
> You can take similar steps to add the PD node. Before you add the TiKV node, it is recommended that you adjust the PD scheduling parameters in advance according to the cluster load.
> You can take similar steps to add a PD node. Before you add a TiKV node, it is recommended that you adjust the PD scheduling parameters in advance according to the cluster load.
1. Configure the scale-out topology:

> **Note:**
>
> * The port and directory information is not required by default.
> * If multiple instances are deployed on a single machine, you need to allocate different ports and directories for them. If the ports or directories have conflicts, you will receive a notification during deployment or scaling.
> * Since TiUP v1.0.0, the scale-out configuration will inherit the global configuration of the original cluster.
> * Since TiUP v1.0.0, the scale-out configuration inherits the global configuration of the original cluster.
Add the scale-out topology configuration in the `scale-out.yaml` file:
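For reference, a minimal sketch of a `scale-out.yaml` that adds one TiDB instance on `10.0.1.5` is shown below. The ports and directories are assumed TiDB defaults rather than values taken from this commit; adjust them to your environment.

{{< copyable "shell-regular" >}}

```shell
# Sketch: write a minimal scale-out.yaml that declares one new TiDB instance.
# Ports and directories below are assumed defaults; change them as needed.
cat > scale-out.yaml <<'EOF'
tidb_servers:
  - host: 10.0.1.5
    ssh_port: 22
    port: 4000
    status_port: 10080
    deploy_dir: /tidb-deploy/tidb-4000
    log_dir: /tidb-deploy/tidb-4000/log
EOF
```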

@@ -90,29 +90,41 @@ If you want to add a TiDB node to the `10.0.1.5` host, take the following steps.

To view the configuration of the current cluster, run `tiup cluster edit-config <cluster-name>`, because the parameter configuration of `global` and `server_configs` is inherited by `scale-out.yaml` and thus also takes effect in `scale-out.yaml`.

After the configuration, the current topology of the cluster is as follows:
2. Run the scale-out command:

| Host IP | Service |
|:---|:----|
| 10.0.1.3 | TiDB + TiFlash |
| 10.0.1.4 | TiDB + PD |
| 10.0.1.5 | **TiDB** + TiKV + Monitor |
| 10.0.1.1 | TiKV |
| 10.0.1.2 | TiKV |
Before you run the `scale-out` command, use the `check` and `check --apply` commands to detect and automatically repair potential risks in the cluster:

2. Run the scale-out command:
1. Check for potential risks:

{{< copyable "shell-regular" >}}
{{< copyable "shell-regular" >}}

```shell
tiup cluster scale-out <cluster-name> scale-out.yaml
```
```shell
tiup cluster check <cluster-name> scale-out.yaml --cluster --user root [-p] [-i /home/root/.ssh/gcp_rsa]
```

> **Note:**
>
> The command above is based on the assumption that the mutual trust has been configured for the user to execute the command and the new machine. If the mutual trust cannot be configured, use the `-p` option to enter the password of the new machine, or use the `-i` option to specify the private key file.
2. Enable automatic repair:

If you see the `Scaled cluster <cluster-name> out successfully`, the scale-out operation is successfully completed.
{{< copyable "shell-regular" >}}

```shell
tiup cluster check <cluster-name> scale-out.yaml --cluster --apply --user root [-p] [-i /home/root/.ssh/gcp_rsa]
```

3. Run the `scale-out` command:

{{< copyable "shell-regular" >}}

```shell
tiup cluster scale-out <cluster-name> scale-out.yaml [-p] [-i /home/root/.ssh/gcp_rsa]
```

In the preceding commands:

- `scale-out.yaml` is the scale-out configuration file.
- `--user root` indicates logging in to the target machine as the `root` user to complete the cluster scale out. The `root` user is expected to have `ssh` and `sudo` privileges to the target machine. Alternatively, you can use other users with `ssh` and `sudo` privileges to complete the deployment.
- `[-i]` and `[-p]` are optional. If you have configured login to the target machine without password, these parameters are not required. If not, choose one of the two parameters. `[-i]` is the private key of the root user (or other users specified by `--user`) that has access to the target machine. `[-p]` is used to input the user password interactively.

If you see `Scaled cluster <cluster-name> out successfully`, the scale-out operation succeeds.

3. Check the cluster status:
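For instance, the status can be checked with the same `display` command used elsewhere in this guide (a sketch):

{{< copyable "shell-regular" >}}

```shell
# Show the topology and status of every node in the cluster.
tiup cluster display <cluster-name>
```

Once the scale-out completes, the new TiDB node on `10.0.1.5` should appear in the output with the `Up` status.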

@@ -136,14 +136,14 @@ After the scale-out, the cluster topology is as follows:

## Scale out a TiFlash cluster

If you want to add a TiFlash node to the `10.0.1.4` host, take the following steps.
This section exemplifies how to add a TiFlash node to the `10.0.1.4` host.

> **Note:**
>
> When adding a TiFlash node to an existing TiDB cluster, you need to note the following things:
> When adding a TiFlash node to an existing TiDB cluster, note the following:
>
> 1. Confirm that the current TiDB version supports using TiFlash. Otherwise, upgrade your TiDB cluster to v5.0 or later versions.
> 2. Execute the `tiup ctl:<cluster-version> pd -u http://<pd_ip>:<pd_port> config set enable-placement-rules true` command to enable the Placement Rules feature. Or execute the corresponding command in [pd-ctl](/pd-control.md).
> - Confirm that the current TiDB version supports using TiFlash. Otherwise, upgrade your TiDB cluster to v5.0 or later versions.
> - Run the `tiup ctl:<cluster-version> pd -u http://<pd_ip>:<pd_port> config set enable-placement-rules true` command to enable the Placement Rules feature. Or run the corresponding command in [pd-ctl](/pd-control.md).

1. Add the node information to the `scale-out.yaml` file:

@@ -153,10 +153,10 @@ If you want to add a TiFlash node to the `10.0.1.4` host, take the following ste

```ini
tiflash_servers:
- host: 10.0.1.4
- host: 10.0.1.4
```

Currently, you can only add IP but not domain name.
Currently, you can only add IP addresses but not domain names.

2. Run the scale-out command:

@@ -168,7 +168,7 @@ If you want to add a TiFlash node to the `10.0.1.4` host, take the following ste

> **Note:**
>
> The command above is based on the assumption that the mutual trust has been configured for the user to execute the command and the new machine. If the mutual trust cannot be configured, use the `-p` option to enter the password of the new machine, or use the `-i` option to specify the private key file.
> The preceding command is based on the assumption that the mutual trust has been configured for the user to run the command and the new machine. If the mutual trust cannot be configured, use the `-p` option to enter the password of the new machine, or use the `-i` option to specify the private key file.

3. View the cluster status:

@@ -192,7 +192,7 @@ After the scale-out, the cluster topology is as follows:

## Scale out a TiCDC cluster

If you want to add two TiCDC nodes to the `10.0.1.3` and `10.0.1.4` hosts, take the following steps.
This section exemplifies how to add two TiCDC nodes to the `10.0.1.3` and `10.0.1.4` hosts.

1. Add the node information to the `scale-out.yaml` file:
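A minimal sketch of the TiCDC entries is shown below; only the hosts are specified, and the other per-instance fields are left at their defaults (an assumption, not part of this commit):

{{< copyable "shell-regular" >}}

```shell
# Sketch: declare the two new TiCDC instances in scale-out.yaml.
cat > scale-out.yaml <<'EOF'
cdc_servers:
  - host: 10.0.1.3
  - host: 10.0.1.4
EOF
```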

@@ -220,7 +220,7 @@ If you want to add two TiCDC nodes to the `10.0.1.3` and `10.0.1.4` hosts, take

> **Note:**
>
> The command above is based on the assumption that the mutual trust has been configured for the user to execute the command and the new machine. If the mutual trust cannot be configured, use the `-p` option to enter the password of the new machine, or use the `-i` option to specify the private key file.
> The preceding command is based on the assumption that the mutual trust has been configured for the user to run the command and the new machine. If the mutual trust cannot be configured, use the `-p` option to enter the password of the new machine, or use the `-i` option to specify the private key file.

3. View the cluster status:

@@ -244,16 +244,13 @@ After the scale-out, the cluster topology is as follows:

## Scale in a TiDB/PD/TiKV cluster

If you want to remove a TiKV node from the `10.0.1.5` host, take the following steps.
This section exemplifies how to remove a TiKV node from the `10.0.1.5` host.

> **Note:**
>
> - You can take similar steps to remove the TiDB and PD node.
> - You can take similar steps to remove a TiDB or PD node.
> - Because the TiKV, TiFlash, and TiDB Binlog components are taken offline asynchronously and the stopping process takes a long time, TiUP takes them offline in different methods. For details, see [Particular handling of components' offline process](/tiup/tiup-component-cluster-scale-in.md#particular-handling-of-components-offline-process).
> **Note:**
>
> The PD Client in TiKV caches the list of PD nodes. The current version of TiKV has a mechanism to automatically and regularly update PD nodes, which can help mitigate the issue of an expired list of PD nodes cached by TiKV. However, after scaling out PD, you should try to avoid directly removing all PD nodes at once that exist before the scaling. If necessary, before making all the previously existing PD nodes offline, make sure to switch the PD leader to a newly added PD node.
> - The PD Client in TiKV caches the list of PD nodes. The current version of TiKV has a mechanism to automatically and regularly update PD nodes, which can help mitigate the issue of an expired list of PD nodes cached by TiKV. However, after scaling out PD, you should try to avoid directly removing all PD nodes at once that exist before the scaling. If necessary, before making all the previously existing PD nodes offline, make sure to switch the PD leader to a newly added PD node.
1. View the node ID information:
@@ -295,20 +295,20 @@ If you want to remove a TiKV node from the `10.0.1.5` host, take the following s
The `--node` parameter is the ID of the node to be taken offline.
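For example, assuming the TiKV instance on `10.0.1.5` uses the default port `20160`, the command might look like this (a sketch):

{{< copyable "shell-regular" >}}

```shell
# Take the TiKV node 10.0.1.5:20160 offline; the port is an assumed default.
tiup cluster scale-in <cluster-name> --node 10.0.1.5:20160
```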
If you see the `Scaled cluster <cluster-name> in successfully`, the scale-in operation is successfully completed.
If you see `Scaled cluster <cluster-name> in successfully`, the scale-in operation succeeds.
3. Check the cluster status:
The scale-in process takes some time. If the status of the node to be scaled in becomes `Tombstone`, that means the scale-in operation is successful.
To check the scale-in status, run the following command:
The scale-in process takes some time. You can run the following command to check the scale-in status:
{{< copyable "shell-regular" >}}
```shell
tiup cluster display <cluster-name>
```
If the node to be scaled in becomes `Tombstone`, the scale-in operation succeeds.
Access the monitoring platform at <http://10.0.1.5:3000> using your browser, and view the status of the cluster.
The current topology is as follows:
@@ -323,29 +323,29 @@ The current topology is as follows:
## Scale in a TiFlash cluster
If you want to remove a TiFlash node from the `10.0.1.4` host, take the following steps.
This section exemplifies how to remove a TiFlash node from the `10.0.1.4` host.
### 1. Adjust the number of replicas of the tables according to the number of remaining TiFlash nodes
Before the node goes down, make sure that the number of remaining nodes in the TiFlash cluster is no smaller than the maximum number of replicas of all tables. Otherwise, modify the number of TiFlash replicas of the related tables.
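One way to check how many TiFlash replicas each table currently has is to query the `INFORMATION_SCHEMA.TIFLASH_REPLICA` table from any MySQL client; the connection parameters below are placeholders (a sketch):

{{< copyable "shell-regular" >}}

```shell
# List every table that has TiFlash replicas and its replica count.
# Replace <tidb_host> and the user with values for your cluster.
mysql -h <tidb_host> -P 4000 -u root -p -e \
  "SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT FROM INFORMATION_SCHEMA.TIFLASH_REPLICA;"
```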
1. For all tables whose replicas are greater than the number of remaining TiFlash nodes in the cluster, execute the following command in the TiDB client:
1. For all tables whose replicas are greater than the number of remaining TiFlash nodes in the cluster, run the following command in the TiDB client:
{{< copyable "sql" >}}
```sql
alter table <db-name>.<table-name> set tiflash replica 0;
ALTER TABLE <db-name>.<table-name> SET tiflash replica 0;
```
2. Wait for the TiFlash replicas of the related tables to be deleted. [Check the table replication progress](/tiflash/use-tiflash.md#check-replication-progress); the replicas are deleted when the replication information of the related tables can no longer be found.
### 2. Perform the scale-in operation
Next, perform the scale-in operation with one of the following solutions.
Perform the scale-in operation with one of the following solutions.
#### Solution 1: Use TiUP to remove a TiFlash node
#### Solution 1. Use TiUP to remove a TiFlash node
1. First, confirm the name of the node to be taken down:
1. Confirm the name of the node to be taken down:
{{< copyable "shell-regular" >}}
@@ -361,7 +361,7 @@ Next, perform the scale-in operation with one of the following solutions.
tiup cluster scale-in <cluster-name> --node 10.0.1.4:9000
```
#### Solution 2: Manually remove a TiFlash node
#### Solution 2. Manually remove a TiFlash node
In special cases (such as when a node needs to be forcibly taken down), or if the TiUP scale-in operation fails, you can manually remove a TiFlash node with the following steps.
@@ -371,15 +371,15 @@ In special cases (such as when a node needs to be forcibly taken down), or if th
* If you use TiUP deployment, replace `pd-ctl` with `tiup ctl pd`:
{{< copyable "shell-regular" >}}
{{< copyable "shell-regular" >}}
```shell
tiup ctl:<cluster-version> pd -u http://<pd_ip>:<pd_port> store
```
```shell
tiup ctl:<cluster-version> pd -u http://<pd_ip>:<pd_port> store
```
> **Note:**
>
> If multiple PD instances exist in the cluster, you only need to specify the IP address:port of an active PD instance in the above command.
> **Note:**
>
> If multiple PD instances exist in the cluster, you only need to specify the IP address:port of an active PD instance in the above command.
2. Remove the TiFlash node in pd-ctl:
@@ -393,13 +393,13 @@ In special cases (such as when a node needs to be forcibly taken down), or if th
tiup ctl:<cluster-version> pd -u http://<pd_ip>:<pd_port> store delete <store_id>
```
> **Note:**
>
> If multiple PD instances exist in the cluster, you only need to specify the IP address:port of an active PD instance in the above command.
> **Note:**
>
> If multiple PD instances exist in the cluster, you only need to specify the IP address:port of an active PD instance in the above command.
3. Wait for the store of the TiFlash node to disappear or for the `state_name` to become `Tombstone` before you stop the TiFlash process (one way to check the store state is shown in the sketch after this list).
4. Manually delete TiFlash data files (whose location can be found in the `data_dir` directory under the TiFlash configuration of the cluster topology file).
4. Manually delete TiFlash data files (the location can be found in the `data_dir` directory under the TiFlash configuration of the cluster topology file).
5. Manually update TiUP's cluster configuration file (delete the information of the TiFlash node that goes down in edit mode).
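For step 3 above, one way to watch the store state is to query the specific store in pd-ctl, reusing the invocation shown earlier; `<store_id>` is the ID obtained in step 2 (a sketch):

{{< copyable "shell-regular" >}}

```shell
# Print the details of a single store, including its state_name.
tiup ctl:<cluster-version> pd -u http://<pd_ip>:<pd_port> store <store_id>
```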

@@ -454,9 +454,29 @@ The steps to manually clean up the replication rules in PD are below:
curl -v -X DELETE http://<pd_ip>:<pd_port>/pd/api/v1/config/rule/tiflash/table-45-r
```

3. View the cluster status:

{{< copyable "shell-regular" >}}

```shell
tiup cluster display <cluster-name>
```

Access the monitoring platform at <http://10.0.1.5:3000> using your browser, and view the status of the cluster.

After the scale-in, the cluster topology is as follows:

| Host IP | Service |
|:----|:----|
| 10.0.1.3 | TiDB + TiFlash + TiCDC |
| 10.0.1.4 | TiDB + PD + TiCDC **(TiFlash is deleted)** |
| 10.0.1.5 | TiDB + Monitor |
| 10.0.1.1 | TiKV |
| 10.0.1.2 | TiKV |

## Scale in a TiCDC cluster

If you want to remove the TiCDC node from the `10.0.1.4` host, take the following steps:
This section exemplifies how to remove the TiCDC node from the `10.0.1.4` host.

1. Take the node offline:
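Assuming the TiCDC instance on `10.0.1.4` uses the default port `8300`, the scale-in command might look like this (a sketch):

{{< copyable "shell-regular" >}}

```shell
# Take the TiCDC node 10.0.1.4:8300 offline; the port is an assumed default.
tiup cluster scale-in <cluster-name> --node 10.0.1.4:8300
```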

4 changes: 2 additions & 2 deletions tiup/tiup-cluster.md
@@ -219,7 +219,7 @@ For the PD component, `|L` or `|UI` might be appended to `Up` or `Down`. `|L` in

> **Note:**
>
> This section describes only the syntax of the scale-in command. For detailed steps of online scaling, refer to [Scale the TiDB Cluster Using TiUP](/scale-tidb-using-tiup.md).
> This section describes only the syntax of the scale-in command. For detailed steps of online scaling, refer to [Scale a TiDB Cluster Using TiUP](/scale-tidb-using-tiup.md).
Scaling in a cluster means making some node(s) offline. This operation removes the specific node(s) from the cluster and deletes the remaining files.
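The basic form of the command is sketched below; the node ID is the `IP:port` value reported by `tiup cluster display`:

{{< copyable "shell-regular" >}}

```shell
tiup cluster scale-in <cluster-name> --node <node-id>
```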

@@ -288,7 +288,7 @@ After PD schedules the data on the node to other TiKV nodes, this node will be d

> **Note:**
>
> This section describes only the syntax of the scale-out command. For detailed steps of online scaling, refer to [Scale the TiDB Cluster Using TiUP](/scale-tidb-using-tiup.md).
> This section describes only the syntax of the scale-out command. For detailed steps of online scaling, refer to [Scale a TiDB Cluster Using TiUP](/scale-tidb-using-tiup.md).
The scale-out operation has an inner logic similar to that of deployment: the TiUP cluster component firstly ensures the SSH connection of the node, creates the required directories on the target node, then executes the deployment operation, and starts the node service.
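As with deployment, the command takes a topology file that describes only the new nodes (a sketch):

{{< copyable "shell-regular" >}}

```shell
tiup cluster scale-out <cluster-name> scale-out.yaml
```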
