Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[close #227] Update BR readme #228

Merged
merged 10 commits into from
Sep 21, 2022
Merged

[close #227] Update BR readme #228

merged 10 commits into from
Sep 21, 2022

Conversation

haojinming
Copy link
Contributor

@haojinming haojinming commented Sep 16, 2022

Signed-off-by: haojinming [email protected]

What problem does this PR solve?

Issue Number: close #227

Problem Description:

What is changed and how does it work?

Update BR readme (rendered)

Code changes

  • No code

Check List for Tests

This PR has been tested by at least one of the following methods:

  • No code

Side effects

  • No side effects

Related changes

  • No related changes

Signed-off-by: haojinming <[email protected]>
Signed-off-by: haojinming <[email protected]>
br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated
# Attach to control container to run BR
docker exec -it br_control_1 bash
# Restore from the backup.
bin/br restore raw \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bin/br restore raw \
bin/tikv-br restore raw \

br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated
- It is recommended that you execute multiple backup operations serially. Running different backup operations in parallel reduces backup performance and also affects the online application.
- It is recommended that you execute multiple restore operations serially. Running different restore operations in parallel increases Region conflicts and also reduces restore performance.
- `TiKV-BR` supports checksum between `TiKV` cluster and backup files after backup or restore with the config `--checksum=true`. Note that, if checksum is enabled, please make sure no data is changed or `TTL` expired in `TiKV` cluster during backup or restore.
- TiKV-BR supports [`api-version`](https://docs.pingcap.com/zh/tidb/stable/tikv-configuration-file#api-version-%E4%BB%8E-v610-%E7%89%88%E6%9C%AC%E5%BC%80%E5%A7%8B%E5%BC%95%E5%85%A5) conversion from V1 to V2 with config `--dst-api-version V2`. Then restore the backup files to APIV2 `TiKV` cluster
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to also talk about that the conversion is mainly used for upgrade TiKV storage from v1 to v2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, PTAL~

br/README.md Show resolved Hide resolved
br/README.md Outdated

# How many row do we get? 300242 rows.
mysql --host 127.0.0.1 --port 4000 -E -e "SELECT COUNT(*) FROM test.order_line" -u root -p
TiKV-BR can do checksum between TiKV cluster and backup files after backup finish with the config `--checksum=true`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to introduce how to verify backup & restore in a separated section. I think users will much care about the correctness.
Can also talk about the checksum function of client-go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one section "Data Verification of Backup&Restore". PTAL~

br/README.md Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Sep 19, 2022

Codecov Report

Merging #228 (175dda3) into main (57dc032) will increase coverage by 0.1863%.
The diff coverage is 100.0000%.

Additional details and impacted files

Impacted file tree graph

@@               Coverage Diff                @@
##               main       #228        +/-   ##
================================================
+ Coverage   61.0002%   61.1866%   +0.1863%     
================================================
  Files           238        238                
  Lines         20195      20158        -37     
================================================
+ Hits          12319      12334        +15     
+ Misses         6754       6685        -69     
- Partials       1122       1139        +17     
Flag Coverage Δ *Carryforward flag
br 59.8324% <100.0000%> (-0.0765%) ⬇️
cdc 61.8059% <ø> (+0.3036%) ⬆️ Carriedforward from 00ad8da

*This pull request uses carry forward flags. Click here to find out more.

Impacted Files Coverage Δ
br/pkg/task/backup_raw.go 67.7852% <100.0000%> (ø)
br/cmd/br/backup.go 60.0000% <0.0000%> (-11.4286%) ⬇️
cdc/cdc/sink/buffer_sink.go 80.0000% <0.0000%> (-1.8182%) ⬇️
cdc/cdc/sorter/unified/merger.go 68.2847% <0.0000%> (-1.2945%) ⬇️
cdc/cdc/kv/client.go 86.0230% <0.0000%> (+0.2881%) ⬆️
cdc/cdc/sorter/unified/file_backend.go 53.9473% <0.0000%> (+20.1754%) ⬆️

br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated
--log-file "/logs/br_restore.log"
TiKV-BR, TiKV nodes, and the backup storage system should provide network bandwidth that is greater than the backup speed. If the target cluster is particularly large, the threshold of backup and restoration speed is limited by the bandwidth of the backup network.
The backup storage system should also provide sufficient write/read performance (IOPS). Otherwise, the IOPS might become a performance bottleneck during backup or restoration.
TiKV nodes need to have at least two additional CPU cores and high performance disks for backups. Otherwise, the backup might have an impact on the services running on the cluster.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest: "TiKV nodes need to have at least two additional spared CPU cores and disk bandwidth (related to ratelimit parameter) for backups.

And it would be better to have a section talking about "impact of backup & restore" (maybe in another PR). I think this is another topic that user would concern much.

We can refer to https://docs.pingcap.com/tidb/stable/backup-and-restore-faq#how-much-impact-does-a-backup-operation-have-on-the-cluster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will add one more section to talk about the performance impact in another PR.

br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated Show resolved Hide resolved
@pingyu pingyu requested a review from zeminzhou September 19, 2022 12:29
Copy link
Contributor

@zeminzhou zeminzhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM~

br/README.md Outdated Show resolved Hide resolved
br/README.md Outdated
```

## Compatibility test
### Data Verification of Backup&Restore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### Data Verification of Backup&Restore
### Data Verification of Backup & Restore

--pd "${PDIP}:2379" \
--storage "s3://backup-data/2022-09-16/" \
--log-file backupfull.log
Backup Raw <---------/................................................> 17.12%.
```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to also introduce the result message of backup later.
(Especially the usage of BackupTs. And the kv size was confused to users in some past cases.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, PTAL~, thanks a lot.

--storage "s3://backup-data/2022-09-16/" \
--ratelimit 128 \
--log-file restoreraw.log
Restore Raw <---------/...............................................> 17.12%.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to also introduce the result message of restore later.

br/README.md Show resolved Hide resolved
br/README.md Outdated
[2022/09/20 18:01:10.125 +08:00] [INFO] [collector.go:67] ["Raw backup success summary"] [total-ranges=3] [ranges-succeed=3] [ranges-failed=0] [backup-total-regions=3] [total-take=5.050265883s] [BackupTS=436120585518448641] [total-kv=100000] [total-kv-size=108.7MB] [average-speed=21.11MB/s] [backup-data-size(after-compressed)=78.3MB]
```
Explanations for the above message are as follows:
- `total-ranges`: specifies the ranges' count that backup separate the task. Equals to `ranges-succeed` + `ranges-failed`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specifies: suggest to be indicates or just remove. The same to followings.

Suggested change
- `total-ranges`: specifies the ranges' count that backup separate the task. Equals to `ranges-succeed` + `ranges-failed`.
- `total-ranges`: the number of ranges that the whole backup task is splitted to. Equals to `ranges-succeed` + `ranges-failed`.

br/README.md Outdated
```
Explanations for the above message are as follows:
- `total-ranges`: specifies the ranges' count that backup separate the task. Equals to `ranges-succeed` + `ranges-failed`.
- `ranges-succeed`: specifies the succeeded task count.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `ranges-succeed`: specifies the succeeded task count.
- `ranges-succeed`: the number of succeeded ranges.

br/README.md Outdated
Comment on lines 128 to 129
- `ranges-succeed`: specifies the succeeded task count.
- `ranges-failed`: specifies the failed task count.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

br/README.md Outdated
- `ranges-failed`: specifies the failed task count.
- `backup-total-regions`: specifies the tikv regions that backup takes.
- `total-take`: specifies the backup duration.
- `BackupTS`: specifies the backup start timestamp, only take effect for APIV2 TiKV cluster, which can be used as TiKV-CDC [start-ts](https://github.com/tikv/migration/blob/main/cdc/README.md).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last timestamp ?
"which can be used as start-ts of TiKV-CDC when creating replication tasks. Refer to Create a replication task.

br/README.md Outdated
- `total-take`: specifies the backup duration.
- `BackupTS`: specifies the backup start timestamp, only take effect for APIV2 TiKV cluster, which can be used as TiKV-CDC [start-ts](https://github.com/tikv/migration/blob/main/cdc/README.md).
- `total-kv`: specifies the total kv count in backup files.
- `total-kv-size`: specifies the total kv size in backup files.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `total-kv-size`: specifies the total kv size in backup files.
- `total-kv-size`: specifies the total kv size in backup files. Note that this is the original size before compression.

br/README.md Outdated
[2022/09/20 18:02:12.540 +08:00] [INFO] [collector.go:67] ["Raw restore success summary"] [total-ranges=3] [ranges-succeed=3] [ranges-failed=0] [restore-files=3] [total-take=950.460846ms] [restore-data-size(after-compressed)=78.3MB] [total-kv=100000] [total-kv-size=108.7MB] [average-speed=114.4MB/s]
```
Explanations for the above message are as follows:
- `total-ranges`: specifies the ranges' count that restoration separate the task. Equals to `ranges-succeed` + `ranges-failed`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

br/README.md Outdated
# Create a bucket to save backup by mc (a MinIO Client).
mc config host add minio $S3_ENDPOINT $MINIO_ACCESS_KEY $MINIO_SECRET_KEY && \
mc mb minio/mybucket
*Note: please specify the [api-version](https://docs.pingcap.com/tidb/stable/tikv-configuration-file#api-version-new-in-v610) config of TiKV cluster. TiKV-BR supports APIV1 & APIV2 backup/restore and conversion from APIV1 to APIV2. The detail usage is as following.*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the version requirement of TiKV.

Copy link
Collaborator

@pingyu pingyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM~

@pingyu pingyu merged commit 11c9aef into tikv:main Sep 21, 2022
@haojinming haojinming deleted the br_readme branch September 21, 2022 05:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update TiKV-BR readme.md doc
3 participants