glossary: add some terms to glossary #14298
base: master
Signed-off-by: Ran <[email protected]>
It seems that not all terms in the Chinese version pingcap/docs-cn#11227 are covered in this PR. Would you add the other terms to this PR later?
@lilin90 sure
/rebase
glossary.md
Outdated

[TiCDC](/ticdc/ticdc-overview.md) is a tool for incrementally replicating data in TiDB. It pulls the data change logs from the upstream TiKV and parses them into ordered row-level change data, which it then outputs to the downstream. For more information about the concepts and terms of TiCDC, see [TiCDC Glossary](/ticdc/ticdc-glossary.md).

### TiDB Data Migration (DM)
I think people might be looking for DM under "D" instead of under "T".
Suggested change:
- ### TiDB Data Migration (DM)
+ ### Data Migration (DM)
Co-authored-by: Daniël van Eeden <[email protected]>
glossary.md
Outdated
@@ -42,18 +42,22 @@ Baseline Capturing captures queries that meet capturing conditions and create bi

### Batch Create Table

- Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabled by default. When restoring data with a large number of tables (nearly 50000) using BR (Backup & Restore), the feature can greatly speed up the restore process by creating tables in batches. For details, see [Batch Create Table](/br/br-batch-create-table.md).
+ Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabled by default. When restoring data with a large number of tables (nearly 50000) using BR (Backup & Restore), the feature can greatly speed up the restore process by creating tables in batches. For more information, see [Batch Create Table](/br/br-batch-create-table.md).
I find that introducing a feature by its version number doesn't add much value. We should be more upfront about defining the feature itself. Version numbers are also used inconsistently across the glossary, so I would suggest removing this one.
My suggestion for this entry would be something like:
The Batch Create Table feature greatly speeds up the creation of multiple tables at a time by creating tables in batches. For example, when restoring thousands of tables using the BR (Backup & Restore) tool, this can help shorten the overall recovery time. For more information, see Batch Create Table.
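To make the batching idea in this wording concrete, here is a minimal Python sketch. It is not how BR implements the feature; the `chunked` helper, the table names, and the batch size of 3 are all hypothetical and only illustrate how grouping table definitions reduces the number of create requests.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical table names standing in for the tables in a backup.
tables = [f"t{i}" for i in range(7)]
batches = list(chunked(tables, 3))

# One create request per batch instead of one per table.
print(f"{len(tables)} tables -> {len(batches)} batches")
```

With many thousands of tables, the same grouping is what would turn a per-table cost into a per-batch cost, which is the benefit the glossary entry should state up front.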
glossary.md
Outdated

## C

### Cached Table

With the cached table feature, TiDB loads the data of an entire table into the memory of the TiDB server, and TiDB directly gets the table data from the memory without accessing TiKV, which improves the read performance.

### Cluster

A cluster is a group of nodes that work together to provide services. It typically consists of different types of nodes. For example, a TiDB cluster usually consists of TiDB nodes, TiKV nodes, and PD nodes, and a DM cluster usually consists of DM Master nodes and DM Worker nodes.
In the generic sense of the term cluster, the phrase "It typically consists of different types of nodes" is confusing. TiDB actually consists of multiple layers of clusters. The TiDB nodes, TiKV nodes, and PD nodes each belong to their own cluster; see the first diagram in https://docs.pingcap.com/tidb/stable/tidb-architecture.
I think it would be worthwhile expanding a bit to talk about the scalability and availability benefits of clustering instead.
My suggestion for this entry would be:
A cluster is a group of nodes that work together to provide services. For example, a cluster of TiKV nodes provides the storage services for TiDB. Using clusters of nodes in a distributed system can deliver higher availability and greater scalability than a single node can provide. As a distributed system, TiDB uses clusters of nodes to deliver highly available and scalable services: a cluster of TiDB servers provides a scalable SQL layer to clients; a cluster of PD nodes provides a resilient metadata layer for TiDB; and a cluster of TiKV servers, using the Raft consensus protocol, provides a highly available, scalable, and resilient storage service for TiDB. See the TiDB Architecture doc for more information.
I’m not sure if using ‘a cluster of TiKV’ and ‘a cluster of PD nodes’ might confuse users, as we typically refer to a TiDB cluster as consisting of TiDB nodes, TiKV nodes, and PD nodes. To avoid confusion, we could simply mention them as TiDB nodes, TiKV nodes, and PD nodes here.
glossary.md
Outdated

### Continuous Profiling

- Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For details, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md).
+ Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For more information, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md).
I would remove the "Introduced in TiDB 5.3.0," at the start of the sentence. That version is now well out of its maintenance support (to 2023-11-30) and extended support (to 2024-11-30) periods, as per https://www.pingcap.com/tidb-release-support-policy/
There are also too many "ands" in the second sentence. I would suggest rewriting it as:
With the support of Continuous Profiling, TiDB provides fine-grained insights into performance problems, helping operations teams locate their root cause using a flame graph.
glossary.md
Outdated

PD Control (pd-ctl) is a command-line tool to interface with the placement driver (PD) of the cluster. You can use it to obtain cluster status information and modify the cluster. For more information, see [PD Control User Guide](/pd-control.md).

### pending/down
Should these be capitalized ("Pending/Down"), as in the previous version? Is there a reason they were moved to lower case?
glossary.md
Outdated

"Pending" and "down" are two special states of a peer. Pending indicates that the Raft log of followers or learners is vastly different from that of leader. Followers in pending cannot be elected as leader. "Down" refers to a state that a peer ceases to respond to leader for a long time, which usually means the corresponding node is down or isolated from the network.

### Placement Driver (PD)

Placement Driver (PD) is a core component in the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) responsible for storing metadata, assigning [Timestamp Oracle (TSO)](/tso.md) for transaction timestamps, orchestrating data placement on TiKV, and running [TiDB Dashboard](/dashboard/dashboard-overview.md). For more information, see [TiDB Scheduling](/tidb-scheduling.md).

### Placement Rules

Placement rules are used to configure the placement of data in a TiKV cluster through the SQL interface. With this feature, you can specify the deployment of tables and partitions to different regions, data centers, cabinets, and hosts. Use cases include optimizing data availability strategies at low cost, ensuring that local data replicas are available for local stale reads, and complying with local data compliance requirements.
Placement rules are not configured only through the SQL interface (though that is preferred); they can also be configured using pd-ctl, see https://docs.pingcap.com/tidb/stable/configure-placement-rules#set-rules-using-pd-ctl. The link guiding users to the SQL interface is probably sufficient.
My suggestion for this paragraph is:
Placement rules are used to configure the placement of data in a TiKV cluster. With this feature, you can specify the deployment of tables and partitions to different regions, data centers, cabinets, and hosts. Use cases include optimizing data availability strategies at low cost, ensuring that local data replicas are available for local stale reads, and complying with local data compliance requirements.
glossary.md
Outdated
@@ -286,13 +351,37 @@ A store refers to the storage node in the TiKV cluster (an instance of `tikv-ser

## T

### Temporary table

Temporary tables solve the issue of temporarily storing the intermediate results of an application, which frees you from frequently creating and dropping tables. You can store the intermediate calculation data in temporary tables. When the intermediate data is no longer needed, TiDB automatically cleans up and recycles the temporary tables. This avoids user applications being too complicated, reduces table management overhead, and improves performance.
Should that start with: "Temporary tables solve the issue of temporarily storing the intermediate results of an application's calculations,"?
glossary.md
Outdated

### TiCDC

[TiCDC](/ticdc/ticdc-overview.md) is a tool for incrementally replicating data in TiDB. It pulls the data change logs from the upstream TiKV and parses them into ordered row-level change data, and then outputs the data to the downstream. For more information about the concepts and terms of TiCDC, see [TiCDC Glossary](/ticdc/ticdc-glossary.md).
I would suggest expanding this a bit, something like:
TiCDC is a tool for incrementally replicating data from TiDB to other downstream targets. These downstream targets may include other TiDB instances, MySQL-compatible databases, object storage locations, and streaming processors (like Kafka and Pulsar). TiCDC pulls the data change logs from the upstream TiKV, parses them into ordered row-level change data, and then outputs the data to the downstream. For more information about the concepts and terms of TiCDC, see TiCDC Glossary.
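As a rough illustration of what "ordered row-level change data" could mean for readers, here is a small Python sketch. It is not TiCDC's implementation: the `RowChange` type, its fields, and the two input streams are hypothetical, and the only assumption is that each input stream is already sorted by a commit timestamp.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class RowChange(NamedTuple):
    commit_ts: int  # hypothetical commit timestamp of the change
    table: str
    op: str         # "insert", "update", or "delete"


def merge_ordered(*streams: Iterable[RowChange]) -> Iterator[RowChange]:
    """Merge streams that are each sorted by commit_ts into one sorted stream."""
    return heapq.merge(*streams, key=lambda change: change.commit_ts)


# Two hypothetical change logs, each already in commit order.
stream_a = [RowChange(1, "orders", "insert"), RowChange(4, "orders", "update")]
stream_b = [RowChange(2, "users", "insert"), RowChange(3, "users", "delete")]

ordered = list(merge_ordered(stream_a, stream_b))
print([change.commit_ts for change in ordered])
```

The point for the glossary entry is only that the downstream receives changes in a single consistent order, regardless of which upstream log they came from.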
What is changed, added or deleted? (Required)
glossary: add some terms to glossary
Which TiDB version(s) do your changes apply to? (Required)
Tips for choosing the affected version(s):
By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.
For details, see tips for choosing the affected versions (in Chinese).
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?