docs: Add design document about Lock View #24375
Merged
Commits:

- 46badd9 Add design document about Lock View (MyonKeminta)
- d8c7aa6 Add description about permission restrictions (MyonKeminta)
- 3eefc2b Merge branch 'master' into m/lock-view-design-doc (ti-chi-bot)
- 7051a45 Merge branch 'master' into m/lock-view-design-doc (ti-chi-bot)

# TiDB Design Documents

- Author(s): [longfangsong](https://github.com/longfangsong), [MyonKeminta](http://github.com/MyonKeminta)
- Last updated: April 29, 2021
- Discussion PR: N/A
- Tracking Issue: https://github.com/pingcap/tidb/issues/24199

## Table of Contents

* [Introduction](#introduction)
* [Motivation or Background](#motivation-or-background)
* [Detailed Design](#detailed-design)
* [Test Design](#test-design)
    * [Functional Tests](#functional-tests)
    * [Scenario Tests](#scenario-tests)
    * [Compatibility Tests](#compatibility-tests)
    * [Benchmark Tests](#benchmark-tests)
* [Impacts & Risks](#impacts--risks)
* [Investigation & Alternatives](#investigation--alternatives)
* [Unresolved Questions](#unresolved-questions)

## Introduction

This document describes the design of the Lock View feature, which provides tools for analyzing problems with transactions' lock waiting, lock contention, and deadlocks.

## Motivation or Background

Currently, it is very hard to analyze lock contention and deadlocks among transactions. One may need to enable the general log, try to reproduce the problem, and then analyze the log to find the cause, which is difficult and inconvenient. We have also found that this approach is not feasible in some scenarios. A better way to analyze these kinds of problems is strongly needed.

## Detailed Design

Several tables will be provided in `information_schema`. Some tables have both a local version (which fetches data from the current TiDB node) and a global version (which fetches data from the whole cluster); the global version's table name has the `CLUSTER_` prefix.

### Table `(CLUSTER_)TIDB_TRX`

| Field | Type | Comment |
|------------|------------|---------|
| `TRX_ID` | `unsigned bigint` | The transaction ID (a.k.a. start ts) |
| `TRX_STARTED` | `time` | Human-readable start time of the transaction |
| `DIGEST` | `text` | The digest of the currently executing SQL statement |
| `SQLS` | `text` | A list of the digests of all executed SQL statements |
| `STATE` | `enum('Running', 'Lock waiting', 'Committing', 'RollingBack')` | The state of the transaction |
| `WAITING_START_TIME` | `time` | The start time of the current lock waiting (if any) |
| `SCOPE` | `enum('Global', 'Local')` | The scope of the transaction |
| `ISOLATION_LEVEL` | `enum('RR', 'RC')` | |
| `AUTOCOMMIT` | `bool` | |
| `SESSION_ID` | `unsigned bigint` | |
| `USER` | `varchar` | |
| `DB` | `varchar` | |
| `SET_COUNT` | `int` | The number of keys modified by the current transaction |
| `LOCKED_COUNT` | `int` | The number of keys locked by the current transaction |
| `MEM_BUFFER_SIZE` | `int` | Size occupied by the transaction's membuffer |

* Life span of rows:
    * Created on the first writing or locking operation in a transaction
    * Removed after the transaction is done
* Collecting, storing and querying:
    * All of this information can be collected on the TiDB side. Since the number of concurrent transactions won't be too large and the data doesn't need to be persisted, it's OK to implement it as a memory table. For querying across the cluster, just register the table in `infoschema/cluster.go` and pair the global table name with the local one.
    * As the simplest implementation, most of the information can be passed in a way similar to `ProcessInfo`, or even passed directly via the `ProcessInfo` struct.

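The life span rules above can be illustrated with a minimal in-memory registry. This is only a sketch under stated assumptions: `TxnInfo`, `TxnRegistry`, and their methods are hypothetical names invented for illustration, not TiDB's actual types, and only a few of the `TIDB_TRX` columns are modeled.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// TxnInfo is a hypothetical, trimmed-down row of the TIDB_TRX table.
type TxnInfo struct {
	StartTS       uint64   // TRX_ID
	State         string   // STATE, e.g. "Running" or "Lock waiting"
	CurrentDigest string   // DIGEST
	AllDigests    []string // SQLS
}

// TxnRegistry keeps the live rows in memory only; nothing is persisted.
type TxnRegistry struct {
	mu   sync.RWMutex
	txns map[uint64]*TxnInfo
}

func NewTxnRegistry() *TxnRegistry {
	return &TxnRegistry{txns: make(map[uint64]*TxnInfo)}
}

// OnStatement records a writing or locking statement. The row is created
// on the first such statement of the transaction.
func (r *TxnRegistry) OnStatement(startTS uint64, digest string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	info, ok := r.txns[startTS]
	if !ok {
		info = &TxnInfo{StartTS: startTS, State: "Running"}
		r.txns[startTS] = info
	}
	info.CurrentDigest = digest
	info.AllDigests = append(info.AllDigests, digest)
}

// OnFinish removes the row once the transaction is done.
func (r *TxnRegistry) OnFinish(startTS uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.txns, startTS)
}

// Snapshot returns copies of all live rows, ordered by start ts.
func (r *TxnRegistry) Snapshot() []TxnInfo {
	r.mu.RLock()
	defer r.mu.RUnlock()
	rows := make([]TxnInfo, 0, len(r.txns))
	for _, info := range r.txns {
		row := *info
		row.AllDigests = append([]string(nil), info.AllDigests...)
		rows = append(rows, row)
	}
	sort.Slice(rows, func(i, j int) bool { return rows[i].StartTS < rows[j].StartTS })
	return rows
}

func main() {
	reg := NewTxnRegistry()
	reg.OnStatement(100, "digest-a")
	reg.OnStatement(100, "digest-b")
	reg.OnStatement(200, "digest-c")
	reg.OnFinish(200) // transaction 200 is done, so its row disappears
	for _, row := range reg.Snapshot() {
		fmt.Printf("%d %s %s %v\n", row.StartTS, row.State, row.CurrentDigest, row.AllDigests)
	}
}
```

Returning copies from `Snapshot` keeps a reader of the memory table from holding the lock while rows are being formatted.
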
### Table `DATA_LOCK_WAITS`

| Field | Type | Comment |
|------------|------------|---------|
| `HASH` | `bigint` | The hash of the lock in TiKV's LockManager |
| `KEY` | `varchar` | The key that's being waited on |
| `TRX_ID` | `unsigned bigint` | The current transaction that's waiting for the lock |
| `SQL_DIGEST` | `text` | The digest of the SQL statement that's trying to acquire the lock |
| `CURRENT_HOLDING_TRX_ID` | `unsigned bigint` | The transaction that's holding the lock and blocking the current transaction |

* Life span of rows:
    * Created when a lock enters LockManager
    * Removed after a lock leaves LockManager
* Collecting, storing and querying:
    * All of this will be collected in TiKV's LockManager, and a new RPC entry is needed for TiDB to query it. LockManager doesn't store the un-hashed key or `SQL_DIGEST` for now, so we need to modify it.
    * The SQL digest of the transaction that's currently holding the lock may be helpful, but it's hard to implement under the current architecture, so it won't be included in the first version of the feature.

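As a rough illustration of the row life span described above, the following sketch tracks waits as they enter and leave a lock manager. It is written in Go for consistency with the other examples here, although the real LockManager lives in TiKV; `LockWait`, `LockWaitTable`, `Enter`, `Leave`, and `Rows` are hypothetical names, and keying a wait by (hash, waiting transaction) is an assumption made for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// LockWait is a hypothetical in-memory form of one DATA_LOCK_WAITS row.
type LockWait struct {
	Hash         uint64 // HASH
	Key          []byte // KEY (the un-hashed key, which must now be stored)
	TrxID        uint64 // TRX_ID
	SQLDigest    string // SQL_DIGEST
	HoldingTrxID uint64 // CURRENT_HOLDING_TRX_ID
}

// waitID identifies one wait; this choice of identity is an assumption.
type waitID struct {
	hash  uint64
	trxID uint64
}

// LockWaitTable holds the waits that are currently inside the lock manager.
type LockWaitTable struct {
	mu    sync.Mutex
	waits map[waitID]LockWait
}

func NewLockWaitTable() *LockWaitTable {
	return &LockWaitTable{waits: make(map[waitID]LockWait)}
}

// Enter adds a row when a lock wait enters the lock manager.
func (t *LockWaitTable) Enter(w LockWait) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.waits[waitID{hash: w.Hash, trxID: w.TrxID}] = w
}

// Leave removes the row when the wait leaves the lock manager.
func (t *LockWaitTable) Leave(hash, trxID uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.waits, waitID{hash: hash, trxID: trxID})
}

// Rows returns the current waits ordered by waiting transaction, then hash.
func (t *LockWaitTable) Rows() []LockWait {
	t.mu.Lock()
	defer t.mu.Unlock()
	rows := make([]LockWait, 0, len(t.waits))
	for _, w := range t.waits {
		rows = append(rows, w)
	}
	sort.Slice(rows, func(i, j int) bool {
		if rows[i].TrxID != rows[j].TrxID {
			return rows[i].TrxID < rows[j].TrxID
		}
		return rows[i].Hash < rows[j].Hash
	})
	return rows
}

func main() {
	table := NewLockWaitTable()
	table.Enter(LockWait{Hash: 1, Key: []byte("k1"), TrxID: 10, SQLDigest: "digest-a", HoldingTrxID: 20})
	table.Enter(LockWait{Hash: 2, Key: []byte("k2"), TrxID: 20, SQLDigest: "digest-b", HoldingTrxID: 30})
	table.Leave(1, 10) // the first wait ends, so its row is removed
	for _, w := range table.Rows() {
		fmt.Printf("trx %d waits for trx %d on key %q\n", w.TrxID, w.HoldingTrxID, w.Key)
	}
}
```
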
### Table `(CLUSTER_)DEAD_LOCK`

| Field | Type | Comment |
|------------|------------|---------|
| `DEADLOCK_ID` | `int` | Multiple rows are needed to represent a single deadlock event. This field is used to distinguish different events. |
| `OCCUR_TIME` | `time` | The physical time when the deadlock occurs |
| `TRY_LOCK_TRX_ID` | `unsigned bigint` | The transaction ID (start ts) of the transaction that's trying to acquire the lock |
| `CURRENT_SQL_DIGEST` | `text` | The digest of the SQL statement that's being blocked |
| `KEY` | `varchar` | The key that the transaction is trying to lock, but which is locked by another transaction in the deadlock event |
| `SQLS` | `text` | A list of the digests of the SQL statements that the transaction has executed |
| `TRX_HOLDING_LOCK` | `unsigned bigint` | The transaction that's currently holding the lock. There will be another record in the table with the same `DEADLOCK_ID` for that transaction. |

* Life span of rows:
    * Created after TiDB receives a deadlock error
    * FIFO: the oldest records are removed once the buffer is full
* Collecting, storing and querying:
    * All of this information can be collected on the TiDB side. TiDB just needs to add the information to the table when it receives a deadlock error from TiKV. The information about other transactions involved in the deadlock cycle needs to be fetched from elsewhere (the `TIDB_TRX` table) when handling the deadlock error.
    * Currently the deadlock error doesn't carry much information (it doesn't include the SQL statements or the keys), which needs to be improved.

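The FIFO retention rule above could be implemented with a fixed-capacity ring buffer. The sketch below is one possible shape, not the actual implementation: `DeadlockRecord`, `DeadlockHistory`, and the idea of assigning `DEADLOCK_ID` from a local counter are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// DeadlockRecord is a hypothetical in-memory form of one deadlock event.
type DeadlockRecord struct {
	ID      uint64 // would back DEADLOCK_ID
	Summary string // stand-in for the per-transaction details
}

// DeadlockHistory is a fixed-capacity FIFO: once full, adding a record
// overwrites the oldest one.
type DeadlockHistory struct {
	mu      sync.Mutex
	records []DeadlockRecord // ring storage
	head    int              // index of the oldest record
	size    int              // number of valid records
	nextID  uint64
}

func NewDeadlockHistory(capacity int) *DeadlockHistory {
	if capacity <= 0 {
		capacity = 1
	}
	return &DeadlockHistory{records: make([]DeadlockRecord, capacity), nextID: 1}
}

// Push stores a new event and returns the ID assigned to it.
func (h *DeadlockHistory) Push(summary string) uint64 {
	h.mu.Lock()
	defer h.mu.Unlock()
	rec := DeadlockRecord{ID: h.nextID, Summary: summary}
	h.nextID++
	if h.size < len(h.records) {
		h.records[(h.head+h.size)%len(h.records)] = rec
		h.size++
	} else {
		// Buffer is full: replace the oldest record and advance head.
		h.records[h.head] = rec
		h.head = (h.head + 1) % len(h.records)
	}
	return rec.ID
}

// All returns the retained events from oldest to newest.
func (h *DeadlockHistory) All() []DeadlockRecord {
	h.mu.Lock()
	defer h.mu.Unlock()
	out := make([]DeadlockRecord, 0, h.size)
	for i := 0; i < h.size; i++ {
		out = append(out, h.records[(h.head+i)%len(h.records)])
	}
	return out
}

func main() {
	history := NewDeadlockHistory(2)
	history.Push("first deadlock")
	history.Push("second deadlock")
	history.Push("third deadlock") // evicts "first deadlock"
	for _, rec := range history.All() {
		fmt.Println(rec.ID, rec.Summary)
	}
}
```
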
### Protocol

To pass the necessary information between TiDB and TiKV and make this feature possible, some additional information needs to be carried in the protocol defined in kvproto.

deadlockpb:

```diff
  message WaitForEntry {
    ...
+   bytes key = ...;
+   bytes resource_group_tag = ...;
  }

  message DeadlockResponse {
    ...
+   repeated WaitForEntry wait_chain = ...;
  }
```

kvrpcpb:

```diff
  message Context {
    ...
+   bytes resource_group_tag = ...;
  }

  message Deadlock {
    ...
+   repeated deadlock.WaitForEntry wait_chain = ...;
  }

+ message GetLockWaitInfoRequest {
+   Context context = 1;
+ }
+
+ message GetLockWaitInfoResponse {
+   errorpb.Error region_error = 1;
+   string error = 2;
+   repeated deadlock.WaitForEntry entries = 3;
+ }
```

A field `resource_group_tag` will be added to `Context`. The SQL digest (and maybe more information) will be serialized and carried in this field. This field is expected to be reused by another feature named *Top SQL*, which wants to attach the SQL digest and plan to most transactional requests.

A new KV RPC `GetLockWaitInfo` will be added to allow getting the lock waiting status from TiKV. This is a store-level (instead of region-level) request, like `UnsafeDestroyRange` and the Green GC related RPCs. The request can carry filtering options to filter out information the user doesn't care about. However, the current memory table implementation only allows TiDB to scan the whole table and then filter it, which may need further optimization in the future.

The key being locked and the `resource_group_tag` from the `Context` of the pessimistic lock request are added to the deadlock detection request, and the wait chain is added to the deadlock detection response.

The wait chain will be added to the `Deadlock` error returned by the `PessimisticLock` request, so that when a deadlock happens, the full wait chain information can be passed to TiDB.

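To show how a wait chain might map onto the `(CLUSTER_)DEAD_LOCK` rows, here is a minimal sketch that expands one chain into one row per waiting transaction, all sharing a `DEADLOCK_ID`. The Go struct is a hand-written stand-in for the generated protobuf type: `Key` and `ResourceGroupTag` correspond to the fields added above, while `Txn` and `WaitForTxn` are assumed names for fields that the diff elides with `...`. `DeadlockRow` and `RowsFromWaitChain` are hypothetical, and hex-encoding the key is just one display choice.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// WaitForEntry is a hand-written stand-in for deadlock.WaitForEntry.
type WaitForEntry struct {
	Txn              uint64 // assumed: the waiting transaction
	WaitForTxn       uint64 // assumed: the transaction holding the lock
	Key              []byte // added by this design
	ResourceGroupTag []byte // added by this design; carries the SQL digest
}

// DeadlockRow is a hypothetical subset of one DEAD_LOCK row.
type DeadlockRow struct {
	DeadlockID     uint64 // DEADLOCK_ID
	TryLockTrxID   uint64 // TRY_LOCK_TRX_ID
	Key            string // KEY, hex-encoded for display
	TrxHoldingLock uint64 // TRX_HOLDING_LOCK
}

// RowsFromWaitChain produces one row per wait chain entry. All rows share
// the same deadlock ID so that they can be grouped into a single event.
func RowsFromWaitChain(deadlockID uint64, chain []WaitForEntry) []DeadlockRow {
	rows := make([]DeadlockRow, 0, len(chain))
	for _, entry := range chain {
		rows = append(rows, DeadlockRow{
			DeadlockID:     deadlockID,
			TryLockTrxID:   entry.Txn,
			Key:            hex.EncodeToString(entry.Key),
			TrxHoldingLock: entry.WaitForTxn,
		})
	}
	return rows
}

func main() {
	// Two transactions waiting for each other form the simplest deadlock.
	chain := []WaitForEntry{
		{Txn: 10, WaitForTxn: 20, Key: []byte("k1")},
		{Txn: 20, WaitForTxn: 10, Key: []byte("k2")},
	}
	for _, row := range RowsFromWaitChain(1, chain) {
		fmt.Printf("deadlock %d: trx %d waits for trx %d on key %s\n",
			row.DeadlockID, row.TryLockTrxID, row.TrxHoldingLock, row.Key)
	}
}
```
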
## Compatibility

This feature is not expected to be incompatible with other features. During an upgrade, when TiDB nodes of different versions exist at the same time, the `CLUSTER_`-prefixed tables may encounter errors. But since this feature is typically used manually by users, this shouldn't be a severe problem, so we don't need to care much about it.

## Test Design

### Functional Tests

* Querying the tables defined above gives correct results.

### Scenario Tests

* In a scenario where there's lock contention, this feature helps locate the problem.
* In a scenario where a SQL statement is blocked by another transaction, this feature helps locate the problem.
* In a scenario where a deadlock happens, this feature helps find out how the deadlock was formed.

### Compatibility Tests

- N/A

### Benchmark Tests

* The feature shouldn't cause any obvious performance regression (< 2%) in normal scenarios.
* Accessing these tables shouldn't increase the latency of concurrent normal queries.

## Impacts & Risks

* To be investigated

## Investigation & Alternatives

* MySQL provides the `data_locks` and `data_lock_waits` tables.
* Oracle provides the `v$lock` view.
* CRDB provides `crdb_internal.node_transaction_statistics`, which shows rich information about transactions.

## Unresolved Questions

* Since lock waiting on TiKV may time out and retry, it's possible that a single query to the `DATA_LOCK_WAITS` table doesn't show all (logical) lock waits.
* Information about internal transactions may not be collected in the first version of the implementation.
* Since TiDB needs to query transaction information after it receives the deadlock error, the transactions' status may change during that time. As a result, the information in the `(CLUSTER_)DEAD_LOCK` table can't be guaranteed to be accurate and complete.
* Statistics about transaction conflicts are still not sufficient.
* Historical information of `TIDB_TRX` and `DATA_LOCK_WAITS` is not kept, which may still make it difficult to investigate some kinds of problems.
* The SQL digest of the transaction that's holding the lock and blocking the current transaction is hard to retrieve and is not included in the current design.

Review comment: Maybe `ALL_SQLS`, `ALL_DIGESTS`, or `HISTORY_DIGESTS` looks better.