lightning: add store write limiter #35193
Conversation
Code Coverage Details: https://codecov.io/github/pingcap/tidb/commit/6797a6d27e5894fee0cf4334405f9a090cb307df
flushKVs := func() error {
	for i := range clients {
		if local.writeLimiter != nil {
			if err := local.writeLimiter.WaitN(ctx, storeIDs[i], int(size)); err != nil {
If I understand correctly, the rate limiter controls the write limit for each TiKV storage node. Since the burst is just 20% above the limit, if the send size is larger than the limit/burst size, I'm afraid the actual write rate will be smaller than the configured value?
For example, suppose the rate limit is 1000 and the burst is 1200. If the send size is, say, 3000, and there are 3 nodes: for the first store's rate limiter, it takes around 3 seconds to prepare the 3000 bytes of data, so the rate is 1000 B/s, which is OK. However, when iterating to the second store, it takes another 3 seconds to prepare that store's data, and by that time the first store's data has not actually been sent; the same goes for the third store. When preparation finishes, 3+3+3 = 9 seconds have elapsed, but each node has only been sent 3000 bytes. So the actual write rate per node is the send size divided by the total preparation time, i.e. 3000 / 9.
Maybe these rate limiters could wait in parallel?
Your analysis is correct. There is a flushLimit, which is not greater than the limit, so the WaitN size is usually less than the burst.
In my test, the size is about 3 MB. In practice there is no need to set such a small limit, because to achieve that speed we could use the tidb backend instead.
> When preparation finishes, 3+3+3 = 9 seconds have elapsed, but each node has only been sent 3000 bytes. So the actual write rate per node is the send size divided by the total preparation time, i.e. 3000 / 9.

I thought about it again, and I think this is as expected. KVs are sent to the TiKV nodes one by one for each region: the total send size is 9000 bytes, so we need 9 seconds to send the data. It's unnecessary to wait on the rate limiters in parallel unless we actually send data to TiKV in parallel.
rest lgtm
	burst = limit + limit/5
} else {
	// If overflowed, set burst to math.MaxInt.
	burst = math.MaxInt
maybe adjust them during config.adjust, and add a log
This is an internal implementation detail; burst is not configurable.
// the speed of write more smooth.
flushLimit := int64(math.MaxInt64)
if local.writeLimiter != nil {
	flushLimit = int64(local.writeLimiter.limit / 10)
why use 1/10?
We simply set a value smaller than the limit, so the interval between limiter calls will not exceed 100 ms.
@@ -532,6 +532,7 @@ type TikvImporter struct {
	EngineMemCacheSize      ByteSize `toml:"engine-mem-cache-size" json:"engine-mem-cache-size"`
	LocalWriterMemCacheSize ByteSize `toml:"local-writer-mem-cache-size" json:"local-writer-mem-cache-size"`
	StoreWriteBWLimit       ByteSize `toml:"store-write-bwlimit" json:"store-write-bwlimit"`
What's bw? Batch write? Maybe max-store-write-rate?
@sunzhaoyang ptal too
bw means bandwidth. It is similar to rsync --bwlimit.
/run-integration-br-test
rest LGTM
LGTM
@dsdashun: Thanks for your review. The bot only counts approvals from reviewers and higher roles in the list, but you're still welcome to leave your comments. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
/run-integration-br-test
LGTM
/merge
This pull request has been accepted and is ready to merge. Commit hash: 8ec642b
/run-unit-test
/run-mysql-test
TiDB MergeCI notify: ✅ Well Done! New fixed [1] after this PR merged.
* upstream/master:
  sessionctx: support encoding and decoding session variables (pingcap#35531)
  planner: add batch_point_get access object (pingcap#35230)
  sessionctx: set skipInit false for TiDBOptProjectionPushDown and TiDBOptAggPushDown (pingcap#35491)
  *: add support for disabling noop variables (pingcap#35496)
  lightning: add store write limiter (pingcap#35193)
  expression: avoid padding 0 when implicitly cast to binary (pingcap#35053)
  types: fix creating partition tables fail in ANSI_QUOTES mode (pingcap#35379)
  executor: add the missed runtime stats when the index merge partial task returns 0 row (pingcap#35297)
  statistics: batch insert topn and bucket when saving table stats (pingcap#35326)
  parser: Add support for INTERVAL expr unit + expr (pingcap#30253) (pingcap#35390)
  config: add missing nodelay example (pingcap#35255)
  *: Introduce `OptimisticTxnContextProvider` for optimistic txn (pingcap#35131)
  statistics: fix panic when using wrong globalStats.Indices key (pingcap#35424)
  *: fix store token is up to the limit in test (pingcap#35374)
  *: enable more flaky and update bazel config (pingcap#35500)
  ddl: expose getPrimaryKey() as a public method of model.TableInfo (pingcap#35512)
  expression, util: add `KeyWithoutTrimRightSpace` for collator (pingcap#35475)
  infoschema: try on each PD member until one succeeds or all fail (pingcap#35285)
What problem does this PR solve?
Issue Number: close #35192
Problem Summary:
We want to use lightning to import data to an online cluster. To reduce the impact on online services, lightning needs to limit the write rate to the cluster.
What is changed and how it works?
Add a new config tikv-importer.store-write-bwlimit and limit the bytes written to TiKV nodes.

Check List
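A minimal config fragment showing where the new option lives. The value here is purely illustrative; it follows the human-readable size format that lightning's ByteSize fields accept:

```toml
[tikv-importer]
# Illustrative value: cap the write bandwidth to each TiKV store.
store-write-bwlimit = "128MiB"
```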
Tests
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.