Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance metrics for KVStore write #8472

Merged
merged 32 commits into from
Dec 21, 2023

Conversation

CalvinNeo
Copy link
Member

@CalvinNeo CalvinNeo commented Dec 6, 2023

What problem does this PR solve?

Issue Number: close #8471

Problem Summary:

  1. Add tiflash_raft_throughput_bytes, type_write for writes from raftstore, type_write_committed for writes to delta merge
    (Add a timeseries panel in Raft)
  2. Add tiflash_raft_raft_frequent_events_count.type_write_commit for every write event to to delta merge
  3. Add tiflash_raft_write_flow_bytes, type_ingest_uncommitted for uncommitted writes bytes after ingestsst, type_write_committed for committed write bytes, type_net_write for net write bytes(region's size after - region's size before)
    (Add two histogram panels in Raft)

The KVStore memory tracker test

tiup bench ycsb load tidb -p tidb.instances="127.0.0.1:4000" -p recordcount=10000
tiup bench ycsb run tidb -p tidb.instances="10.2.12.81:5711" -p operationcount=10000

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Signed-off-by: CalvinNeo <[email protected]>
@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 6, 2023
@CalvinNeo

This comment was marked as outdated.

@CalvinNeo CalvinNeo changed the title Enhance metrics for KVStore write WIP Enhance metrics for KVStore write Dec 7, 2023
@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 7, 2023
@CalvinNeo CalvinNeo requested a review from flowbehappy December 7, 2023 04:01
@JinheLin
Copy link
Contributor

JinheLin commented Dec 7, 2023

I think we need to add a metric about how many region data have not been committed? Since these data are all resident in memory.

@JaySon-Huang
Copy link
Contributor

Can you commit the changes to tiflash_summary.json ? Or provide a preview URL of the changed grafana panels

@CalvinNeo
Copy link
Member Author

CalvinNeo commented Dec 7, 2023

I think we need to add a metric about how many region data have not been committed? Since these data are all resident in memory.

There is already metric about this which is called tiflash_raft_region_flush_bytes.type_unflushed and is updated every unaccepted CompactLog to tell how many data are in the Region without flush.

However, there could be some other metrics that could improve this metric, which is more proper and useful?

  1. A metric like type_unflushed but updated more frequently, maybe on every write/commit(tx), so that when a big tx is coming, we can read from this metric that there are many uncommitted key-values stuck in this Region. However, this metric seems weird to me, because we can't tell the difference between a single frequently reported region and many infrequently reported regions. They will all paint half of the histogram red above a certain horizontal line, however, this is misleading. We need views of memories take from all region at a certain timestamp to build a histogram, or we could have a sum-up line chart.
  2. A metric that provides a sum-up line char, which means how many memories(roughly, maybe only count RegionData) do all regions in KVStore take. I think this is important, as Avoid exit for ProxyFFI when alloc failure due to memory exceeds limit #8464 has also pointed out that we need a mechanism to monitor the memory taken of all storage components. If we do so, I think we need a design before.

As a conclusion, I think we should have a metric about how many memories KVStore takes, but I think it could be handled in PR for #8464. In this PR, we are trying to provide a more convenient way to diagnose the problem of big transaction, especially "invisible" uncommitted big transactions that happens between CompactLog and CompactLog. I introduced "Net Write In" for this, so the diagnose could be:

  1. Check memory metrics, and find OOM
  2. [Strong evidence] Check "Raft Events QPS" and see there is a low rate of write_commit
  3. Check "Raft Log Gap" and see the panel is spread with data points, which means there has been many logs after the last CompactLog(heavy write)
  4. [Strong evidence] Check "Raft Entry Batch Size Heatmap" and see large size of every entry batch
  5. [Strong evidence] Check "Raft handled bytes" and see large gap between "write_committed" and "write". Note that write - write_committed is not the size of uncommitted data we want, since an Rollback put will be created when transaction rollbacks. It will be commit with no effect.

@CalvinNeo CalvinNeo changed the title WIP Enhance metrics for KVStore write Enhance metrics for KVStore write Dec 7, 2023
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 7, 2023
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 7, 2023
@JinheLin
Copy link
Contributor

tiflash_raft_region_flush_bytes.type_unflushed seems cannot tell the memory usage of region data. Because the flushed uncommited data can still use a lot of memory.

@CalvinNeo
Copy link
Member Author

CalvinNeo commented Dec 11, 2023

tiflash_raft_region_flush_bytes.type_unflushed seems cannot tell the memory usage of region data. Because the flushed uncommited data can still use a lot of memory.

I think the name is ambiguous and misleading here. What I actually mean is how much memory does this region take when a "unflushing" event takes place, whereas the type_flushed is for how much memory this region takes when a "flushing" event takes place.

However, I think I can't come up with a name that is short enough and can convey what I mean... Maybe it could be type_unflushed_compact_log, but there are too many things related to compact logs which is a concept from TiKV rather than TiFlash and can also make one confusing IMO

@JinheLin
Copy link
Contributor

I know what you means.

The problems of tiflash_raft_region_flush_bytes are:

  1. Its reporting frequency is triggered by the compact log command, not particularly timely.
  2. It it about memory usage of each region, not all.

I think we need a MemoryTracker object to track the memory usage of all regions' data.

@CalvinNeo
Copy link
Member Author

CalvinNeo commented Dec 12, 2023

I know what you means.

The problems of tiflash_raft_region_flush_bytes are:

  1. Its reporting frequency is triggered by the compact log command, not particularly timely.
  2. It it about memory usage of each region, not all.

I think we need a MemoryTracker object to track the memory usage of all regions' data.

The problem is that it could not give a "global view" of memory consumption of all regions, so we can only record the metrics based on some events like CompactLog.

In TiKV, there is a module named MemoryTrace in https://github.com/tikv/tikv/blob/master/components/tikv_alloc/src/trace.rs. I think may be we can have something like this to observe all memory consuming objects.

I suggest we have a design on this.

@JinheLin
Copy link
Contributor

TiFlash has class MemoryTracker for this purpose.

@CalvinNeo
Copy link
Member Author

MemoryTracker

Good idea, I will look into that

Signed-off-by: CalvinNeo <[email protected]>
@JaySon-Huang
Copy link
Contributor

JaySon-Huang commented Dec 20, 2023

Also push the grafana changes to this PR?

@CalvinNeo
Copy link
Member Author

Also push the grafana changes to this PR?

This pr is held until the grafana is pushed.

If other things are OK, could you firstly lgtm this PR? Lots of work taken to build such a metric with about 4+ new panels.

@CalvinNeo
Copy link
Member Author

/run-all-tests

Signed-off-by: CalvinNeo <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 20, 2023
Signed-off-by: CalvinNeo <[email protected]>
Signed-off-by: CalvinNeo <[email protected]>
Signed-off-by: CalvinNeo <[email protected]>
@CalvinNeo
Copy link
Member Author

image image image

@CalvinNeo
Copy link
Member Author

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 20, 2023
@CalvinNeo
Copy link
Member Author

/run-all-tests

Copy link
Contributor

@JaySon-Huang JaySon-Huang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Dec 21, 2023
Copy link
Contributor

ti-chi-bot bot commented Dec 21, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JaySon-Huang, JinheLin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [JaySon-Huang,JinheLin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

ti-chi-bot bot commented Dec 21, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-12-19 08:56:13.203748166 +0000 UTC m=+951264.240975077: ☑️ agreed by JinheLin.
  • 2023-12-21 03:35:02.185310355 +0000 UTC m=+1104793.222537304: ☑️ agreed by JaySon-Huang.

Copy link
Contributor

ti-chi-bot bot commented Dec 21, 2023

@CalvinNeo: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

trigger some heavy tests which will not run always when PR updated.

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot bot merged commit cc76e87 into pingcap:master Dec 21, 2023
6 checks passed
CalvinNeo added a commit to CalvinNeo/tiflash that referenced this pull request Dec 21, 2023
yibin87 pushed a commit to yibin87/tiflash that referenced this pull request Dec 22, 2023
JaySon-Huang pushed a commit to JaySon-Huang/tiflash that referenced this pull request Dec 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhance metrics for write from KVStore
3 participants