-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add network-bytes-in and network-bytes-out metric support under CLUSTER SLOT-STATS command (#20) #720
Conversation
…alkey-io#20). The metric tracks network ingress bytes under per-slot context, by reverse calculation of c->argv_len_sum and c->argc, stored under a newly introduced field c->net_input_bytes_curr_cmd. Signed-off-by: Kyle Kim <[email protected]>
94ec40a
to
018d698
Compare
Technical design walk-throughThere exist two system requirements; 1) Collecting network ingress bytes, and 2) Grouping / aggregating the collected network ingress bytes under per-slot granularity. 1st iteration, using
|
tests/unit/cluster/slot-stats.tcl
Outdated
# ----------------------------------------------------------------------------- | ||
# Test cases for CLUSTER SLOT-STATS network-bytes-in. | ||
# ----------------------------------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Action items for integration tests
- Add additional test cases, namely;
- network-bytes-in for blocking commands.
- network-bytes-in for transactions (
MULTI/EXEC
). - network-bytes-in for non-slot specific commands (ex:
INFO
). - network-bytes-in for pipeline using valkey-cli (
valkey-cli --pipe
) - network-bytes-in for sharded pub/sub.
- Fix existing test cases to support the newly introduced
network-bytes-in
parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this done, or is this a followup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is done.
Fyi, "network-bytes-in for pipeline using valkey-cli" is not explicitly tested, as it follows the same code-path as multibulk processing, for which there already exists a test case. If an explicit test case for valkey-cli --pipe
is deemed necessary, I will follow-up in the next revision.
src/server.c
Outdated
@@ -3869,6 +3871,9 @@ int processCommand(client *c) { | |||
} | |||
} | |||
|
|||
/* Now that c->slot has been parsed, accumulate the buffered network bytes-in. */ | |||
clusterSlotStatsAddNetworkBytesIn(c); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can add condition server.cluster_enabled here, if it is easier for us to understand the logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely could, although this would incur repetitive checks, as the same condition is checked once more down its callstack;
clusterSlotStatsAddNetworkBytesIn();
|
| calls
|
canAddNetworkBytes(); // --> checks for server.cluster_enabled
Personally, I prefer to keep the top-level call as simple as possible, and embed all conditional checks inside the function body. This way, the same function can be called in multiple locations, without having to wrap them with the same conditional checks everytime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me in general. Great comments.
Please include the API changes in the PR description. The command response, how it looks, by example or description.
src/cluster_slot_stats.c
Outdated
} slotStat; | ||
|
||
/* Struct used for storing slot statistics, for all slots owned by the current shard. */ | ||
struct slotStat cluster_slot_stats[CLUSTER_SLOTS]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This global is 128KB even in standalone mode. Can we store it inside the cluster struct so it only consumes this memory in cluster mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. The struct has been moved under clusterState
.
struct clusterState {
...
slotStat slot_stats[CLUSTER_SLOTS];
}
src/cluster_slot_stats.h
Outdated
void clusterSlotStatReset(int slot); | ||
void clusterSlotStatsReset(void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These names are too similar. Can we call the second one clusterSlotStatResetAll
?
src/commands/cluster-slot-stats.json
Outdated
@@ -35,6 +35,9 @@ | |||
"properties": { | |||
"key-count": { | |||
"type": "integer" | |||
}, | |||
"memory-bytes-in": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
network, not memory
Haven't circled all the way back around to this, but @kyle-yh-kim can you also document the performance characteristics so we can make the decision if it should be enabled by default or not? |
Performance benchmarking summaryWith just
Appendix: Test setupServer setup
Traffic generator setup
|
I think you post the benchmark result in the wrong thread, maybe here #712 |
123182e
to
b0f1009
Compare
- Added config guard, "cluster-slot-stats-enabled", with default value true. - Added more test-cases. - Added network-bytes-in accumulation for replicated commands. - Added network-bytes-in accumulation for sharded pub/sub. - Added network-bytes-in accumulation for MULTI's RESP. - Moved slot_stats array under clusterState. - Fixed network-bytes-in accumulation for blocking commands. - Fixed c->argc accumulation by deferring its calculation. Signed-off-by: Kyle Kim <[email protected]>
b0f1009
to
07174ee
Compare
Currently, the PR aggregates
If this feels too confusing, there are two alternatives;
Personally, I would be opposed to dropping the replication stream and sharded pubsub requirements altogether, as the inclusion of all three components better represents the actual ingress bytes. Curious to hear the core team’s thoughts on this. |
Did you do this? Based on the performance of < ~1% CPU, it seems like maybe we should just leave this on without a config. We already have some optimizations on the cluster path for valkey 8, so I would expect it to be a net performance benefit to move to Valkey 8 and you also get the free observability metrics. |
Yes, benchmarking result for both To summarize, we can expect a TPS reduction of ~0.7%. As per your recommendation, good to enable it by default without a configuration. |
src/cluster_slot_stats.h
Outdated
void clusterSlotStatReset(int slot); | ||
void clusterSlotStatResetAll(void); | ||
void clusterSlotStatsAddNetworkBytesIn(client *c); | ||
void clusterSlotStatsAddNetworkBytesInForShardedPubSub(robj *channel, robj *message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this API also account network information based on client args ?
src/server.c
Outdated
|
||
/* Now that c->slot has been parsed, and command has been executed, | ||
* accumulate the buffered network bytes-in. */ | ||
clusterSlotStatsAddNetworkBytesIn(c); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can maybe move this under afterCommand
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we would want to accumulate it on unblocking cases, so afterCommand
makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both clusterSlotStatsAddNetworkBytesInForUserClient()
and clusterSlotStatsAddNetworkBytesInForUserClient()
have been migrated under afterCommand()
.
For the case of MULTI
, since afterCommand()
is not reached upon queuing a command, the same call is made explicitly under queueMultiCommand()
to accumulate its network ingress bytes.
src/server.c
Outdated
|
||
/* Now that c->slot has been parsed, and command has been executed, | ||
* accumulate the buffered network bytes-in. */ | ||
clusterSlotStatsAddNetworkBytesIn(c); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we would want to accumulate it on unblocking cases, so afterCommand
makes sense.
tests/unit/cluster/slot-stats.tcl
Outdated
# ----------------------------------------------------------------------------- | ||
# Test cases for CLUSTER SLOT-STATS network-bytes-in. | ||
# ----------------------------------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this done, or is this a followup?
- Updated the aggregation to be stateful. First, cluster msg length is recorded under pubsubState. Once the slot is parsed, we then accumulate the previously captured length. This way, we bypass the redundant keyHashSlot() calls. Signed-off-by: Kyle Kim <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #720 +/- ##
==========================================
Coverage 70.37% 70.38%
==========================================
Files 112 112
Lines 61308 61458 +150
==========================================
+ Hits 43146 43257 +111
- Misses 18162 18201 +39
|
b90f537
to
5755699
Compare
5755699
to
8e09eb8
Compare
…alkey-io#20). The metric tracks network egress bytes under per-slot context, by hooking onto COB buffer mutations. The metric can be viewed by calling the CLUSTER SLOT-STATS command, with sample response attached below; ``` 127.0.0.1:6379> cluster slot-stats slotsrange 0 0 1) 1) (integer) 0 2) 1) "key-count" 2) (integer) 0 3) "cpu-usec" 4) (integer) 0 5) "network-bytes-in" 6) (integer) 0 7) "network-bytes-out" 8) (integer) 0 ``` Signed-off-by: Kyle Kim <[email protected]>
8e09eb8
to
df9619a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looking at tests, but partial review.
set key_slot [R 0 cluster keyslot $key] | ||
set metrics_to_assert [list network-bytes-in] | ||
|
||
test "CLUSTER SLOT-STATS network-bytes-in, multi bulk buffer processing." { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have total net-in and out metrics here, https://valkey.io/commands/client-list/. We might validate this is the same. For these cases we expect them all to be the same.
# *3\r\n$5\r\nblpop\r\n$3\r\nkey\r\n$1\r\n0\r\n --> 31 bytes. | ||
$rd BLPOP $key 0 | ||
wait_for_blocked_clients_count 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can check the intermediary state here.
R 0 CONFIG RESETSTAT | ||
R 0 FLUSHALL | ||
|
||
test "CLUSTER SLOT-STATS network-bytes-out, for slot specific commands." { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like since we merged the PRs together, a lot of these tests are testing the same thing, one for in and one for out, can we not just merge each test together? seems like it would be a lot shorter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked into this a bit more.
Ultimately, network-bytes-in
and network-bytes-out
are two completely separate metrics, with distinctive aggregation logics. Since their implementation does not cross paths, there still exists many test-cases that aren't mergable / doesn't make sense to merge.
For example, multi bulk buffer processing
and in-line buffer processing
are specific to network-bytes-in
, and make no difference to network-bytes-out
aggregation. Another example is MULTI/EXEC
. There's no special testing for MULTI/EXEC
for per-slot network-bytes-out
, since the way COB is managed in a transaction / non-transaction is identical. There's also SPUBLISH
, which only has a special aggregation logic for network-bytes-out
, but not for network-bytes in
. While these test cases could be merged, we would be forcefully merging them for the sake of merger, and not meaningfully testing their unique behaviours.
So, even with the merger effort, some test classes are better kept separate. And thus, by merging the two test classes together, we would end with three classes; 1) network-bytes-in only, 2) network-bytes-out only, and 3) both. Where the effort originally started to reduce confusion and code duplication, now becomes more confusing for developers.
For this reason, I have slight preference towards keeping them under two separate test classes. Two test classes seem simpler than three, and a failing test in "both" test class would require further triaging to assess whether the failure occurred from network-bytes-in
or network-bytes-out
. This ambiguity is removed when no test cases are shared between multiple metrics.
That said, this is just a slight preference, and I ultimately don't have a strong objection to swing either sides. I will commit to the final decision made by the core member.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline with Madelyn. Given rc1 timeline on the immediate horizon, for now, we will keep the integration tests separated.
That said, after rc1 merger, we will do a fast-follow-up to address a problem with our current testing strategy; white-box testing. For example, currently the network-bytes-out
does not have a MULTI/EXEC
test case, since we know by its implementation that COB mutation does not differ between a transaction and a non-transaction case. Obviously, there's no guarantee that this promise will be upheld forever. Our proposed fast-follow up will disregard the implementation knowledge, and write a MULTI/EXEC
test case, where all four metrics' expectations are asserted.
This way, our the composition of black-box tests and our confidence in future bug findings are improved.
We also note that not all test cases should be black-boxed, as the potential test cases are limitless. With reasonable bound, we will maintain a healthy balance between black-box / white-box tests, prioritizing black-box testing when applicable.
- Removed sharded pubsub aggregation, except for network-bytes-out internal propagation. - Added network-bytes-out in reply schema. - Added missing assertions in tcl integration tests. Signed-off-by: Kyle Kim <[email protected]>
src/cluster_slot_stats.c
Outdated
} else if (stat_type == NETWORK_BYTES_IN) { | ||
slot_stat = server.cluster->slot_stats[slot].network_bytes_in; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} else if (stat_type == NETWORK_BYTES_IN) { | |
slot_stat = server.cluster->slot_stats[slot].network_bytes_in; | |
} else if (stat_type == NETWORK_BYTES_IN) { | |
slot_stat = server.cluster->slot_stats[slot].network_bytes_in; | |
} else if (stat_type == NETWORK_BYTES_OUT) { | |
slot_stat = server.cluster->slot_stats[slot].network_bytes_out; |
We might do a case switch here so that it throws a warning if we miss a type here.
@@ -511,6 +839,10 @@ start_cluster 1 0 {tags {external:skip cluster}} { | |||
# When cluster-slot-stats-enabled config is disabled, you cannot sort using advanced metrics. | |||
set orderby "cpu-usec" | |||
assert_error "ERR*" {R 0 CLUSTER SLOT-STATS ORDERBY $orderby} | |||
set orderby "network-bytes-in" | |||
assert_error "ERR*" {R 0 CLUSTER SLOT-STATS ORDERBY $orderby} | |||
set orderby "network-bytes-out" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're missing tests here to make sure we can order by network bytes in/out
- Bug fix on c->net_output_bytes_curr_cmd for replicationSendAck(). This value must be manually reset to zero. - Bug fix on slot stats sorting. - Added more tcl integration test cases. - Moved network-bytes-in aggregation from afterCommand() to commandProcessed() Signed-off-by: Kyle Kim <[email protected]>
The failing test-case seems flakey. Latest commit does not alter the slot migration logic. I confirm the identical test passing on my dev-machine;
|
Signed-off-by: Madelyn Olson <[email protected]>
src/cluster_slot_stats.c
Outdated
default: /* SLOT_STAT_COUNT, INVALID */ | ||
serverPanic("Invalid slot stat type %d was found.", stat_type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default: /* SLOT_STAT_COUNT, INVALID */ | |
serverPanic("Invalid slot stat type %d was found.", stat_type); | |
case SLOT_STAT_COUNT: | |
case INVALID: | |
serverPanic("Invalid slot stat type %d was found.", stat_type); |
This gets the benefit of if you miss a case it will throw a warning, otherwise
- Updated minor comments. Signed-off-by: Kyle Kim <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM I'm going to kickoff some tests and create some followup issues before merging though.
Doc PR is here: valkey-io/valkey-doc#150 |
Pending test run before merge: https://github.com/valkey-io/valkey/actions/runs/10118033123 |
Docs for CLUSTER SLOT-STATS, with key-count, cpu-usec, network-bytes-in, and network-bytes-out metrics. - valkey-io/valkey#351 - valkey-io/valkey#712 - valkey-io/valkey#720 --------- Signed-off-by: Kyle Kim <[email protected]> Signed-off-by: Madelyn Olson <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]> Co-authored-by: Madelyn Olson <[email protected]> Co-authored-by: Viktor Söderqvist <[email protected]>
Adds two new metrics for per-slot statistics, network-bytes-in and network-bytes-out. The network bytes are inclusive of replication bytes but exclude other types of network traffic such as clusterbus traffic.
network-bytes-in
The metric tracks network ingress bytes under per-slot context, by reverse calculation of
c->argv_len_sum
andc->argc
, stored under a newly introduced fieldc->net_input_bytes_curr_cmd
.network-bytes-out
The metric tracks network egress bytes under per-slot context, by hooking onto COB buffer mutations.
sample response
Both metrics are reported under the
CLUSTER SLOT-STATS
command.