Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle underflow condition of network out slot stats metric #840

Merged
merged 4 commits into from
Jul 30, 2024

Conversation

hpatro
Copy link
Collaborator

@hpatro hpatro commented Jul 29, 2024

Fixes test failure (https://github.com/valkey-io/valkey/actions/runs/10146979329/job/28056316421?pr=837) on 32 bit system for slot stats metric underflow on the following condition:

server.cluster->slot_stats[c->slot].network_bytes_out += (len * listLength(server.replicas));
  • Here listLength accesses len which is of unsigned long type and multiplied with len (which could be negative). This is a risky operation and behaves differently based on the architecture.
clusterSlotStatsAddNetworkBytesOutForReplication(-sdslen(selectcmd->ptr));
  • sdslen method returns size_t. applying - operation to decrement network bytes out is also incorrect.

This change adds assertion on len being negative and handles the wrapping of overall value.

Scenario:

Test code:

#include<stdio.h>
#include<stdint.h>

int main(int argc, char argv[]) {
	uint64_t test = 0;
	unsigned long len = 1;
	test += 23;
	test += (-23 * len);
	printf("%llu", test);
	return 0;
}

64 bit build

$ gcc test.c
test.c: In function ‘main’:
test.c:9:20: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’ {aka ‘long unsigned int’} [-Wformat=]
    9 |         printf("%llu", test);
      |                 ~~~^   ~~~~
      |                    |   |
      |                    |   uint64_t {aka long unsigned int}
      |                    long long unsigned int
      |                 %lu
$ ./a.out
0

32 bit build:

$ gcc -m32 test.c
$ ./a.out
4294967296

@hpatro hpatro requested a review from madolson July 29, 2024 23:39
@hpatro
Copy link
Collaborator Author

hpatro commented Jul 29, 2024

Related #720

src/cluster_slot_stats.c Outdated Show resolved Hide resolved
Copy link

codecov bot commented Jul 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.41%. Comparing base (b4d96ca) to head (029f320).

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #840      +/-   ##
============================================
+ Coverage     70.38%   70.41%   +0.02%     
============================================
  Files           112      112              
  Lines         61462    61467       +5     
============================================
+ Hits          43261    43280      +19     
+ Misses        18201    18187      -14     
Files Coverage Δ
src/cluster_slot_stats.c 94.01% <100.00%> (+0.18%) ⬆️
src/replication.c 87.30% <100.00%> (+0.13%) ⬆️

... and 10 files with indirect coverage changes

@madolson madolson added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Jul 30, 2024
hpatro added 3 commits July 30, 2024 01:39
Signed-off-by: Harkrishn Patro <[email protected]>
Signed-off-by: Harkrishn Patro <[email protected]>
@madolson madolson merged commit aaa7834 into valkey-io:unstable Jul 30, 2024
49 of 56 checks passed
mapleFU pushed a commit to mapleFU/valkey that referenced this pull request Aug 22, 2024
…o#840)

Fixes test failure
(https://github.com/valkey-io/valkey/actions/runs/10146979329/job/28056316421?pr=837)
on 32 bit system for slot stats metric underflow on the following
condition:

```
server.cluster->slot_stats[c->slot].network_bytes_out += (len * listLength(server.replicas));
```
* Here listLength accesses `len` which is of `unsigned long` type and
multiplied with `len` (which could be negative). This is a risky
operation and behaves differently based on the architecture.

```
clusterSlotStatsAddNetworkBytesOutForReplication(-sdslen(selectcmd->ptr));
```
* `sdslen` method returns `size_t`. applying `-` operation to decrement
network bytes out is also incorrect.

This change adds assertion on `len` being negative and handles the
wrapping of overall value.

---------

Signed-off-by: Harkrishn Patro <[email protected]>
Signed-off-by: mwish <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants