hdr_value_at_percentile optimization. #107

Merged

Conversation

yoav-steinberg
Contributor

@yoav-steinberg yoav-steinberg commented Mar 21, 2022

This is basically a C port of the optimization in PR HdrHistogram/hdrhistogram-go#48 by @filipecosta90.
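
To make the approach concrete, here is a minimal sketch assembled from the review snippets below. It assumes the file-internal helpers named in this thread (`get_bucket_base_index`, `get_count_at_index_given_bucket_base_idx`) with signatures inferred from the discussion, plus an assumed `highest_equivalent_value` helper mirroring the existing implementation's rounding of the result; it shows the shape of the patch, not the exact merged code:

```
/* Sketch only: reconstructed from this review thread, not the exact patch.
 * Accumulate raw counts in index order and stop at the target rank,
 * instead of driving the full recorded-values iterator. */
static int64_t value_at_percentile_sketch(const struct hdr_histogram* h, double percentile)
{
    double requested_percentile = percentile < 100.0 ? percentile : 100.0;
    int64_t count_at_percentile =
        (int64_t) (((requested_percentile / 100) * h->total_count) + 0.5);
    count_at_percentile = count_at_percentile > 1 ? count_at_percentile : 1;

    int64_t count_to_idx = 0;
    int32_t bucket_idx = 0;
    int32_t sub_bucket_idx = 0;
    int32_t bucket_base_idx = get_bucket_base_index(h, bucket_idx);

    for (;;)
    {
        count_to_idx += get_count_at_index_given_bucket_base_idx(h, bucket_base_idx, sub_bucket_idx);
        if (count_to_idx >= count_at_percentile)
        {
            /* decode the current bucket/sub-bucket position back into a value */
            int64_t value_from_idx = ((int64_t) sub_bucket_idx) << (bucket_idx + h->unit_magnitude);
            return highest_equivalent_value(h, value_from_idx);
        }
        sub_bucket_idx++;
        if (sub_bucket_idx >= h->sub_bucket_count)
        {
            sub_bucket_idx = h->sub_bucket_half_count;
            bucket_idx++;
            bucket_base_idx = get_bucket_base_index(h, bucket_idx);
        }
    }
}
```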

before:

yoav@yoav-thinkpad:~/HdrHistogram_c/build$ ./test/hdr_histogram_benchmark --benchmark_filter=BM_hdr_value_at_percentile/3/86400000 --benchmark_min_time=10
2022-03-21 12:41:46
Running ./test/hdr_histogram_benchmark
Run on (12 X 4600 MHz CPU s)
CPU Caches:
  L1 Data 32K (x6)
  L1 Instruction 32K (x6)
  L2 Unified 256K (x6)
  L3 Unified 12288K (x1)
Load Average: 0.68, 0.81, 0.89
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
BM_hdr_value_at_percentile/3/86400000         34937 ns        34955 ns       404859
BM_hdr_value_at_percentile/3/86400000000      36561 ns        36581 ns       398987

after:

yoav@yoav-thinkpad:~/HdrHistogram_c/build$ ./test/hdr_histogram_benchmark --benchmark_filter=BM_hdr_value_at_percentile/3/86400000 --benchmark_min_time=10
2022-03-21 13:36:17
Running ./test/hdr_histogram_benchmark
Run on (12 X 4600 MHz CPU s)
CPU Caches:
  L1 Data 32K (x6)
  L1 Instruction 32K (x6)
  L2 Unified 256K (x6)
  L3 Unified 12288K (x1)
Load Average: 0.86, 0.96, 1.26
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
BM_hdr_value_at_percentile/3/86400000          5070 ns         5083 ns      2795009
BM_hdr_value_at_percentile/3/86400000000       5124 ns         5135 ns      2753732

@filipecosta90
Contributor

great improvement @yoav-steinberg! thank you!
@mikeb01 do you think this PR can be reviewed? In a nutshell, it avoids wasteful computation on the iterator and just computes the bare minimum up until we reach the value at percentile.

@mikeb01
Contributor

mikeb01 commented Mar 23, 2022

This definitely looks interesting. Which build configuration was used to generate the benchmarks, Release or Debug? If it was Debug, it would be good to rerun them with a Release build.

int32_t bucket_base_idx = get_bucket_base_index(h, bucket_idx);

for (;;) {
    if (count_to_idx >= count_at_percentile) {
Contributor

I'm concerned about this being the only exit clause for the loop. I think sub_bucket_idx should also be checked to make sure that it doesn't run off the end or get into an infinite loop.

Contributor

@mikeb01 isn't the following check and change enough?
#107 (comment)
Notice that when we cross the sub-bucket limit we advance to the next bucket:

        if (sub_bucket_idx >= h->sub_bucket_count)
        {
            /* crossed the sub-bucket limit: advance to the next bucket,
             * whose distinct entries start at the half count */
            sub_bucket_idx = h->sub_bucket_half_count;
            bucket_idx++;
            bucket_base_idx = get_bucket_base_index(h, bucket_idx);
        }
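
(The reset goes to sub_bucket_half_count rather than zero because, past the first bucket, the lower half of each bucket's sub-bucket range aliases values already representable in the previous bucket; only the upper half holds distinct counts.)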

Contributor Author

@mikeb01 I think that since count_at_percentile is always less than or equal to total_count, we can safely assume that count_to_idx will reach or pass count_at_percentile before get_count_at_index_given_bucket_base_idx is called with an out-of-range bucket_base_idx/sub_bucket_idx: summing every count in the histogram yields exactly total_count, so the target rank is reached no later than the last populated index. So I think it's safe.
@filipecosta90 please confirm.

Contributor Author

I added an overflow check that'll abort early if the passed count_at_percentile is too high. Please re-review.
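
For reference, a guard along these lines (illustrative only; the exact committed check is in the diff linked below and may differ):

```
/* Illustrative only, not the committed diff: bail out up front if the
 * requested rank can never be reached, so the scan cannot run off the
 * counts array. */
if (count_at_percentile > h->total_count)
{
    return 0; /* mirrors the function's existing not-found result */
}
```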

Contributor

100% agree that with the check at https://github.com/HdrHistogram/HdrHistogram_c/pull/107/files#diff-3b9d3dbfd2aefc19e97618e6ca54cdab9ec1900faef90b1a450acad891980717R688 it's not possible to run off the end or get into an infinite loop. Agree, @mikeb01?

Contributor

@mikeb01 mikeb01 May 16, 2022

I think it works as long as all of the logic is correct and the total_count is consistent with the underlying data and nothing gets accidentally messed up. "I think" being the operative term here. My concern is that it takes a substantial mental effort and a few assumptions of behaviour to verify correctness. I would feel more confident with a simple check of the index against counts_len (the iterator move_next does this as an additional safety step). I know this means a bit of restructuring of the code, e.g. in-lining get_count_at_index_given_bucket_base_idx so the index can be checked, but I think it will be safer and less fragile in the long run.
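
A sketch of that check, with the index computation inlined into the loop body so the flat counts index can be range-checked before each read (identifiers follow this thread and the sketch near the top; the merged code may differ):

```
/* Inside the scanning loop: inline the index math from
 * get_count_at_index_given_bucket_base_idx so it can be bounds-checked. */
int32_t counts_idx = bucket_base_idx + sub_bucket_idx - h->sub_bucket_half_count;
if (counts_idx < 0 || counts_idx >= h->counts_len)
{
    break; /* never read past the counts array, even if total_count is inconsistent */
}
count_to_idx += h->counts[counts_idx];
```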

@yoav-steinberg
Contributor Author

Which build configuration was used...

I used Release for the project itself:
cmake -DHDR_HISTOGRAM_BUILD_BENCHMARK=ON -DCMAKE_BUILD_TYPE=Release ..
but I couldn't get the benchmark library itself to build as Release, hence the:
***WARNING*** Library was built as DEBUG. Timings may be affected.
But I think the comparison is still valid.

@yoav-steinberg yoav-steinberg requested a review from mikeb01 April 11, 2022 15:28
yoav-steinberg added a commit to yoav-steinberg/redis that referenced this pull request Apr 19, 2022
oranagra pushed a commit to redis/redis that referenced this pull request Apr 20, 2022
`hdr_value_at_percentile()` is part of the Hdr_Histogram library
used when generating the `latencystats` report.

There's a pending optimization for this function which greatly
improves the performance of `info latencystats`.
HdrHistogram/HdrHistogram_c#107

This PR:
1. Upgrades the sources in _deps/hdr_histogram_ to the latest Hdr_Histogram
  version 0.11.5
2. Applies the referenced optimization.
3. Adds minor documentation about the hdr_histogram dependency which was
  missing under _deps/README.md_.

benchmark on my machine:
running: `redis-benchmark -n 100000 info latencystats` on a clean build with no data.

| benchmark | RPS |
| ---- | ---- |
| before upgrade to v0.11.5 | 7,681 |
| before optimization | 12,474 |
| after optimization | 52,606 |

Co-authored-by: filipe oliveira <[email protected]>
ncghost1 added a commit to CN-annotation-team/redis7.0-chinese-annotated that referenced this pull request Apr 29, 2022
* 7.0-RC1

* 7.0-RC2

* 7.0-RC3

* Fix RM_Yield bug processing future commands of the current client. (#10573)

RM_Yield was missing a call to protectClient to prevent redis from
processing future commands of the yielding client.

Adding tests that fail without this fix.

This would be complicated to solve since nested calls to RM_Call used to
replace the current_client variable with the module temp client.

It looks like it's no longer necessary to do that, since it was added
back in #9890 to solve two issues, both already gone:
1. a call to CONFIG SET maxmemory could trigger a module hook calling
   RM_Call. Although this specific issue is gone, arguably other hooks,
   like keyspace notifications, can do the same.
2. an assertion in lookupKey that checks the current command of the
   current client, introduced in #9572 and removed in #10248

* Fix sdsConfigRewrite() to prevent freeing invalid memory (#10598)

* Return 0 when config set out-of-range oom-score-adj-values (#10601)

When oom-score-adj-values is out of range, setConfigOOMScoreAdjValuesOption
should return 0, not -1; otherwise the call will be considered a success.

* Fixes around AOF failed rewrite rate limiting (#10582)

Changes:
1. Check the failed rewrite time threshold only when we actually consider triggering a rewrite,
  i.e. this should be the last condition tested, since the test has side effects (it increases the time threshold).
  This could have mattered in some rare scenarios.
2. No limit in the startup state (e.g. after restarting redis that previously failed and had many incr files)
3. The “triggered the limit” log would be recorded only when the limit status is returned
4. Remove the failure count from the log (it could be misleading in some cases)

Co-authored-by: chenyang8094 <[email protected]>
Co-authored-by: Oran Agra <[email protected]>

* Add comment to sdsConfigSet() (#10605)

Improve comments to explain the code

Co-authored-by: moticless <[email protected]>
Co-authored-by: Oran Agra <[email protected]>

* Tests: improve skip tags around maxmemory and resp3 (#10597)

some skip tags were missing on some tests...

* Add socket-mark-id support for marking sockets. (#10349)

Add a configuration option to attach an operating system-specific identifier to Redis sockets, supporting advanced network configurations using iptables (Linux) or ipfw (FreeBSD).

* Optimized `hdr_value_at_percentile` (#10606)

`hdr_value_at_percentile()` is part of the Hdr_Histogram library
used when generating the `latencystats` report.

There's a pending optimization for this function which greatly
improves the performance of `info latencystats`.
HdrHistogram/HdrHistogram_c#107

This PR:
1. Upgrades the sources in _deps/hdr_histogram_ to the latest Hdr_Histogram
  version 0.11.5
2. Applies the referenced optimization.
3. Adds minor documentation about the hdr_histogram dependency which was
  missing under _deps/README.md_.

benchmark on my machine:
running: `redis-benchmark -n 100000 info latencystats` on a clean build with no data.

| benchmark | RPS |
| ---- | ---- |
| before upgrade to v0.11.5 | 7,681 |
| before optimization | 12,474 |
| after optimization | 52,606 |

Co-authored-by: filipe oliveira <[email protected]>

* Stop RDB child before flushing and parsing the RDB in Diskless replication too (#10602)

We should stop the RDB child in advance, before flushing, to reduce COW in diskless replication too.

Co-authored-by: Oran Agra <[email protected]>

* Fixes around clients that must be obeyed. Replica report disk errors in PING. (#10603)

This PR unifies all the places that test if the current client is the
master client or the AOF client, and uses a single method for that test
in all of them.

Other than some refactoring, these are the actual implications:
- Replicas **don't** ignore disk error when processing commands not
  coming from their master.
  **This is important for PING to be used for health check of replicas**
- SETRANGE, APPEND, SETBIT, BITFIELD don't do proto_max_bulk_len check for AOF
- RM_Call in SCRIPT_MODE ignores disk error when coming from master /
  AOF
- RM_Call in cluster mode ignores slot check when processing AOF
- Scripts ignore disk error when processing AOF
- Scripts **don't** ignore disk error on a replica, if the command comes
  from clients other than the master
- SCRIPT KILL won't kill script coming from AOF
- Scripts **don't** skip OOM check on replica if the command comes from
  clients other than the master

Note that Script, AOF, and module clients don't reach processCommand,
which is why some of the changes don't actually have any implications.

Note: we also revert the change done to processCommand in 2f4240b,
which should be dead code due to the above-mentioned fact.

* Optimization: Use either monotonic or wall-clock to measure command execution time, to regain up to 4% execution time (#10502)

In #7491 (part of redis 6.2), we started using the monotonic timer instead of mstime to measure
command execution time for stats; apparently this meant sampling the clock 3 times per command
rather than two (since we also need the wall-clock time).
In some cases this causes a significant overhead.

This PR fixes that by avoiding the use of the monotonic timer, except for the cases where we know it
should be extremely fast.
This PR also adds a new INFO field called `monotonic_clock` that shows which clock redis is using.

Co-authored-by: Oran Agra <[email protected]>

* fix typo in "lcsCommand" doc comment (#10622)

fix typo. `LCS[j+(blen+1)*j]` -> `LCS[j+(blen+1)*i]`

* Fix typo in function name "harndfieldReplyWithListpack" to "hrandfieldReplyWithListpack" (#10623)

* Fix timing issue in slowlog redact test (#10614)

* Fix timing issue in slowlog redact test

This test failed once in my daily CI (test-sanitizer-address (clang))
```
*** [err]: SLOWLOG - Some commands can redact sensitive fields in tests/unit/slowlog.tcl
Expected 'migrate 127.0.0.1 25649 key 9 5000 AUTH2 (redacted) (redacted)' to match '* key 9 5000 AUTH (redacted)' (context: type eval line 12 cmd {assert_match {* key 9 5000 AUTH (redacted)} [lindex [lindex [r slowlog get] 1] 3]} proc ::test)
```

The reason is that with slowlog-log-slower-than 10000,
slowlog get will have a chance to exceed 10ms.

Change slowlog-log-slower-than from 10000 to -1, disabling it.
Also handle a similar, potentially problematic test above.
This is actually the same timing issue as #10432.

But also avoid repeated calls to `SLOWLOG GET`

* isSafeToPerformEvictions: Remove redundant condition (#10610)

It was first added in #9890 to solve the problem of
CONFIG SET maxmemory causing eviction inside MULTI/EXEC,
but that problem is already fixed (CONFIG SET doesn't evict
directly, it just schedules a later eviction).

Keeping that condition may hide bugs (i.e. performEvictions
should always expect to have an empty server.also_propagate).

* Run large-memory tests as solo. (#10626)

This avoids random memory spikes and enables --large-memory tests to run
on moderately sized systems.

* Test: RM_Call from within "expired" notification (#10613)

This case is interesting because it originates from cron,
rather than from another command.

The idea came from looking at #9890 and #10573, and I was wondering if RM_Call
would work properly when `server.current_client == NULL`

* Fix regression not aborting transaction on error, and re-edit some error responses (#10612)

1. Disk error and slave count checks didn't flag the transactions or counted correctly in command stats (regression from #10372  , 7.0 RC3)
2. RM_Call will reply the same way Redis does, in case of a non-existing command or an arity error
3. RM_WrongArity will consider the full command name
4. Use lowercase 'u' in "unknown subcommand" (to align with "unknown command")

Followup work of #10127

* fix an unclear comment and add a comment to 'zi' in 'quicklist.h' (#10633)

Fix an unclear comment: change "quicklist container formats" to "quicklist node container formats".
Add a comment to 'zi' in quicklistIter (where it first appears).
Why add a comment to zi:
because it is not a variable name with a clear meaning, and its name seems to come from the deprecated ziplist.

* Fix typos and limit unknown command error message (#10634)

minor cleanup for recent changes.

* fix broken protocol regression from #10612 (#10639)

A change in #10612 introduced a regression:
when replying with garbage bytes to the caller, we must make sure the reply
doesn't include any newlines.

In the past this code path called rejectCommandFormat, which did that trick,
but now it calls rejectCommandSds, which doesn't, so we need to make sure
to sanitize the sds.

* By default prevent cross slot operations in functions and scripts with # (#10615)

Adds the `allow-cross-slot-keys` flag to Eval scripts and Functions to allow
scripts to access keys from multiple slots.
The default behavior is now that they are not allowed to do that (unlike before).
This is a breaking change for 7.0 release candidates (to be part of 7.0.0), but
not for previous redis releases since EVAL without shebang isn't doing this check.

Note that the check is done on both the keys declared by the EVAL / FCALL command
arguments, and also the ones used by the script when making a `redis.call`.

A note about the implementation: there seems to have been some confusion
about allowing access to non-local keys. I thought I had missed something in our
wider conversation, but Redis scripts do block access to non-local keys.
So the issue was just about cross-slot keys being accessed.

* Set replicas to panic on disk errors, and optionally panic on replication errors (#10504)

* Till now, replicas that were unable to persist would still execute the commands
  they got from the master; now they'll panic by default, and we add a new
  `replica-ignore-disk-errors` config to change that.
* Till now, when a command failed on a replica or during AOF loading, it only logged a
  warning and a stat; we add a new `propagation-error-behavior` config to allow
  panicking in that state (it may become the default one day).

Note that commands that fail on the replica can either indicate a bug that could
cause data inconsistency between the replica and the master, or they could be
in some cases (specifically in previous versions), a result of a command (e.g. EVAL)
that failed on the master, but still had to be propagated to fail on the replica as well.

* Fix syntax error in replicationErrorBehavior enum (#10642)

Missing a typeof, we will get errors like this:
- multiple definition of `replicationErrorBehavior'
- ld: error: duplicate symbol: replicationErrorBehavior

Introduced in #10504

* Allow configuring signaled shutdown flags (#10594)

The SHUTDOWN command has various flags to change its default behavior,
but in some cases establishing a connection to redis is complicated and it's easier
for the management software to use signals. However, so far the signals could only
trigger the default shutdown behavior.
Here we introduce the option to control shutdown arguments for SIGTERM and SIGINT.

New config options:
`shutdown-on-sigint [nosave | save] [now] [force]` 
`shutdown-on-sigterm [nosave | save] [now] [force]`

Implementation:
Support MULTI_ARG_CONFIG on createEnumConfig to support multiple enums to be applied as bit flags.

Co-authored-by: Oran Agra <[email protected]>

* Fix bug when AOF enabled after startup. put the new incr file in the manifest only when AOFRW is done. (#10616)

Changes:

- When AOF is enabled **after** startup, the data accumulated during `AOF_WAIT_REWRITE`
  will only be stored in a temp INCR AOF file. Only after the first AOFRW is successful, we will
  add it to the manifest file.
  Before this fix, the manifest referred to the temp file, which could cause a restart during that
  time to load it without its base.
- Add an `aof_rewrites_consecutive_failures` info field for the aofrw limiting implementation.

Now we can guarantee that these behaviors of MP-AOF are the same as before (past redis releases):
- When AOF is enabled after startup, the data accumulated during `AOF_WAIT_REWRITE` will only
  be stored in a visible place. Only after the first AOFRW is successful will we add it to the manifest file.
- When disabling AOF, we did not delete the AOF file in the past, so there's no need to change that
  behavior now (yet).
- When toggling AOF off and then on (could be as part of a full-sync), a crash or restart before the
  first rewrite is completed, would result with the previous version being loaded (might not be right thing,
  but that's what we always had).

* Add module API flag for using enum configs as bit flags (#10643)

Enables registration of an enum config that'll let the user pass multiple keywords that
will be combined with `|` as flags into the integer config value.

```
    const char *enum_vals[] = {"none", "one", "two", "three"};
    const int int_vals[] = {0, 1, 2, 4};

    if (RedisModule_RegisterEnumConfig(ctx, "flags", 3, REDISMODULE_CONFIG_DEFAULT | REDISMODULE_CONFIG_BITFLAGS, enum_vals, int_vals, 4, getFlagsConfigCommand, setFlagsConfigCommand, NULL, NULL) == REDISMODULE_ERR) {
        return REDISMODULE_ERR;
    }
```
doing:
`config set moduleconfigs.flags "two three"` will result in 6 being passed to `setFlagsConfigCommand`.

* Added support for Lua readonly tables.

The new feature can be turned off and on using the new `lua_enablereadonlytable` Lua API.

* Move user eval function to be located on Lua registry.

Today, Redis wraps the user Lua code with a Lua function.
For example, assuming the user code is:

```
return redis.call('ping')
```

The actual code that would have been sent to the Lua interpreter was:

```
f_b3a02c833904802db9c34a3cf1292eee3246df3c() return redis.call('ping') end
```

The wrapped code would have been saved in the global dictionary under the
following name: `f_<script sha>` (in our example `f_b3a02c833904802db9c34a3cf1292eee3246df3c`).

This approach allows one user to easily override the implementation of another user's code, for example:

```
f_b3a02c833904802db9c34a3cf1292eee3246df3c = function() return 'hacked' end
```

Running the above code will cause `evalsha b3a02c833904802db9c34a3cf1292eee3246df3c 0` to return
`hacked`, although it should have returned `pong`.

Another disadvantage is that Redis basically runs code during the loading (compiling) phase without being
aware of it. A user can do code injection like this:

```
return 1 end <run code on compiling phase> function() return 1
```

The wrapped code will look like this, and the entire `<run code on compiling phase>` block will run outside
of the eval or evalsha context:

```
f_<sha>() return 1 end <run code on compiling phase> function() return 1 end
```

* Protect globals of both eval scripts and functions.

Use the new `lua_enablereadonlytable` Lua API to protect the global tables of
both eval scripts and functions. For eval scripts, the implementation is easy:
we simply call `lua_enablereadonlytable` on the global table to turn it into
a readonly table.

For functions it's more complicated: we want to be able to switch globals between
function load and function run. To achieve this, we create a new empty table that
acts as the globals table for functions, and we control the actual globals using metatable
manipulation. Notice that even if the user gets a pointer to the original tables, all
the tables are set to be readonly (using the `lua_enablereadonlytable` Lua API), so they
cannot be changed. The following illustration better explains the solution:

```
Global table {} <- global table metatable {.__index = __real_globals__}
```

Which `__real_globals__` table is used depends on the run context (function load or function call).

Why is this solution needed, and why isn't it enough to simply switch globals?
When we run in the context of function load and create our functions, each function captures
the globals that were set when it was created. Replacing the globals after
creation will not affect them. This is why this trick is mandatory.

* Protect any table which is reachable from globals and added globals white list.

The white list is enforced by setting a metatable on the global table before initializing
any library. The metatable sets the `__newindex` field to a function that checks
the white list before adding the field to the table. Fields that are not on the
white list are simply ignored.

After the initialization phase is done, we protect the global table and each table
that might be reachable from the global table. For each table we also protect
the table's metatable, if one exists.

* Delete renamed new incr when write manifest failed (#10649)

Followup fix for #10616

* Redis 7.0.0 GA

Co-authored-by: Oran Agra <[email protected]>
Co-authored-by: Moti Cohen <[email protected]>
Co-authored-by: sundb <[email protected]>
Co-authored-by: judeng <[email protected]>
Co-authored-by: chenyang8094 <[email protected]>
Co-authored-by: moticless <[email protected]>
Co-authored-by: David CARLIER <[email protected]>
Co-authored-by: yoav-steinberg <[email protected]>
Co-authored-by: filipe oliveira <[email protected]>
Co-authored-by: menwen <[email protected]>
Co-authored-by: Lu JJ <[email protected]>
Co-authored-by: Binbin <[email protected]>
Co-authored-by: guybe7 <[email protected]>
Co-authored-by: Yossi Gottlieb <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Co-authored-by: Eduardo Semprebon <[email protected]>
Co-authored-by: meir <[email protected]>
Co-authored-by: Tabi no Tochuu <[email protected]>
@yoav-steinberg
Contributor Author

yoav-steinberg commented May 18, 2022

@mikeb01 @filipecosta90 I refactored and added an explicit counts bounds check as requested, for extra safety. Note that this entailed a slight performance hit, so I also optimized the function a bit to reduce the performance overhead of the bounds check and get back to the previous performance. I'd appreciate it if you could verify that I didn't break the logic.
Thanks.

@mikeb01 mikeb01 merged commit 84405cb into HdrHistogram:master May 20, 2022
@mikeb01
Contributor

mikeb01 commented May 20, 2022

I've merged this, but rewrote the loop to match the implementation in the Java version. The performance measured on my local machine was within 1%.
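
For readers comparing against the Java version: it does a plain linear scan over the flat counts array and decodes the stopping index back into a value, following Java's valueFromIndex(index). A rough C rendering of that shape, usable in place of the for (;;) loop in the sketch near the top of this thread (an approximation under the same assumed helpers and field names, not the merged commit):

```
/* Approximation of the Java-style scan, not the merged commit. */
for (int32_t idx = 0; idx < h->counts_len; idx++)
{
    count_to_idx += h->counts[idx];
    if (count_to_idx >= count_at_percentile)
    {
        /* decode the flat index into bucket / sub-bucket coordinates,
         * following Java's valueFromIndex(index) */
        int32_t bucket_idx = (idx >> h->sub_bucket_half_count_magnitude) - 1;
        int32_t sub_bucket_idx = (idx & (h->sub_bucket_half_count - 1)) + h->sub_bucket_half_count;
        if (bucket_idx < 0)
        {
            sub_bucket_idx -= h->sub_bucket_half_count;
            bucket_idx = 0;
        }
        int64_t value = ((int64_t) sub_bucket_idx) << (bucket_idx + h->unit_magnitude);
        return highest_equivalent_value(h, value);
    }
}
return 0;
```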

yoav-steinberg added a commit to yoav-steinberg/redis that referenced this pull request May 20, 2022
…#10606.

The code is based on upstream https://github.com/HdrHistogram/HdrHistogram_c master branch latest commit (e4448cf6d1cd08fff519812d3b1e58bd5a94ac42). The reason to pull this in now is that their final version of our optimization is even faster. See: HdrHistogram/HdrHistogram_c#107.
oranagra pushed a commit to redis/redis that referenced this pull request May 22, 2022
… (#10755)

enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023