Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/crisis: asserting crisis invariants distribution/can-withdraw takes a very long time #9876

Closed
1 of 4 tasks
odeke-em opened this issue Aug 8, 2021 · 16 comments
Closed
1 of 4 tasks
Labels
T: Performance Performance improvements

Comments

@odeke-em
Copy link
Collaborator

odeke-em commented Aug 8, 2021

From my profiling and sleuthing, I just noticed that starting a node using gaiad took more than 30 minutes and majority of the time was spent on module=x/crisis name=distribution/can-withdraw per

$ gaiad start --p2p.seeds "[email protected]:26657"
12:34AM INF starting ABCI with Tendermint
12:34AM INF Starting multiAppConn service impl=multiAppConn module=proxy
12:34AM INF Starting localClient service connection=query impl=localClient module=abci-client
12:34AM INF Starting localClient service connection=snapshot impl=localClient module=abci-client
12:34AM INF Starting localClient service connection=mempool impl=localClient module=abci-client
12:34AM INF Starting localClient service connection=consensus impl=localClient module=abci-client
12:34AM INF Starting EventBus service impl=EventBus module=events
12:34AM INF Starting PubSub service impl=PubSub module=pubsub
12:34AM INF Starting IndexerService service impl=IndexerService module=txindex
12:34AM INF ABCI Handshake App Info hash= height=0 module=consensus protocol-version=0 software-version=
12:34AM INF ABCI Replay Blocks appHeight=0 module=consensus stateHeight=0 storeHeight=0
12:34AM INF asserting crisis invariants inv=0/12 module=x/crisis name=bank/nonnegative-outstanding
12:34AM INF asserting crisis invariants inv=1/12 module=x/crisis name=bank/total-supply
12:34AM INF asserting crisis invariants inv=2/12 module=x/crisis name=distribution/nonnegative-outstanding
12:34AM INF asserting crisis invariants inv=3/12 module=x/crisis name=distribution/can-withdraw
1:07AM INF asserting crisis invariants inv=4/12 module=x/crisis name=distribution/reference-count
1:07AM INF asserting crisis invariants inv=5/12 module=x/crisis name=distribution/module-account
1:07AM INF asserting crisis invariants inv=6/12 module=x/crisis name=liquidity/escrow-amount
1:07AM INF asserting crisis invariants inv=7/12 module=x/crisis name=staking/module-accounts
1:07AM INF asserting crisis invariants inv=8/12 module=x/crisis name=staking/nonnegative-power
1:07AM INF asserting crisis invariants inv=9/12 module=x/crisis name=staking/positive-delegation
1:07AM INF asserting crisis invariants inv=10/12 module=x/crisis name=staking/delegator-shares
1:08AM INF asserting crisis invariants inv=11/12 module=x/crisis name=gov/module-account
1:08AM INF asserted all invariants duration=2065579.553016 height=5200791 module=x/crisis
1:08AM INF created new capability module=ibc name=ports/transfer
1:08AM INF port binded module=x/ibc/port port=transfer
1:08AM INF claimed capability capability=1 module=transfer name=ports/transfer

In late February/March 2021 we had brought down start time to a very short period, but this is a huge increase. We need to investigate what's going wrong.

I've examined live traffic for the past 1.5 hours and a pain point seems to be the expected
Screen Shot 2021-08-08 at 1 35 11 AM

I know there has been talk of replacing the code in iavl, which is why my team and I held from from optimizing it, but it's been 6+ months and perhaps it is time for us to dive in deep and figure out what could be improved.

/cc @cuonglm @kirbyquerby


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@odeke-em odeke-em added the T: Performance Performance improvements label Aug 8, 2021
@cuonglm
Copy link
Contributor

cuonglm commented Aug 9, 2021

@odeke-em which gaia version did you use?

@alexanderbez
Copy link
Contributor

alexanderbez commented Aug 9, 2021

Ooooo yes, this is a very well known issue to the point we encourage operators to start with --skip-crisis-invariants. But that really defeats the purpose of having x/crisis as a module/function, so we should really strive to improve the performance of this. But I imagine not only the problem but also the improvements will be multi-faceted.

Also, it is true we're moving to SMT in place of IAVL, at that work is very active. I don't know estimated completion times (cc @aaronc), so based on the provided benchmark it seems to be completely IAVL related. Not sure if it's worth digging into this...?

@robert-zaremba
Copy link
Collaborator

It will take few more months to have ADR-40 implemented and merged.
IAVL and current store package won't be deprecated immediately and we will support both SMT and IAVL in the SDK for few more releases.

@odeke-em
Copy link
Collaborator Author

Awesome, thank you @alexanderbez and @robert-zaremba for the go-ahead that we've been waiting for cosmos/iavl, for a while. We've finished auditing celestiaorg/smt (publishing the official report in a few hours) and now we can go forth and improving cosmos/iavl and cosmos/cosmos-sdk/store/* performance.

@alexanderbez
Copy link
Contributor

alexanderbez commented Aug 16, 2021

Given what @robert-zaremba stated and if you have bandwidth and interest, it would be helpful to solve this or at least figure out where the major bottlenecks are @odeke-em 🙏

@odeke-em
Copy link
Collaborator Author

Given what @robert-zaremba stated and if you have bandwidth and interest, it would be helpful to solve this or at least figure out where the major bottlenecks are @odeke-em 🙏

Roger that @alexanderbez, we are on it and yes we do have the bandwidth to work on it :-)

@odeke-em
Copy link
Collaborator Author

Eureka! After experimenting this afternoon and evening with various tricks, I think I have an idea that'll definitely slash x/crisis time down from 30+ minutes to like less than a minute or so.

Prognosis

In hundreds of profiles, we see that this code is super hot and the cause of that delay

for key := range store.unsortedCache {
if dbm.IsKeyInDomain(conv.UnsafeStrToBytes(key), start, end) {
cacheValue := store.cache[key]
unsorted = append(unsorted, &kv.Pair{Key: []byte(key), Value: cacheValue.value})
}
}

per

ROUTINE ======================== github.com/cosmos/cosmos-sdk/store/cachekv.(*Store).dirtyItems in github.com/cosmos/[email protected]/store/cachekv/store.go
     1.42s     35.02s (flat, cum) 60.06% of Total
         .          .    211:func (store *Store) dirtyItems(start, end []byte) {
         .          .    212:	unsorted := make([]*kv.Pair, 0)
         .          .    213:
         .          .    214:        startStr, endStr := byteSliceToStr(start), byteSliceToStr(end)
         .          .    215:	n := len(store.unsortedCache)
     240ms      4.67s    216:        for key := range store.unsortedCache {
     430ms      430ms    217:            if key == "" {
         .          .    218:                println("blank key")
         .          .    219:            }
         .          .    220:        }
     740ms         5s    221:	for key := range store.unsortedCache {
      10ms     24.79s    222:		if isKeyInDomain(key, startStr, endStr) {
         .       10ms    223:			cacheValue := store.cache[key]
         .       10ms    224:			unsorted = append(unsorted, &kv.Pair{Key: []byte(key), Value: cacheValue.value})
         .          .    225:		}
         .          .    226:	}
         .          .    227:
         .          .    228:	if len(unsorted) == n { // This pattern allows the Go compiler to emit the map clearing idiom for the entire map.
         .          .    229:		for key := range store.unsortedCache {
         .          .    230:			delete(store.unsortedCache, key)
         .          .    231:		}
         .          .    232:	} else { // Otherwise, normally delete the unsorted keys from the map.
         .          .    233:		for _, kv := range unsorted {
         .       10ms    234:			delete(store.unsortedCache, byteSliceToStr(kv.Key))
         .          .    235:		}
         .          .    236:	}
         .          .    237:
         .       50ms    238:	sort.Slice(unsorted, func(i, j int) bool {
         .       50ms    239:		return bytes.Compare(unsorted[i].Key, unsorted[j].Key) < 0
         .          .    240:	})
         .          .    241:
         .          .    242:	for e := store.sortedCache.Front(); e != nil && len(unsorted) != 0; {
         .          .    243:		uitem := unsorted[0]
         .          .    244:		sitem := e.Value

and the culprit is the byte comparisons from invoking dbm.IsKeyInDomain(start, end, key []byte) per https://github.com/tendermint/tm-db/blob/edb35ebc2ea9327e0c0528d0d66bbf39a787461b/util.go#L37-L46, where it essentially for every single key performs a search in unsorted keys, so O(n) in the number of unsorted keys, but then O(klm*n) in the length of start, end and each of the keys, which leads to the huge 30min plus time spent looking for keys within the domain.

Target

We literally want to look through all the unsorted keys and find the values within the range [start, end)

Suggestion

Instead of that chaotic and quadratic search, we can bring this down to somewhat linear time by:
a) Sorting all the store.unsortedCache in a slice SL
b) Using a modified binary search to find where in SL we have the first keys >= start, eStart
c) Using a modified range binary search to find where in SL we have the first value less than end, eEnd
d) Iterating just using those indices eStart and eEnd ranged over SL to retrieve the dirty items in range

From a preliminary experiment of walking through the logic and plugging it into store.go I can confirm that this will return results in a very short time. I am going to catch some sleep for now, but I'll send a PR in the morning and also test out speed to reconfirm

mergify bot pushed a commit that referenced this issue Oct 14, 2021
…binary searches (#10024)

This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings immediately, of which
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking modified binary search which is O(log(n)) * 2 ~ O(log(n))
to search for the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (n*s*e)

The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
time/op
GetValidator-8	              5.8ms ± 2%    4.7ms ± 1%	-17.69%	(p=0.000 n=10+10)
OneBankSendTxPerBlock-8	      3.2ms ± 2%    2.8ms ± 1%	-10.80%	(p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%	-8.36%	(p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%	-9.74%	(p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%	    51µs ± 6%	-19.22%	(p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%	-19.55%	(p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%	-42.11%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%	    5.5ms ± 1%	-71.35%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%	-83.44%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%	-31.19%	(p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%	-14.81%	(p=0.000 n=10+9)

alloc/op
B/op
BlockProvision-8	     0.11kB ± 0%    0.10kB ± 0%	-7.14%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8  0.89MB ± 6%    0.53MB ± 1%	-40.85%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 6.3MB ± 23%    1.6MB ± 6%	-74.17%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  0.26kB ± 0%    0.23kB ± 1%	-11.53%	(p=0.000 n=10+8)

allocs/op (count)
AccountMapperSetAccount-8    42 ± 0%	    38 ± 0%	-9.52%	(p=0.000 n=10+10)
BlockProvision-8	     6.0 ± 0%	    5.0 ± 0%	-16.67%	(p=0.000 n=10+10)
CacheKVStoreIterator1000-8   14 ± 0%	    13 ± 0%	-7.14%	(p=0.002 n=8+10)
CacheKVStoreIterator10000-8  0.15k ± 2%	    76 ± 1%	-49.00%	(p=0.000 n=7+10)
CacheKVStoreIterator50000-8  8.9k ± 11%	    2.0k ± 2%	-77.60%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 0.10M ± 26%    13k ± 12%	-86.89%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  5.0 ± 0%	    4.0 ± 0%	-20.00%	(p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue #10023

Updates #9876

/cc @cuonglm @kirbyquerby

<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #XXXX

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
mergify bot pushed a commit that referenced this issue Oct 14, 2021
…binary searches (#10024)

This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings immediately, of which
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking modified binary search which is O(log(n)) * 2 ~ O(log(n))
to search for the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (n*s*e)

The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
time/op
GetValidator-8	              5.8ms ± 2%    4.7ms ± 1%	-17.69%	(p=0.000 n=10+10)
OneBankSendTxPerBlock-8	      3.2ms ± 2%    2.8ms ± 1%	-10.80%	(p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%	-8.36%	(p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%	-9.74%	(p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%	    51µs ± 6%	-19.22%	(p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%	-19.55%	(p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%	-42.11%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%	    5.5ms ± 1%	-71.35%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%	-83.44%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%	-31.19%	(p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%	-14.81%	(p=0.000 n=10+9)

alloc/op
B/op
BlockProvision-8	     0.11kB ± 0%    0.10kB ± 0%	-7.14%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8  0.89MB ± 6%    0.53MB ± 1%	-40.85%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 6.3MB ± 23%    1.6MB ± 6%	-74.17%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  0.26kB ± 0%    0.23kB ± 1%	-11.53%	(p=0.000 n=10+8)

allocs/op (count)
AccountMapperSetAccount-8    42 ± 0%	    38 ± 0%	-9.52%	(p=0.000 n=10+10)
BlockProvision-8	     6.0 ± 0%	    5.0 ± 0%	-16.67%	(p=0.000 n=10+10)
CacheKVStoreIterator1000-8   14 ± 0%	    13 ± 0%	-7.14%	(p=0.002 n=8+10)
CacheKVStoreIterator10000-8  0.15k ± 2%	    76 ± 1%	-49.00%	(p=0.000 n=7+10)
CacheKVStoreIterator50000-8  8.9k ± 11%	    2.0k ± 2%	-77.60%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 0.10M ± 26%    13k ± 12%	-86.89%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  5.0 ± 0%	    4.0 ± 0%	-20.00%	(p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue #10023

Updates #9876

/cc @cuonglm @kirbyquerby

<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #XXXX

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 3c85944)

# Conflicts:
#	CHANGELOG.md
robert-zaremba added a commit that referenced this issue Oct 15, 2021
…binary searches (backport #10024) (#10370)

* fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (#10024)

This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings immediately, of which
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking modified binary search which is O(log(n)) * 2 ~ O(log(n))
to search for the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (n*s*e)

The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
time/op
GetValidator-8	              5.8ms ± 2%    4.7ms ± 1%	-17.69%	(p=0.000 n=10+10)
OneBankSendTxPerBlock-8	      3.2ms ± 2%    2.8ms ± 1%	-10.80%	(p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%	-8.36%	(p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%	-9.74%	(p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%	    51µs ± 6%	-19.22%	(p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%	-19.55%	(p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%	-42.11%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%	    5.5ms ± 1%	-71.35%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%	-83.44%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%	-31.19%	(p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%	-14.81%	(p=0.000 n=10+9)

alloc/op
B/op
BlockProvision-8	     0.11kB ± 0%    0.10kB ± 0%	-7.14%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8  0.89MB ± 6%    0.53MB ± 1%	-40.85%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 6.3MB ± 23%    1.6MB ± 6%	-74.17%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  0.26kB ± 0%    0.23kB ± 1%	-11.53%	(p=0.000 n=10+8)

allocs/op (count)
AccountMapperSetAccount-8    42 ± 0%	    38 ± 0%	-9.52%	(p=0.000 n=10+10)
BlockProvision-8	     6.0 ± 0%	    5.0 ± 0%	-16.67%	(p=0.000 n=10+10)
CacheKVStoreIterator1000-8   14 ± 0%	    13 ± 0%	-7.14%	(p=0.002 n=8+10)
CacheKVStoreIterator10000-8  0.15k ± 2%	    76 ± 1%	-49.00%	(p=0.000 n=7+10)
CacheKVStoreIterator50000-8  8.9k ± 11%	    2.0k ± 2%	-77.60%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 0.10M ± 26%    13k ± 12%	-86.89%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  5.0 ± 0%	    4.0 ± 0%	-20.00%	(p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue #10023

Updates #9876

/cc @cuonglm @kirbyquerby

<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #XXXX

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 3c85944)

# Conflicts:
#	CHANGELOG.md

* fix conflict

Co-authored-by: Emmanuel T Odeke <[email protected]>
Co-authored-by: marbar3778 <[email protected]>
Co-authored-by: Robert Zaremba <[email protected]>
evan-forbes pushed a commit to evan-forbes/cosmos-sdk that referenced this issue Nov 1, 2021
…binary searches (backport cosmos#10024) (cosmos#10370)

* fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (cosmos#10024)

This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings immediately, of which
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking modified binary search which is O(log(n)) * 2 ~ O(log(n))
to search for the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (n*s*e)

The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
time/op
GetValidator-8	              5.8ms ± 2%    4.7ms ± 1%	-17.69%	(p=0.000 n=10+10)
OneBankSendTxPerBlock-8	      3.2ms ± 2%    2.8ms ± 1%	-10.80%	(p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%	-8.36%	(p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%	-9.74%	(p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%	    51µs ± 6%	-19.22%	(p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%	-19.55%	(p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%	-42.11%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%	    5.5ms ± 1%	-71.35%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%	-83.44%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%	-31.19%	(p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%	-14.81%	(p=0.000 n=10+9)

alloc/op
B/op
BlockProvision-8	     0.11kB ± 0%    0.10kB ± 0%	-7.14%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8  0.89MB ± 6%    0.53MB ± 1%	-40.85%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 6.3MB ± 23%    1.6MB ± 6%	-74.17%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  0.26kB ± 0%    0.23kB ± 1%	-11.53%	(p=0.000 n=10+8)

allocs/op (count)
AccountMapperSetAccount-8    42 ± 0%	    38 ± 0%	-9.52%	(p=0.000 n=10+10)
BlockProvision-8	     6.0 ± 0%	    5.0 ± 0%	-16.67%	(p=0.000 n=10+10)
CacheKVStoreIterator1000-8   14 ± 0%	    13 ± 0%	-7.14%	(p=0.002 n=8+10)
CacheKVStoreIterator10000-8  0.15k ± 2%	    76 ± 1%	-49.00%	(p=0.000 n=7+10)
CacheKVStoreIterator50000-8  8.9k ± 11%	    2.0k ± 2%	-77.60%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 0.10M ± 26%    13k ± 12%	-86.89%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  5.0 ± 0%	    4.0 ± 0%	-20.00%	(p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue cosmos#10023

Updates cosmos#9876

/cc @cuonglm @kirbyquerby

<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #XXXX

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 3c85944)

# Conflicts:
#	CHANGELOG.md

* fix conflict

Co-authored-by: Emmanuel T Odeke <[email protected]>
Co-authored-by: marbar3778 <[email protected]>
Co-authored-by: Robert Zaremba <[email protected]>
zakir-code pushed a commit to PundiAI/cosmos-sdk that referenced this issue Apr 14, 2022
…binary searches (backport cosmos#10024) (cosmos#10370)

* fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (cosmos#10024)

This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings immediately, of which
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking modified binary search which is O(log(n)) * 2 ~ O(log(n))
to search for the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (n*s*e)

The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
time/op
GetValidator-8	              5.8ms ± 2%    4.7ms ± 1%	-17.69%	(p=0.000 n=10+10)
OneBankSendTxPerBlock-8	      3.2ms ± 2%    2.8ms ± 1%	-10.80%	(p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%	-8.36%	(p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%	-9.74%	(p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%	    51µs ± 6%	-19.22%	(p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%	-19.55%	(p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%	-42.11%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%	    5.5ms ± 1%	-71.35%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%	-83.44%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%	-31.19%	(p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%	-14.81%	(p=0.000 n=10+9)

alloc/op
B/op
BlockProvision-8	     0.11kB ± 0%    0.10kB ± 0%	-7.14%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8  0.89MB ± 6%    0.53MB ± 1%	-40.85%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 6.3MB ± 23%    1.6MB ± 6%	-74.17%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  0.26kB ± 0%    0.23kB ± 1%	-11.53%	(p=0.000 n=10+8)

allocs/op (count)
AccountMapperSetAccount-8    42 ± 0%	    38 ± 0%	-9.52%	(p=0.000 n=10+10)
BlockProvision-8	     6.0 ± 0%	    5.0 ± 0%	-16.67%	(p=0.000 n=10+10)
CacheKVStoreIterator1000-8   14 ± 0%	    13 ± 0%	-7.14%	(p=0.002 n=8+10)
CacheKVStoreIterator10000-8  0.15k ± 2%	    76 ± 1%	-49.00%	(p=0.000 n=7+10)
CacheKVStoreIterator50000-8  8.9k ± 11%	    2.0k ± 2%	-77.60%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 0.10M ± 26%    13k ± 12%	-86.89%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  5.0 ± 0%	    4.0 ± 0%	-20.00%	(p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue cosmos#10023

Updates cosmos#9876

/cc @cuonglm @kirbyquerby

<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #XXXX

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 3c85944)

# Conflicts:
#	CHANGELOG.md

* fix conflict

Co-authored-by: Emmanuel T Odeke <[email protected]>
Co-authored-by: marbar3778 <[email protected]>
Co-authored-by: Robert Zaremba <[email protected]>
@tac0turtle
Copy link
Member

@odeke-em are you still interested in working on this?

@odeke-em
Copy link
Collaborator Author

@odeke-em are you still interested in working on this?

I did mail this PR #10024 but I was told the store backend was gonna change in a couple of months to use smt, so I didn't do anything else aside from auditing smt and performance improvements on it, but waited for it to land here. Has the transition been completed?

@tac0turtle
Copy link
Member

I did mail this PR #10024 but I was told the store backend was gonna change in a couple of months to use smt, so I didn't do anything else aside from auditing smt and performance improvements on it, but waited for it to land here. Has the transition been completed?

sorry, I think this is still relevant and should be done. The work on store v2 is taking longer than expected

@odeke-em
Copy link
Collaborator Author

odeke-em commented Aug 3, 2022

@elias-orijtech my awesome colleague from Orijtech Inc is already on this issue and we are going to work on it, I'll join in but just been super busy wrapping up quarter things but also on other security and performance issues and fixes.

@elias-orijtech
Copy link
Contributor

I've built gaiad, initialized a main net node, downloaded a Quicksync pruned snapshot, unpacked it. However, I'm not sure how to reproduce the error, or maybe I have to wait for something to settle. Here's my log (@odeke-em's p2p peer isn't available anymore):

$ ./build/gaiad start --cpu-profile cpu2.prof
8:08PM INF starting CPU profiler profile=cpu2.prof
8:08PM INF starting node with ABCI Tendermint in-process
8:08PM INF service start impl=multiAppConn module=proxy msg={}
8:08PM INF service start connection=query impl=localClient module=abci-client msg={}
8:08PM INF service start connection=snapshot impl=localClient module=abci-client msg={}
8:08PM INF service start connection=mempool impl=localClient module=abci-client msg={}
8:08PM INF service start connection=consensus impl=localClient module=abci-client msg={}
8:08PM INF service start impl=EventBus module=events msg={}
8:08PM INF service start impl=PubSub module=pubsub msg={}
8:08PM INF service start impl=IndexerService module=txindex msg={}
8:08PM INF ABCI Handshake App Info hash="�{�������Ԏ+s�-TKV\x03L|��6}��|��qp" height=11547643 module=consensus protocol-version=0 software-version=main-b021d36a89c9515d87032aa3ec5f4e9633b3604f
8:08PM INF ABCI Replay Blocks appHeight=11547643 module=consensus stateHeight=11547643 storeHeight=11547643
8:08PM INF Completed ABCI Handshake - Tendermint and App are synced appHash="�{�������Ԏ+s�-TKV\x03L|��6}��|��qp" appHeight=11547643 module=consensus
8:08PM INF Version info block=11 p2p=8 tendermint_version=v0.34.20
8:08PM INF This node is not a validator addr=70FB853D315BBA61F6AA880C2AF819CD40A3AEED module=consensus pubKey=ZgHBgP6svB/4NeA3fogGHxNhQlmSRXNxykcPc3cM518=
8:08PM INF P2P Node ID ID=328839a6bda98a9cd01a3fd5649e76fdb2bfb3b8 file=/Users/a/.gaia/config/node_key.json module=p2p
8:08PM INF Adding persistent peers addrs=[] module=p2p
8:08PM INF Adding unconditional peer ids ids=[] module=p2p
8:08PM INF Add our address to book addr={"id":"328839a6bda98a9cd01a3fd5649e76fdb2bfb3b8","ip":"0.0.0.0","port":26656} book=/Users/a/.gaia/config/addrbook.json module=p2p
8:08PM INF service start impl=Node msg={}
8:08PM INF Starting pprof server laddr=localhost:6060
8:08PM INF serve module=rpc-server msg={}
8:08PM INF service start impl="P2P Switch" module=p2p msg={}
8:08PM INF service start impl=StateSync module=statesync msg={}
8:08PM INF service start impl=PEX module=pex msg={}
8:08PM INF service start book=/Users/a/.gaia/config/addrbook.json impl=AddrBook module=p2p msg={}
8:08PM INF service start impl=BlockchainReactor module=blockchain msg={}
8:08PM INF service start impl=BlockPool module=blockchain msg={}
8:08PM INF service start impl=ConsensusReactor module=consensus msg={}
8:08PM INF Reactor  module=consensus waitSync=true
8:08PM INF service start impl=Evidence module=evidence msg={}
8:08PM INF Saving AddrBook to file book=/Users/a/.gaia/config/addrbook.json module=p2p size=0

My profile so far:

pprof001

which doesn't seem related.

@visualbasic6
Copy link

visualbasic6 commented Sep 26, 2022

this issue is still present in gaiad-v7.0.3 - it's been over a year and i don't think it's been patched

failmode

@elias-orijtech
Copy link
Contributor

I'd love to take another look. However, so far I failed to reproduce the issue. Can you provide me with instructions to replicate the slow startup?

@tac0turtle
Copy link
Member

I believe this is fixed in the latest release of 0.45.x. Gaia has not updated to it yet though

@tac0turtle
Copy link
Member

this showed up in Juno and this PR solved it (#12885), closing for now unless the newer versions still have the issue

JeancarloBarrios pushed a commit to agoric-labs/cosmos-sdk that referenced this issue Sep 28, 2024
…binary searches (backport cosmos#10024) (cosmos#10370)

* fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (cosmos#10024)

This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings immediately, of which
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking modified binary search which is O(log(n)) * 2 ~ O(log(n))
to search for the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (n*s*e)

The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
time/op
GetValidator-8	              5.8ms ± 2%    4.7ms ± 1%	-17.69%	(p=0.000 n=10+10)
OneBankSendTxPerBlock-8	      3.2ms ± 2%    2.8ms ± 1%	-10.80%	(p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%	-8.36%	(p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%	-9.74%	(p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%	    51µs ± 6%	-19.22%	(p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%	-19.55%	(p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%	-42.11%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%	    5.5ms ± 1%	-71.35%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%	-83.44%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%	-31.19%	(p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%	-14.81%	(p=0.000 n=10+9)

alloc/op
B/op
BlockProvision-8	     0.11kB ± 0%    0.10kB ± 0%	-7.14%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8  0.89MB ± 6%    0.53MB ± 1%	-40.85%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 6.3MB ± 23%    1.6MB ± 6%	-74.17%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  0.26kB ± 0%    0.23kB ± 1%	-11.53%	(p=0.000 n=10+8)

allocs/op (count)
AccountMapperSetAccount-8    42 ± 0%	    38 ± 0%	-9.52%	(p=0.000 n=10+10)
BlockProvision-8	     6.0 ± 0%	    5.0 ± 0%	-16.67%	(p=0.000 n=10+10)
CacheKVStoreIterator1000-8   14 ± 0%	    13 ± 0%	-7.14%	(p=0.002 n=8+10)
CacheKVStoreIterator10000-8  0.15k ± 2%	    76 ± 1%	-49.00%	(p=0.000 n=7+10)
CacheKVStoreIterator50000-8  8.9k ± 11%	    2.0k ± 2%	-77.60%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 0.10M ± 26%    13k ± 12%	-86.89%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  5.0 ± 0%	    4.0 ± 0%	-20.00%	(p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue cosmos#10023

Updates cosmos#9876

/cc @cuonglm @kirbyquerby

<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #XXXX

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 3c85944)

# Conflicts:
#	CHANGELOG.md

* fix conflict

Co-authored-by: Emmanuel T Odeke <[email protected]>
Co-authored-by: marbar3778 <[email protected]>
Co-authored-by: Robert Zaremba <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: Performance Performance improvements
Projects
None yet
Development

No branches or pull requests

7 participants